---
title: "From Prompt to Prediction: Understanding Prefill, Decode, and the KV Cache in LLMs"
date: 2026-03-30
source: https://machinelearningmastery.com/from-prompt-to-prediction-understanding-prefill-decode-and-the-kv-cache-in-llms/
description: "How attention works during prefill, the decode phase of LLM inference, and how the KV cache makes decoding more efficient."
---

# From Prompt to Prediction: Understanding Prefill, Decode, and the KV Cache in LLMs

This article is divided into three parts; they are:

- How Attention Works During Prefill
- The Decode Phase of LLM Inference
- KV Cache: How to Make Decode More Efficient

Consider the prompt: *Today's weather is so ____.*
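To preview the three parts, here is a minimal sketch of the prefill/decode split with a KV cache, using a toy single-head attention in NumPy. All weights and embeddings are random stand-ins (assumptions), not a real model: prefill computes keys and values for the whole prompt at once, while each decode step appends exactly one new key/value row and reuses everything already cached.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 8  # toy embedding / head dimension

# Random projections standing in for trained Q/K/V weights (assumption)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))

def attend(q, K, V):
    """Attention for a single query vector over all cached keys/values."""
    scores = q @ K.T / np.sqrt(d)
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

# --- Prefill: process the whole prompt in one pass, filling the cache ---
prompt = rng.standard_normal((5, d))   # 5 prompt-token embeddings (toy data)
K_cache = prompt @ Wk                  # keys for all prompt tokens at once
V_cache = prompt @ Wv                  # values for all prompt tokens at once
last_out = attend(prompt[-1] @ Wq, K_cache, V_cache)

# --- Decode: one token per step, reusing the cache ---
for _ in range(3):
    x = last_out                            # stand-in for the next token's embedding
    K_cache = np.vstack([K_cache, x @ Wk])  # append ONE new key row
    V_cache = np.vstack([V_cache, x @ Wv])  # append ONE new value row
    last_out = attend(x @ Wq, K_cache, V_cache)

print(K_cache.shape)  # cache grew from (5, 8) to (8, 8): one row per decoded token
```

Note the asymmetry the article's three parts will unpack: prefill is a batched matrix multiply over the full prompt, while each decode step does only one query's worth of work because the prompt's keys and values never need recomputing.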

