Context Window
The maximum number of tokens an LLM can process in a single interaction, determining how much information it can consider when generating responses.
Also known as: Context Length, Token Limit, Context Size
Category: Principles
Tags: ai, llm, tokens, memory, attention
Explanation
The context window is a fundamental characteristic of Large Language Models that defines how much text the model can 'see' at once. It encompasses both the input (your prompt, conversation history, documents) and the output (the model's response).
Context window sizes vary significantly:
- GPT-4o: 128K tokens
- Claude 3.5 Sonnet: 200K tokens
- Gemini 1.5 Pro: 1M tokens
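Because the window must hold both input and output, a common pre-flight check is whether a prompt plus the reserved output budget fits a model's limit. A minimal sketch in Python, using the figures above as illustrative limits (real limits vary by model version and API tier):

```python
# Illustrative context-window limits in tokens, taken from the list above.
# These are assumptions for the sketch; check your provider's docs for
# current values.
CONTEXT_LIMITS = {
    "gpt-4o": 128_000,
    "claude-3.5-sonnet": 200_000,
    "gemini-1.5-pro": 1_000_000,
}

def fits_in_window(model: str, prompt_tokens: int, max_output_tokens: int) -> bool:
    """True if the prompt plus the reserved output budget fits the window."""
    return prompt_tokens + max_output_tokens <= CONTEXT_LIMITS[model]

print(fits_in_window("gpt-4o", 120_000, 4_000))  # True: exactly at the limit
print(fits_in_window("gpt-4o", 126_000, 4_000))  # False: would overflow
```

The key point the check encodes: the model's response tokens count against the same window as your prompt, so you must leave headroom for the output.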
Why the context window matters:
- **More context = better understanding**: Larger windows allow models to consider more information when making predictions
- **Conversation memory**: Determines how much chat history the model remembers
- **Document processing**: Limits how much text can be analyzed at once
- **Attention mechanism**: The model weighs all tokens in the context when generating each new token
Practical implications:
- Long documents may need to be chunked or summarized
- Conversation history must be managed to stay within limits
- RAG systems help work around context limitations
- AI Mega Prompts work best with larger context windows
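Managing conversation history is the implication most applications hit first. One simple strategy, sketched below under assumptions (a rough 4-characters-per-token estimate and plain string messages; real systems use a proper tokenizer and structured messages), is to drop the oldest messages until the history fits a token budget:

```python
# Minimal sketch of conversation-history trimming: drop the oldest
# messages until the estimated token count fits a budget.

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token in English.
    return max(1, len(text) // 4)

def trim_history(messages: list[str], budget: int) -> list[str]:
    """Keep the most recent messages whose combined estimate fits the budget."""
    kept, total = [], 0
    for msg in reversed(messages):  # walk newest-first
        cost = estimate_tokens(msg)
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    return list(reversed(kept))  # restore chronological order

history = ["hello " * 50, "short reply", "follow-up question " * 20]
trimmed = trim_history(history, budget=120)  # oldest message gets dropped
```

Production systems often combine this with summarization: instead of discarding old turns outright, they are compressed into a short summary that stays in the window.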
A token is roughly 4 characters (about 0.75 words) in English, so 128K tokens is approximately 96,000 words, about the length of a novel.
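These rules of thumb are easy to turn into back-of-envelope arithmetic. A quick sketch (heuristic estimates only; real tokenizers such as OpenAI's tiktoken give exact counts that vary by language and content):

```python
# Back-of-envelope conversions using the ~4 chars/token and
# ~0.75 words/token heuristics from the text above.

def tokens_from_chars(n_chars: int) -> int:
    return round(n_chars / 4)

def words_from_tokens(n_tokens: int) -> int:
    return round(n_tokens * 0.75)

print(words_from_tokens(128_000))   # 96000 -- roughly a novel, as noted above
print(tokens_from_chars(6_000))     # 1500 -- a ~6,000-character document
```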
As context windows grow, new use cases emerge: analyzing entire codebases, processing book-length documents, and maintaining extended conversations with full memory.