Context Windowing
Context windowing refers to the maximum number of tokens an LLM can process in a single prompt while maintaining coherence and recall.
Context windowing is the operational memory limit of a Large Language Model (LLM). It dictates how much information (text, code, or documentation) the system can ingest before it begins dropping earlier data. While early models like GPT-3.5 were capped at 4,096 tokens, modern frontier models such as Claude 3.5 Sonnet and Gemini 1.5 Pro have expanded this to 200,000 and 2,000,000 tokens respectively. This expansion allows developers to feed entire codebases or thousand-page technical manuals directly into the prompt. Effective windowing relies on high "needle in a haystack" retrieval rates, ensuring the model accurately recalls specific facts buried deep within massive inputs.
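The practical consequence of a fixed window is that applications must decide what to drop when input exceeds the limit. Below is a minimal sketch of that idea: it keeps the newest messages of a chat history that fit within a token budget, discarding the oldest. The ~4-characters-per-token heuristic and the function names are illustrative assumptions; production systems would use the model's actual tokenizer.

```python
# Sketch: trimming a chat history to fit a fixed context window.
# ASSUMPTION: ~4 characters per token is a rough heuristic only;
# real systems should count tokens with the model's own tokenizer.

def estimate_tokens(text: str) -> int:
    """Rough token estimate (assumption: ~4 chars per token)."""
    return max(1, len(text) // 4)

def fit_to_window(messages: list[str], max_tokens: int) -> list[str]:
    """Keep the newest messages that fit the budget; drop the oldest."""
    kept: list[str] = []
    total = 0
    for msg in reversed(messages):       # walk from newest to oldest
        cost = estimate_tokens(msg)
        if total + cost > max_tokens:
            break                        # oldest messages fall out of the window
        kept.append(msg)
        total += cost
    return list(reversed(kept))          # restore chronological order

history = ["early setup notes " * 50, "relevant spec detail", "latest question"]
window = fit_to_window(history, max_tokens=20)
# The long early message exceeds the budget and is dropped,
# while the two recent short messages are retained.
```

This "sliding window" truncation is the simplest policy; real applications often combine it with summarization or retrieval so that dropped context remains accessible.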