Claude Opus 4.6: 1 Million Token Context & Adaptive Thinking Deep-Dive

Sunday, February 15, 2026 — Anthropic has officially released Claude Opus 4.6, a generational leap in reasoning depth and context management. While previous models struggled with "context rot" at large scales, Opus 4.6 introduces a qualitative shift in how AI handles massive codebases and long-running agentic tasks.

The 1 Million Token Barrier

For the first time, developers have access to a 1 Million Token Context Window in beta. This isn't just a marketing metric; Anthropic's MRCR v2 (Multi-Round Context Retrieval) benchmarks show that Opus 4.6 maintains 76% retrieval accuracy at 1M tokens, a massive improvement over Sonnet 4.5's 18.5% at the same scale.

Retrieval Benchmarks (MRCR v2)

Scale: 256K 93% Accuracy
Scale: 512K 88% Accuracy
Scale: 1M (Beta) 76% Accuracy

Adaptive Thinking: Dynamic Reasoning

Opus 4.6 replaces "extended thinking" with Adaptive Thinking. This mode allows the model to dynamically decide when to use chain-of-thought reasoning based on the complexity of the query. Developers can now control the reasoning effort via the API:

Low / Medium

Optimized for cost and speed in routine tasks.

High / Max

Unlocks full reasoning depth for R&D and debugging.

Agentic Features: Context Compaction

One of the most requested features for AI agents is now in beta: Context Compaction. This feature automatically summarizes and replaces older context when a conversation approaches a configurable threshold. This prevents agent loops from hitting the token limit while maintaining the "gist" of the mission history.

Technical Breakdown & Pricing

Feature	Specification
Max Output Tokens	128,000 tokens
Input Price (Standard)	$5.00 / 1M tokens
Input Price (>200K Context)	$10.00 / 1M tokens (Premium Tier)
Output Price	$25.00 / 1M tokens

Breaking Changes: Deprecation Alert

Developers migrating from Opus 4.5 should be aware of two critical breaking changes:

Response Prefilling Removed: Attempting to prefill the assistant's response now returns a 400 Bad Request error.
budget_tokens Deprecated: The `budget_tokens` parameter has been replaced by the new Adaptive Thinking effort levels.

Pro Tip: Using 1M Context

To use the 1M token context beta, you must enable the anthropic-beta: context-compaction-2026-02 header in your API requests. Note that inference latency scales linearly with context size; expect up to 60-second "Time to First Token" (TTFT) for full 1M token saturation.

Internal Integration: Use ByteNotes to organize your technical documentation and long-form research drafts before passing them to Claude Opus 4.6 for deep analysis.

Sources: Anthropic Official Release Notes | Benchmarks: MRCR v2 Technical Whitepaper