Claude Code Session Management & 1M Context: The Complete Engineer's Guide [2026]
Why Session Management Is Now a Core Engineering Skill
Claude Code's 1 million token context window is the largest available in any production coding assistant. But a large context window is a double-edged capability. Use it correctly and you get a model that retains your entire codebase, all relevant tool outputs, and hours of debugging history. Use it naively and you hit context rot — the point where performance degrades because attention is spread across too many tokens, with older irrelevant content actively distracting the model from the current task.
Anthropic describes context rot explicitly: as a session grows, "attention gets spread across more tokens, and older, irrelevant content starts to distract." Understanding how to manage this is now as important as understanding how to write good prompts.
The Core Mental Model
Before every session branch decision, ask: "Will I need this tool output again, or just the conclusion?" If you only need the conclusion — compact or delegate to a subagent. If you need the raw output — keep the context alive.
The Five Session Tools
Claude Code gives you five branching options after completing any task. Each targets a specific context health scenario:
| Tool | Command | When to Use | Context Preserved |
|---|---|---|---|
| Continue | Send next message | Same task, all context still relevant | Full |
| Rewind | `/rewind` or Esc+Esc | Wrong approach taken; retry from prior state | File reads kept, failed attempt dropped |
| Compact | `/compact [hint]` | Bloated session with stale exploration | Claude-driven summary |
| Clear | `/clear` | Entirely new task | None — full reset |
| Subagents | Delegate in prompt | High intermediate output, need only conclusion | Only result returns to parent |
Understanding Context Rot
Context rot is subtle. It doesn't announce itself with an error — the model simply starts making worse decisions. Common symptoms:
- The model revisits approaches it already tried and failed
- Responses reference stale file states that have since been edited
- Tool call accuracy degrades — schemas get confused with earlier ones
- The model spends tokens summarizing history instead of solving the current problem
The 1M token window means you can go a long time before hitting the hard limit — but context rot typically sets in much earlier, often around 100–200k tokens of mixed tool output and exploratory reasoning.
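That rule of thumb can be sketched as a simple budgeting heuristic. This is a hypothetical illustration only — the `ROT_THRESHOLD` constant, the `stale_fraction` estimate, and the `should_compact` helper are invented for this sketch and are not part of Claude Code:

```python
# Hypothetical heuristic for deciding when to proactively /compact, based on
# the rough observation that rot often sets in around 100-200k mixed tokens,
# well before the 1M hard limit. All names and numbers are illustrative.

HARD_LIMIT = 1_000_000      # 1M token context window
ROT_THRESHOLD = 150_000     # midpoint of the 100-200k rot range cited above

def should_compact(session_tokens: int, stale_fraction: float) -> bool:
    """Suggest /compact when the session is large AND much of it is stale.

    stale_fraction: your own estimate of how much of the context is
    exploratory output no longer needed for the current task (0.0-1.0).
    """
    if session_tokens >= HARD_LIMIT:
        return True  # autocompaction territory; we should have acted earlier
    return session_tokens >= ROT_THRESHOLD and stale_fraction > 0.5

# A 180k-token session dominated by stale exploration warrants compaction;
# a fresh 40k-token session does not.
```

The point of the sketch is the two-factor check: raw size alone is a weak signal, but size combined with a high proportion of stale exploration is the condition under which summarization pays off.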
/compact: Proactive Summarization
The /compact command triggers Claude-driven summarization of the current session. The model distills what it's learned into a dense brief, then continues from that brief instead of the full history.
Two modes:
- `/compact` (plain) — Claude decides what's load-bearing. Works well for straightforward tasks where the model can infer what matters.
- `/compact <hint>` — You specify what to preserve. Critical for complex tasks where the model might summarize away context you'll need. Example: `/compact preserve: the auth middleware refactor decisions and the failing test root cause`
Autocompaction Warning
Claude Code's automatic compaction (triggered near context limits) can fail when the model can't predict the direction of remaining work. Always use proactive /compact with explicit hints rather than waiting for autocompaction on complex tasks. Autocompaction is a safety net, not a strategy.
/rewind: Surgical Context Rollback
The /rewind command (also triggered by Esc+Esc) jumps back to a previous message state. Critically, it is not a simple undo — it preserves file reads and accumulated learnings while dropping the failed attempt that followed them.
This is the right tool when:
- You asked the model to implement an approach and it went down the wrong path
- A tool call chain failed halfway through and corrupted intermediate state
- You want to retry with different instructions while keeping the research context
Use /rewind instead of manually restating all previous context in a new message — it's faster and preserves the exact state more reliably.
Subagents: Isolating Context Noise
Subagents are the most powerful but least understood session tool. When you delegate work to a subagent, the child agent receives a fresh, clean context window. Only the result (not the intermediate reasoning, tool calls, or exploratory output) returns to the parent session.
This is transformative for tasks that generate enormous intermediate output:
```
# Instead of running a full codebase audit inline (flooding parent context):
"Spawn a subagent to audit all API endpoints in /src/routes/ for
missing input validation. Return only: a list of affected files
and a one-line description of each gap. Do not return raw code."
```
The subagent does thousands of tokens of analysis. Your parent session receives a clean 20-line summary. Context stays healthy.
Subagents are also the right pattern when you need parallel delegation — multiple independent tasks that don't need to share context with each other. Explicitly instruct parallel delegation when you need it; Opus 4.7 is more conservative about spawning subagents autonomously than 4.6 was.
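The isolation property can be modeled in a few lines. This is a toy sketch of the pattern, not Claude Code's actual implementation: the `run_subagent` function and both history lists are invented for illustration. The parent "session" only ever appends the child's final summary, never its intermediate output:

```python
# Toy model of subagent context isolation (illustrative only, not the real
# Claude Code internals). The child accumulates arbitrary amounts of work
# in its own list; only the returned summary enters the parent's history.

def run_subagent(task: str) -> str:
    child_history = [task]
    # ...where thousands of tokens of tool calls and analysis would occur...
    child_history.extend(f"tool output {i}" for i in range(1000))
    summary = f"summary of: {task}"
    return summary  # child_history is discarded along with the child

parent_history = ["audit request"]
parent_history.append(run_subagent("audit /src/routes/ for missing validation"))

# The parent grew by one entry, not by the child's thousand tool outputs.
```

However large the child's working set gets, the parent's context cost is fixed at the size of the returned summary — which is why the delegation prompt above insists on "return only" a compact result.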
/usage: Session Visibility
The new /usage command gives you real-time visibility into your session's token consumption. Use it proactively to monitor context health before rot sets in — not after you've already noticed degradation. Running /usage before a major new task helps you decide whether to /compact first or continue.
Decision Framework: Which Tool to Use
Use Continue when:
- All context in the session is still directly relevant to the next task.
- You haven't accumulated significant stale exploration.
- Your next task is a direct follow-on to the current one.

Use /rewind when:
- The model went down the wrong path after a specific message.
- You want to retry with different instructions.
- The intermediate tool calls created confusion but the pre-attempt context was correct.

Use /compact [hint] when:
- The session has grown large but you want to continue working on the same problem.
- You've finished a phase of work and want to start the next phase without the noise.
- Always provide a hint that specifies what must survive the summarization.

Use /clear when:
- You're starting a completely different task.
- The current session context would actively mislead the model on the new task.
- You want a guaranteed clean state — not a Claude-summarized one.

Use Subagents when:
- The task will generate high intermediate output that you don't need in your parent context.
- You need parallel execution of independent tasks.
- The work is self-contained — the subagent doesn't need access to your current session's history.
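The framework above can be condensed into a single dispatch function. This is a hypothetical sketch: the boolean scenario flags and the priority ordering (isolation first, reset second, rollback third, summarization last) are editorial choices made for illustration, not documented Claude Code behavior:

```python
# Hypothetical dispatcher encoding the decision framework above.
# The flags are illustrative labels for your own judgment calls,
# not real Claude Code state.

def pick_session_tool(new_task: bool, wrong_path: bool,
                      session_bloated: bool,
                      high_intermediate_output: bool) -> str:
    if high_intermediate_output:
        return "subagent"         # isolate noisy work; only the result returns
    if new_task:
        return "/clear"           # old context would actively mislead
    if wrong_path:
        return "/rewind"          # keep research, drop the failed attempt
    if session_bloated:
        return "/compact <hint>"  # summarize, preserving what must survive
    return "continue"             # everything still relevant
```

Reading the branches top to bottom reproduces the core mental model from the start of this guide: first ask whether you need the raw output or just the conclusion, then ask whether the existing context helps or hurts the next task.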