
Claude Code Session Management & 1M Context: The Complete Engineer's Guide [2026]

Dillip Chowdary
Tech Entrepreneur & Innovator · April 16, 2026 · 9 min read

Why Session Management Is Now a Core Engineering Skill

Claude Code's 1 million token context window is the largest available in any production coding assistant. But a large context window is a double-edged capability. Use it correctly and you get a model that retains your entire codebase, all relevant tool outputs, and hours of debugging history. Use it naively and you hit context rot — the point where performance degrades because attention is spread across too many tokens, with older irrelevant content actively distracting the model from the current task.

Anthropic describes context rot explicitly: as a session grows, "attention gets spread across more tokens, and older, irrelevant content starts to distract." Understanding how to manage this is now as important as understanding how to write good prompts.

The Core Mental Model

Before every session branch decision, ask: "Will I need this tool output again, or just the conclusion?" If you only need the conclusion — compact or delegate to a subagent. If you need the raw output — keep the context alive.

The Five Session Tools

Claude Code gives you five branching options after completing any task. Each targets a specific context health scenario:

  • Continue: send the next message. Use when it's the same task and all context is still relevant. Preserves the full context.
  • Rewind: /rewind or Esc+Esc. Use when a wrong approach was taken and you want to retry from a prior state. Keeps file reads; drops the failed attempt.
  • Compact: /compact [hint]. Use when the session is bloated with stale exploration. Preserves a Claude-driven summary.
  • Clear: /clear. Use when starting an entirely new task. Preserves nothing; this is a full reset.
  • Subagents: delegate in the prompt. Use when intermediate output is high and you need only the conclusion. Only the result returns to the parent.

Understanding Context Rot

Context rot is subtle. It doesn't announce itself with an error — the model simply starts making worse decisions. Common symptoms:

  • The model revisits approaches it already tried and failed
  • Responses reference stale file states that have since been edited
  • Tool call accuracy degrades — schemas get confused with earlier ones
  • The model spends tokens summarizing history instead of solving the current problem

The 1M token window means you can go a long time before hitting the hard limit — but context rot typically sets in much earlier, often around 100–200k tokens of mixed tool output and exploratory reasoning.

/compact: Proactive Summarization

The /compact command triggers Claude-driven summarization of the current session. The model distills what it's learned into a dense brief, then continues from that brief instead of the full history.

Two modes:

  • Plain /compact — Claude decides what's load-bearing. Works well for straightforward tasks where the model can infer what matters.
  • /compact <hint> — You specify what to preserve. Critical for complex tasks where the model might summarize away context you'll need. Example: /compact preserve: the auth middleware refactor decisions and the failing test root cause

Autocompaction Warning

Claude Code's automatic compaction (triggered near context limits) can fail when the model can't predict the direction of remaining work. Always use proactive /compact with explicit hints rather than waiting for autocompaction on complex tasks. Autocompaction is a safety net, not a strategy.
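As an illustrative sketch of a proactive compact at a phase boundary (the hint content below is hypothetical; name whatever is load-bearing in your own session):

```
# Investigation phase is done; compact before starting the implementation phase.
/compact preserve: the root cause of the failing checkout test, the list of
affected modules, and the decision to switch retries to an idempotency key
```

Front-loading the decision about what survives beats letting autocompaction guess the direction of remaining work.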

/rewind: Surgical Context Rollback

The /rewind command (also triggered by Esc+Esc) jumps back to a previous message state. Critically, it is not a simple undo — it preserves file reads and accumulated learnings while dropping the failed attempt that followed them.

This is the right tool when:

  • You asked the model to implement an approach and it went down the wrong path
  • A tool call chain failed halfway through and corrupted intermediate state
  • You want to retry with different instructions while keeping the research context

Use /rewind instead of manually restating all previous context in a new message — it's faster and preserves the exact state more reliably.
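A sketch of the retry loop (the messages are invented for illustration, and the checkpoint picker may look different in your version):

```
# The model rewrote the whole middleware stack when you wanted one change.
Esc Esc    # or type /rewind, then pick the state just before that message
"Same goal, but keep the existing middleware chain intact and change only
the token-validation step. Do not touch session handling."
```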

Subagents: Isolating Context Noise

Subagents are the most powerful but least understood session tool. When you delegate work to a subagent, the child agent receives a fresh, clean context window. Only the result (not the intermediate reasoning, tool calls, or exploratory output) returns to the parent session.

This is transformative for tasks that generate enormous intermediate output:

# Instead of running a full codebase audit inline (flooding parent context):

"Spawn a subagent to audit all API endpoints in /src/routes/ for
missing input validation. Return only: a list of affected files
and a one-line description of each gap. Do not return raw code."

The subagent does thousands of tokens of analysis. Your parent session receives a clean 20-line summary. Context stays healthy.

Subagents are also the right pattern when you need parallel delegation — multiple independent tasks that don't need to share context with each other. Explicitly instruct parallel delegation when you need it; Opus 4.7 is more conservative about spawning subagents autonomously than 4.6 was.
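A hypothetical parallel-delegation prompt (the paths and task list are invented for illustration):

```
"Spawn three subagents in parallel:
1. Audit /src/routes/ for endpoints missing input validation.
2. Check /src/db/ for queries built via string concatenation.
3. Flag dependencies in package.json with known-vulnerable versions.
Each should return only a short findings list with file paths, no raw code."
```

Because the three tasks are independent and self-contained, none of them needs the parent session's history, and none of their intermediate output pollutes it.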

/usage: Session Visibility

The new /usage command gives you real-time visibility into your session's token consumption. Use it proactively to monitor context health before rot sets in — not after you've already noticed degradation. Running /usage before a major new task helps you decide whether to /compact first or continue.
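A minimal habit, assuming the command is available in your build (the exact output format varies by version):

```
/usage
# If consumption is modest, continue as-is.
# If you're in rot territory (roughly 100-200k tokens of mixed output),
# run /compact with an explicit hint before starting the next task.
```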

Decision Framework: Which Tool to Use

Use Continue when:

  • All context in the session is still directly relevant to the next task.
  • You haven't accumulated significant stale exploration.
  • Your next task is a direct follow-on to the current one.

Use /rewind when:

  • The model went down the wrong path after a specific message.
  • You want to retry with different instructions.
  • The intermediate tool calls created confusion but the pre-attempt context was correct.

Use /compact [hint] when:

  • The session has grown large but you want to keep working on the same problem.
  • You've finished a phase of work and want to start the next without the noise.

Always provide a hint that specifies what must survive the summarization.

Use /clear when:

  • You're starting a completely different task.
  • The current session context would actively mislead the model on the new task.
  • You want a guaranteed clean state, not a Claude-summarized one.

Use Subagents when:

  • The task will generate high intermediate output that you don't need in your parent context.
  • You need parallel execution of independent tasks.
  • The work is self-contained; the subagent doesn't need your current session's history.

If you persist structured outputs between sessions using file-system memory patterns, our Code Formatter helps keep those payloads clean and consistently parseable.
