Anthropic MCP 2.0: Standardizing AI Agent Memory and State [Deep Dive]

Dillip Chowdary
Principal AI Researcher · April 23, 2026 · 13 min read

Bottom Line

MCP 2.0 turns Anthropic's tool-use protocol into a full state-transfer specification. The new Agentic Session Token (AST) packages an agent's active context, tool-call history, and latent reasoning trace into a signed, model-agnostic blob that any compliant runtime can ingest. The practical payoff: a Claude planner can hand a half-finished task to a specialised execution agent on a different vendor's stack without losing intent. State handoff, long the bottleneck for cross-platform agent workflows, finally has a wire format.

Key Takeaways

  • MCP 2.0 introduces Agentic Session Tokens (ASTs) — signed JWS blobs that move active context, tool-call history, and a normalised reasoning trace between heterogeneous agents.
  • The protocol is wire-compatible with MCP 1.x via a capability handshake; a 2.0 client gracefully falls back to legacy stdio/SSE transport.
  • ASTs are signed with EdDSA (verification keys are published via JWKS) and carry explicit aud, nbf, and jti claims to prevent replay across agent boundaries.
  • A Unified Tool Definition (UTD) JSON Schema replaces vendor-specific tool spec dialects, so the same toolkit description works across Claude, OpenAI, and self-hosted runtimes.
  • Multi-Model Consensus is opt-in: a high-stakes tool call can require N-of-M signatures from independent agents before execution.

For the past year, the agentic AI landscape has been defined by a single, increasingly painful problem: state coherence. An agent reasoning inside Claude Code has no standardised way to hand a half-finished task to an autonomous DevOps agent running on a different platform without losing the thread of intent. Every shop has solved this with bespoke glue — pickled scratchpads, prompt-stuffed handoffs, vector-store memories that drift out of sync. MCP 2.0, released by Anthropic on April 23, 2026, is the first serious attempt at a wire format for that handoff.

The headline feature is the Agentic Session Token (AST): a cryptographically signed snapshot of an agent's working state that any compliant runtime can ingest. But MCP 2.0 also tightens four adjacent surfaces — tool definitions, transport, consensus, and identity — into a single coherent specification. The result is less "Anthropic's protocol" and more "the agent industry's first attempt at HTTP for agents."

What Actually Changed in MCP 2.0

MCP 1.x was, structurally, a tool-use protocol. The host advertised tools, the model called them, the host returned results. It standardised the request/response surface but said nothing about state in motion. MCP 2.0 keeps the 1.x message types and adds four pillars on top:

  • Unified Tool Definition (UTD): A single JSON-Schema-based dialect for declaring tools, replacing vendor-specific shapes (Anthropic's input_schema, OpenAI's parameters, etc.). The same tool description now works across runtimes.
  • Agentic Session Tokens (AST): Signed, portable state blobs that survive cross-agent and cross-process boundaries.
  • Reasoning Persistence: A normalised JSONL trace format that captures the chain of thought leading up to each decision, in a vocabulary other models can understand.
  • Multi-Model Consensus: An optional N-of-M signature scheme on tool calls, intended for irreversible operations (financial transfers, production deploys, data deletion).
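To make the N-of-M idea concrete, here is a minimal Python sketch of a quorum check. The article does not show how consensus signatures are encoded, so the `Approval` tuple, the `has_quorum` name, and the pluggable `verify` callable are all hypothetical; a real implementation would verify each signature against the approving agent's published key.

```python
from typing import Callable, Iterable

# Hypothetical approval record: (agent_id, signature). Not a type from the spec.
Approval = tuple[str, bytes]


def has_quorum(
    call_digest: bytes,
    approvals: Iterable[Approval],
    eligible_agents: set[str],
    threshold: int,
    verify: Callable[[str, bytes, bytes], bool],
) -> bool:
    """Return True once `threshold` distinct eligible agents have validly signed."""
    if not 1 <= threshold <= len(eligible_agents):
        raise ValueError("threshold must be between 1 and the number of eligible agents")
    approved: set[str] = set()
    for agent_id, signature in approvals:
        # Ignore unknown signers and duplicate votes from the same agent.
        if agent_id not in eligible_agents or agent_id in approved:
            continue
        if verify(agent_id, call_digest, signature):
            approved.add(agent_id)
            if len(approved) >= threshold:
                return True
    return False
```

Counting distinct signers rather than signatures is the important design choice here: one agent submitting the same approval twice should never satisfy a 2-of-3 requirement.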

The transport story is also cleaner. MCP 1.x supported stdio and SSE; 2.0 adds a streaming HTTP/2 mode (application/mcp+ndjson) that supports interleaved tool calls and AST exchange in a single connection.
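As an illustration of what newline-delimited JSON framing implies for a client, the sketch below reassembles frames from arbitrarily split byte chunks. It only demonstrates generic NDJSON parsing; the `type` values in the usage note are invented for the example and are not taken from the spec.

```python
import json
from typing import Iterable, Iterator


def iter_ndjson(chunks: Iterable[bytes]) -> Iterator[dict]:
    """Yield one parsed JSON object per newline-delimited line."""
    buffer = b""
    for chunk in chunks:
        buffer += chunk
        # A chunk may contain zero, one, or several complete lines.
        while b"\n" in buffer:
            line, buffer = buffer.split(b"\n", 1)
            if line.strip():
                yield json.loads(line)
    # Tolerate a final frame that lacks a trailing newline.
    if buffer.strip():
        yield json.loads(buffer)
```

Feeding it a stream in which a tool-call frame and an AST frame arrive interleaved, for example `[b'{"type":"tool_call","id":1}\n{"type":', b'"ast","handle":"h1"}\n']`, yields two dictionaries in arrival order, even though the second frame was split across chunks.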

MCP 1.x vs MCP 2.0: Side-by-Side

If you're maintaining an MCP integration today, this is the table to scan first. The "Edge" column captures which side wins on each dimension and why.

| Dimension | MCP 1.x | MCP 2.0 | Edge |
|---|---|---|---|
| State handoff between agents | None (host must roll its own) | AST (signed, portable) | 2.0 |
| Tool spec dialect | Vendor-specific JSON Schema flavours | UTD (standardised) | 2.0 |
| Reasoning persistence | Embedded in messages, ad hoc | Normalised JSONL trace | 2.0 |
| Transport | stdio, SSE | stdio, SSE, HTTP/2 NDJSON | 2.0 |
| Multi-agent consensus | Not in spec | Optional N-of-M signatures | 2.0 |
| Server complexity | Simple: implement 7 message types | Higher: AST signing + JWKS | 1.x for hello-world |
| Ecosystem maturity | ~2,400 servers indexed | ~80 servers (April 2026) | 1.x today, 2.0 in 6mo |
| Identity / replay protection | Implicit, host-defined | EdDSA + jti nonce cache | 2.0 |
| Cross-vendor portability | Limited: tool defs differ | Designed in | 2.0 |

AST Anatomy: What's Inside the Token

An Agentic Session Token is a detached JWS (RFC 7515) over a JSON payload. The header carries algorithm and key identifiers; the payload carries the agent state itself. Here's the abridged shape of a real token issued by a Claude planner agent:

{
  "iss": "https://claude.ai/mcp",
  "sub": "agent:planner-7f3a",
  "aud": "agent:executor-aws-devops",
  "iat": 1745407800,
  "exp": 1745411400,
  "nbf": 1745407800,
  "jti": "ast_01HX9ZK0V4...",
  "mcp_version": "2.0",

  "active_context": {
    "system_prompt_hash": "sha256:9f2b...",
    "messages_window": [
      {"role": "user",      "content": "Migrate the payments DB to RDS Aurora."},
      {"role": "assistant", "content": "Drafting plan...", "trace_ref": "trace_01HX..."}
    ],
    "scratchpad": {
      "plan": ["snapshot RDS", "apply schema diff", "cutover read replicas"],
      "decisions": [{"key": "downtime_window", "value": "2h", "rationale_ref": "trace_01HX#step3"}]
    }
  },

  "tool_history": [
    {"name": "rds.describe_clusters", "args_hash": "sha256:...", "result_hash": "sha256:...", "ts": 1745407650}
  ],

  "reasoning_trace": {
    "format": "mcp-trace-v1",
    "uri": "s3://anthropic-asts/trace_01HX9ZK.jsonl",
    "size_bytes": 184320,
    "redacted_pii": true
  },

  "policy": {
    "max_irreversible_calls": 0,
    "consensus_required_for": ["rds.delete_cluster", "iam.delete_role"]
  }
}

Three things are worth highlighting here:

  • The reasoning trace is referenced, not embedded. A large trace would make the token too big to pass around comfortably (the example trace alone is 180 KB), so the token holds a pointer (typically S3, GCS, or an MCP gateway path) plus a content hash, which the abridged example above omits. The receiving agent fetches the trace on demand.
  • Tool history is hashed. The receiver doesn't get raw tool args or results — those may contain secrets — only hashes that prove a call happened. The full payload, if needed, is fetched separately under a scoped credential.
  • Policy travels with the token. The issuing agent's guardrails (e.g. "no irreversible calls without consensus") are carried in-band, so a downstream executor can't simply ignore them by claiming it didn't know.
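The second and third points can be sketched in a few lines of Python. Two assumptions are doing real work here and neither is confirmed by the article: that args are hashed over a canonical JSON form (sorted keys, compact separators), and that max_irreversible_calls caps irreversible calls made without consensus. The function names are hypothetical.

```python
import hashlib
import json


def args_hash(args: dict) -> str:
    """Hash tool args over a canonical JSON form (assumed canonicalisation)."""
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return "sha256:" + hashlib.sha256(canonical).hexdigest()


def call_allowed(policy: dict, tool_name: str, irreversible: bool,
                 irreversible_so_far: int, has_consensus: bool) -> bool:
    """Apply the in-band policy block before executing a tool call."""
    # Tools named in consensus_required_for always need consensus.
    if tool_name in policy.get("consensus_required_for", []):
        return has_consensus
    # Assumed reading: the budget limits irreversible calls made without consensus.
    if irreversible and not has_consensus:
        return irreversible_so_far < policy.get("max_irreversible_calls", 0)
    return True
```

With the policy from the example token, `call_allowed(policy, "rds.delete_cluster", True, 0, False)` is rejected, while a read-only call such as `rds.describe_clusters` passes. Because the hash is computed over sorted keys, an executor holding the raw args can recompute `args_hash` and compare it with the tool_history entry regardless of key order.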

Handoff Flow: A Concrete Example

Concretely, here's what a cross-platform handoff looks like — a Claude planner delegating database migration steps to an AWS-native executor agent:

┌──────────────┐    1. Plan task            ┌──────────────┐
│   User       │ ─────────────────────────▶ │  Planner     │
│              │                            │  (Claude)    │
└──────────────┘                            └──────┬───────┘
                                                   │
                              2. Mint AST          │
                              (sign w/ EdDSA)      │
                                                   ▼
                                            ┌──────────────┐
                                            │  AST Blob    │
                                            │  (JWS, 18KB) │
                                            └──────┬───────┘
                                                   │ 3. POST /v2/handoff
                                                   ▼
                                            ┌──────────────┐
                                            │  Executor    │
                                            │  (AWS Agent) │
                                            └──────┬───────┘
                                                   │ 4. Verify sig via JWKS
                                                   │ 5. Replay scratchpad
                                                   │ 6. Resume tool-calls
                                                   ▼
                                            ┌──────────────┐
                                            │  Migration   │
                                            │  Executed    │
                                            └──────────────┘

The interesting part is step 5 — "replay scratchpad." The executor agent doesn't need to re-derive the plan; the planner's decisions are already captured in active_context.scratchpad. The executor model can be smaller, cheaper, and tool-specialised — it just needs to follow through, not re-think. This is where the cost story actually lands: planning happens once on a frontier model, execution can fan out to commodity ones.

Signing, Verification, and Replay Protection

ASTs are signed with EdDSA (Ed25519) by default, with optional ES256 for environments stuck on FIPS 140-2 hardware. The issuing agent's host publishes a JWKS endpoint:

GET https://claude.ai/.well-known/mcp-jwks.json

{
  "keys": [
    {
      "kty": "OKP", "crv": "Ed25519",
      "kid": "anth-2026-04-a",
      "x": "11qYAYKxCrfVS_7TyWQHOg7hcvPapiMlrwIaaPcHURo",
      "use": "sig", "alg": "EdDSA"
    }
  ]
}

The receiving agent:

  1. Parses the AST header, extracts kid.
  2. Fetches JWKS from the issuer (cached with TTL).
  3. Verifies the EdDSA signature.
  4. Checks aud matches its own agent identity.
  5. Checks exp and nbf against current time (with ≤30s skew).
  6. Checks jti against a replay cache (e.g. Redis) whose entries are kept at least as long as the token remains valid.

Step 6 is the one most teams will get wrong on first implementation. An attacker who captures a valid AST in transit can replay it against the same audience until expiry; the jti nonce cache is the defence. Anthropic ships a reference implementation (@anthropic-ai/mcp-verify) that handles all six steps; if you're rolling your own, replicate it carefully.
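For teams rolling their own, steps 4 to 6 can be sketched with the standard library alone. This is not the reference implementation: `ASTRejected`, `ReplayCache`, and `validate_claims` are hypothetical names, the in-memory dictionary merely stands in for Redis, and steps 1 to 3 (signature verification) are assumed to be done by a JOSE library whose result is passed in as `signature_ok`.

```python
import time
from typing import Optional


class ASTRejected(Exception):
    """Raised when an AST fails any verification step."""


class ReplayCache:
    """In-memory stand-in for the Redis replay cache (illustrative only)."""

    def __init__(self) -> None:
        self._seen: dict[str, float] = {}  # jti -> time until which it is remembered

    def first_use(self, jti: str, keep_until: float, now: float) -> bool:
        # Forget entries only once their tokens can no longer pass the exp check.
        self._seen = {k: v for k, v in self._seen.items() if v >= now}
        if jti in self._seen:
            return False
        self._seen[jti] = keep_until
        return True


def validate_claims(claims: dict, my_identity: str, cache: ReplayCache,
                    signature_ok: bool, now: Optional[float] = None,
                    skew: float = 30.0) -> None:
    """Run steps 4-6; raise ASTRejected on the first failure."""
    now = time.time() if now is None else now
    if not signature_ok:
        raise ASTRejected("bad signature")
    for name in ("aud", "exp", "nbf", "jti"):
        if name not in claims:
            raise ASTRejected(f"missing claim: {name}")
    if claims["aud"] != my_identity:
        raise ASTRejected("audience mismatch")
    if now > claims["exp"] + skew:
        raise ASTRejected("token expired")
    if now < claims["nbf"] - skew:
        raise ASTRejected("token not yet valid")
    # Record the jti last, so a token rejected earlier does not poison the cache.
    if not cache.first_use(claims["jti"], claims["exp"] + skew, now):
        raise ASTRejected("replayed jti")
```

The cache entry is kept until exp plus the skew allowance, which is the last moment the token could still pass the expiry check. In a multi-process deployment the check-and-store must be atomic; with Redis that is typically a single SET with the NX and EX options rather than a separate read and write.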

Common pitfall: Don't use the iss claim as the JWKS lookup URL directly without validation. A malicious issuer can host its own JWKS and sign tokens that look legitimate. Maintain an allow-list of trusted issuers (Anthropic, OpenAI, your internal MCP gateway) and reject ASTs from anyone else.
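One way to apply that advice is to resolve the JWKS URL from a pinned table instead of deriving it from the token. The sketch below assumes a hypothetical `TRUSTED_ISSUERS` mapping; the internal gateway entry is a made-up placeholder, and only the claude.ai URLs come from the examples in this article.

```python
# Pinned issuer -> JWKS URL table. The token's iss claim is only ever used as a
# lookup key, never as a URL to fetch from.
TRUSTED_ISSUERS = {
    "https://claude.ai/mcp": "https://claude.ai/.well-known/mcp-jwks.json",
    # Hypothetical internal gateway, shown for illustration only.
    "https://mcp-gateway.internal.example": "https://mcp-gateway.internal.example/jwks.json",
}


def jwks_url_for(iss: str) -> str:
    """Return the pinned JWKS URL for a trusted issuer, or refuse."""
    try:
        return TRUSTED_ISSUERS[iss]
    except KeyError:
        raise PermissionError(f"untrusted AST issuer: {iss!r}") from None
```

The lookup is an exact string match on purpose: prefix or substring matching would let a lookalike issuer URL slip through.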

Unified Tool Definition (UTD): Tools That Travel

One of the quieter but more important wins of MCP 2.0 is the Unified Tool Definition. In MCP 1.x, every host's tool spec had subtle differences — Anthropic used input_schema, others used parameters, the JSON-Schema dialects varied, and the descriptions had implicit conventions. UTD pins all of this down:

{
  "name": "rds.snapshot",
  "description": "Create a manual snapshot of an RDS cluster.",
  "input_schema": {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "required": ["cluster_id"],
    "properties": {
      "cluster_id":   {"type": "string", "pattern": "^cluster-[a-z0-9-]+$"},
      "snapshot_name":{"type": "string", "maxLength": 255},
      "tags":         {"type": "object", "additionalProperties": {"type": "string"}}
    }
  },
  "output_schema": {
    "type": "object",
    "required": ["snapshot_arn"],
    "properties": {
      "snapshot_arn": {"type": "string", "pattern": "^arn:aws:rds:.+$"}
    }
  },
  "side_effects": "irreversible_create",
  "idempotency_key_param": "snapshot_name",
  "rate_limit": {"per_minute": 5}
}

Two additions matter most: side_effects (one of none, read, reversible_write, irreversible_create, irreversible_delete) and idempotency_key_param. Together they let an executor agent reason about whether retrying a tool call is safe — something MCP 1.x left entirely to host implementers.
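A rough sketch of that retry reasoning in Python follows. The rule it encodes is an assumption rather than spec text: side-effect-free calls are always retryable, and any write is retryable only when the call supplies a value for the parameter named by idempotency_key_param, on the premise that the tool server deduplicates on that key. `is_retry_safe` is a hypothetical helper.

```python
SIDE_EFFECTS = {
    "none", "read", "reversible_write", "irreversible_create", "irreversible_delete",
}


def is_retry_safe(tool_def: dict, args: dict) -> bool:
    """Decide whether re-sending a tool call is safe under the assumed rule."""
    effect = tool_def.get("side_effects")
    if effect not in SIDE_EFFECTS:
        raise ValueError(f"unknown side_effects value: {effect!r}")
    if effect in {"none", "read"}:
        return True
    # Writes are only retried when an idempotency key is actually present.
    key_param = tool_def.get("idempotency_key_param")
    return key_param is not None and args.get(key_param) not in (None, "")
```

For the rds.snapshot definition above, a call that sets snapshot_name would be treated as retryable, while a call that omits it would not, since a blind retry could create a second snapshot.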

When to Adopt — and When to Wait

MCP 2.0 is genuinely useful, but the ecosystem is still thin. Here's how to think about timing:

Adopt MCP 2.0 now if you:

  • Run multi-agent workflows that already cross runtime boundaries (e.g. Claude orchestrator + Bedrock executors, or LangGraph + a bespoke Go agent).
  • Have irreversible production actions (deploys, financial moves, data deletion) where the consensus signing scheme would replace home-grown approval flows.
  • Are building a new agent platform — you'd be writing the equivalent of AST anyway.

Stay on MCP 1.x for now if you:

  • Are running a single-agent product with one model and one tool host (the AST overhead buys you nothing).
  • Depend on community MCP servers that haven't migrated — most still target 1.x as of April 2026.
  • Have hard FIPS / HSM requirements that haven't been validated against the EdDSA signing path yet.

Migration Path from MCP 1.x

The migration story is deliberately gentle. MCP 2.0 servers can advertise both 1.x and 2.0 capabilities in their initial handshake, and 2.0 clients can connect to 1.x servers in degraded mode (no AST, no UTD enforcement, but tool calls still work). Most teams will follow this sequence:

  1. Upgrade the SDK first. @anthropic-ai/mcp-sdk@2.x and mcp-python@2.x both speak 1.x and 2.0; the negotiation is automatic.
  2. Migrate tool definitions to UTD. A codemod ships with the SDK (npx mcp-migrate utd ./tools/); manual review for side_effects classification is required.
  3. Add JWKS endpoint and signing keys. Required only if you'll issue ASTs. Pure executors don't need to mint tokens.
  4. Wire the AST verifier into your handoff endpoints. This is where most of the implementation effort lives — replay cache, audience checking, error handling.
  5. Opt into consensus signing for irreversible calls. Greenfield only — retrofitting consensus into existing irreversible paths is risky without a feature flag.

Anthropic's reference implementation, @anthropic-ai/mcp-toolkit, handles steps 3 and 4 for the common cases. If you're using LangGraph or the OpenAI Agents SDK, both have committed to MCP 2.0 ingestion adapters by Q3 2026.
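The capability negotiation described at the start of this section can be pictured as a small function. The version labels "1.x" and "2.0" and the `Session` fields are hypothetical shorthand; the actual handshake messages are not shown in this article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    version: str
    ast_enabled: bool
    utd_enforced: bool


def negotiate(client_versions: set[str], server_versions: set[str]) -> Session:
    """Pick the best protocol version both sides advertise."""
    common = client_versions & server_versions
    if "2.0" in common:
        return Session("2.0", ast_enabled=True, utd_enforced=True)
    if "1.x" in common:
        # Degraded mode: tool calls still work, but no AST and no UTD enforcement.
        return Session("1.x", ast_enabled=False, utd_enforced=False)
    raise ConnectionError("no MCP version in common")
```

A 2.0 client that advertises both versions and meets a 1.x-only server ends up with `Session("1.x", False, False)`, which is the degraded mode the migration path relies on.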

Open Questions and What's Next

MCP 2.0 is a strong release, but it leaves a few questions unresolved:

  • Reasoning trace portability across model families. The mcp-trace-v1 format is a normalised JSONL of decision points, but cross-model fidelity remains an empirical question. A trace produced by Claude Opus 4.7 may not replay perfectly through GPT-5 or Gemini 2.5: the reasoning vocabulary is shared, but the internal embeddings are not.
  • Cost accounting for relayed tokens. When a planner agent's AST triggers tool execution on a different vendor, who pays? The spec is silent; expect a follow-up on usage attribution metadata in MCP 2.1.
  • PII redaction policy. The redacted_pii flag is advisory. There's no enforcement: a non-compliant agent could leak PII through reasoning traces, and the only recourse is auditing.
  • Identity federation. Today, agent identities live per-host. There's no federated identity layer, so cross-organisation handoffs depend on bilateral JWKS allow-listing. SPIFFE/SPIRE integration has been hinted at for MCP 2.1.

None of these block production adoption — they're the rough edges of any v2 specification. What matters is that, with MCP 2.0, agent state finally has a wire format. The era of pickled scratchpads and prompt-stuffed handoffs is closing. If you're building anything multi-agent in 2026, the question is no longer whether to standardise on MCP 2.0 — it's how fast you can.

Frequently Asked Questions

Is MCP 2.0 backward compatible with MCP 1.x servers?
Yes. MCP 2.0 negotiates capability flags during the initial handshake. A 2.0 client connecting to a 1.x server falls back to the legacy stdio/SSE transport without Agentic Session Tokens, while a 2.0 server can advertise both AST and legacy tool execution to support older clients during migration.
What signs an Agentic Session Token, and how is it verified?
ASTs use detached JWS (RFC 7515) with EdDSA signatures. The issuing agent's host signs the header and payload with a rotating Ed25519 private key and publishes the matching public key via JWKS. The receiving agent fetches the JWKS, validates the signature, checks the audience claim against its own identity, rejects expired or not-yet-valid tokens, and rejects replays via the jti nonce cache.
Can I hand off state between Claude and a non-Anthropic model?
Yes — that's the entire point of standardisation. The AST format is model-agnostic. Any host that implements the MCP 2.0 spec (Claude, OpenAI's Agents SDK, open-source runtimes like LangGraph, or your own orchestrator) can ingest the token. Latent reasoning is exported in a normalised JSONL trace format rather than model-specific embeddings, so a different model can resume the task even if it cannot replay the original logits.
How big does a typical AST get, and where do I store it?
A short-lived task token is typically 4–32 KB; long-running engineering tasks with large tool histories can reach 1–4 MB. Anthropic recommends storing the token blob in your agent gateway's session store (Redis with TTL works well) and passing only a short opaque handle in HTTP headers. Avoid placing the full token in URL query strings or browser localStorage.
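The handle pattern can be sketched as follows. The in-memory dictionary stands in for Redis with a TTL, and `ASTSessionStore` is a hypothetical name; only the opaque handle, never the token itself, would travel in an HTTP header.

```python
import secrets
import time
from typing import Optional


class ASTSessionStore:
    """Maps short opaque handles to full AST blobs, with expiry."""

    def __init__(self, ttl_seconds: float) -> None:
        self._ttl = ttl_seconds
        self._items: dict[str, tuple[float, str]] = {}  # handle -> (expires_at, blob)

    def put(self, ast_blob: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        handle = secrets.token_urlsafe(24)  # unguessable, reveals nothing about the token
        self._items[handle] = (now + self._ttl, ast_blob)
        return handle

    def get(self, handle: str, now: Optional[float] = None) -> Optional[str]:
        now = time.time() if now is None else now
        entry = self._items.get(handle)
        if entry is None:
            return None
        expires_at, blob = entry
        if now >= expires_at:
            del self._items[handle]  # lazily evict expired entries
            return None
        return blob
```

Setting the TTL no longer than the token's own exp keeps the store from handing out blobs that a verifier would reject anyway.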
Does MCP 2.0 replace OpenAI's Agents SDK or LangGraph?
No — MCP 2.0 is a wire protocol, not a framework. LangGraph and the Agents SDK orchestrate agents inside one runtime; MCP 2.0 standardises how state moves between runtimes. The two layers are complementary: you write your control flow in LangGraph, and LangGraph emits and ingests ASTs at the boundary between heterogeneous agents.
