
Cloudflare Dynamic Workers Open Beta: AI Agent Code Execution 100x Faster Than Containers, 80% Fewer Inference Tokens

Dillip Chowdary
Tech Entrepreneur & Innovator · March 28, 2026

Top Highlights

  • Dynamic Workers open beta March 24, 2026 — all paid Workers users; a Worker can spawn other Workers at runtime in isolated V8 sandboxes
  • 100x faster startup than Linux containers, 10–100x more memory efficient; starts in milliseconds with a few MB of RAM
  • Code Mode saves 80%+ inference tokens — LLMs write and execute code instead of chaining individual tool calls; converting an MCP server to a TypeScript API cut token usage by 81%
  • Three new npm packages: @cloudflare/codemode, @cloudflare/worker-bundler, @cloudflare/shell (virtual filesystem for agents)
  • RFC 9457 JSON error payloads now returned to AI agents, replacing HTML error pages — structured, machine-readable, actionable

What Are Dynamic Workers?

Dynamic Workers is a new Cloudflare Workers capability that allows a Worker to programmatically create and execute other Workers at runtime. Each spawned Worker runs in an isolated V8 isolate — the same sandboxed JavaScript runtime that powers all Cloudflare Workers — but the key difference is that the code being executed is determined at request time, not at deploy time. You can pass arbitrary code strings to a Dynamic Worker and execute them safely, without pre-deploying them.

This unlocks the core use case for AI agents: running LLM-generated code in a safe, isolated environment. When a coding agent writes a data transformation script, a Dynamic Worker can execute it immediately without human review — the V8 sandbox ensures it cannot access the host system, exfiltrate data, or persist anything beyond the session lifetime. The parent Worker retains full control over what resources the dynamic Worker can access through explicitly declared typed interfaces.

Compared to the alternative — spinning up a Linux container for each agent code execution — Dynamic Workers eliminate a cold-start penalty that was previously measured in seconds. V8 isolate startup is measured in single-digit milliseconds, memory footprint is a few megabytes rather than hundreds, and the security model is the same battle-tested V8 sandbox Cloudflare has run at global scale since 2017.

Security Architecture: V8 Isolates + Second-Layer Sandbox

Dynamic Workers uses a two-layer sandboxing model. The outer layer is V8's built-in isolate boundary — JavaScript code running in one isolate cannot directly access memory from another. The inner layer is a custom second-layer sandbox that Cloudflare built specifically for Dynamic Workers, adding:

  • Memory Protection Keys (MPK): Hardware-enforced memory isolation using Intel MPK / ARM memory tagging. Even a V8 sandbox escape — which would be a critical V8 bug — would hit the MPK boundary before accessing host memory.
  • Dynamic tenant cordoning: Each Dynamic Worker execution is assessed for risk (based on code structure, resource requests, and runtime behaviour), and higher-risk executions are routed to more isolated infrastructure automatically.
  • Spectre mitigations: Novel Spectre defences co-developed with academic researchers, preventing timing side-channel attacks from leaking data across isolate boundaries.

The Cap'n Web RPC bridge is how the dynamic Worker communicates back to the parent. Rather than message-passing over a raw channel, the parent exposes typed interfaces over Cap'n Proto — a high-performance serialisation format. The dynamic Worker calls these typed interfaces as if they were local functions, while the RPC bridge enforces that only declared interface methods are callable across the security boundary.

// Parent Worker: declare typed interface for dynamic Workers to call
// parent-worker.ts
import { DynamicWorker } from '@cloudflare/workers-sandbox';

export default {
  async fetch(request: Request, env: Env): Promise<Response> {
    const code = await request.text(); // LLM-generated code string

    // Spawn a Dynamic Worker with restricted interface access
    const sandbox = await DynamicWorker.create(code, {
      // Only expose these typed capabilities to the dynamic worker
      allowedBindings: {
        kv: env.RESULTS_KV,       // KV write for results
        fetch: fetchWithAllowlist  // fetch() only to approved origins
      },
      timeout: 30_000,  // 30s max execution
      memory: 128       // 128MB max
    });

    const result = await sandbox.run({ input: await getInputData() });
    return Response.json(result);
  }
};
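
On the other side of the boundary, the code string itself sees only the capabilities the parent declared. A minimal sketch of what that LLM-generated code might look like (the `run` entry-point shape and the bare `kv` binding in scope are assumptions for illustration, not a documented contract):

```typescript
// Hypothetical sketch of the code string the LLM hands to DynamicWorker.create().
// Inside the sandbox only the parent-declared bindings (kv, fetch) exist; the
// `declare` below is for type-checking only, since the parent injects the real one.
declare const kv: { put(key: string, value: string): Promise<void> };

// Pure transformation: runs entirely inside the isolate, no host access
function normalize(input: string[]): string[] {
  return input.map(s => s.trim().toLowerCase()).filter(s => s.length > 0);
}

// Assumed entry point invoked by sandbox.run({ input })
async function run({ input }: { input: string[] }) {
  const items = normalize(input);
  await kv.put('latest-result', JSON.stringify(items)); // parent-granted capability
  return { count: items.length, items };
}
```

Anything outside the declared bindings simply does not exist in the sandbox's scope, so there is nothing for escaped or malicious code to reach for.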

Code Mode: 80% Fewer Inference Tokens

The most economically significant feature of Dynamic Workers is Code Mode. Traditional AI agent tool-calling works by having the LLM issue a sequence of individual tool calls — one API request per tool invocation, each round-tripping through the model. For complex tasks, this can mean dozens of calls, each consuming input tokens for the accumulated conversation history.

Code Mode replaces that pattern: instead of calling tools one by one, the LLM writes a JavaScript/TypeScript program that calls the tools itself. The Dynamic Worker executes that program once, in the sandbox. The result is that the entire tool-calling logic is evaluated in code, not in LLM inference steps. Cloudflare's benchmarks show 80–81% reduction in inference tokens for typical agent workflows.

Approach                     | Tool Calls             | Inference Tokens      | Latency
Individual tool calls        | ~15–20 calls           | ~50,000 tokens        | High (sequential)
Code Mode (Dynamic Worker)   | 1 code generation call | ~9,500 tokens (−81%)  | Lower (parallel in code)

// Code Mode setup with @cloudflare/codemode
import { createCodeMode } from '@cloudflare/codemode';
import { WorkerBundler } from '@cloudflare/worker-bundler';

// Define your tools as a TypeScript API (converted from MCP server)
const tools = {
  async fetchWeather(city: string): Promise<WeatherData> { /* ... */ },
  async lookupInventory(sku: string): Promise<number> { /* ... */ },
  async sendEmail(to: string, body: string): Promise<void> { /* ... */ }
};

// Create Code Mode agent — LLM gets a code() tool instead of individual tools
const agent = createCodeMode({
  model: 'gpt-5.4-pro',
  api: tools,   // tools become callable functions in the generated code
  bundler: new WorkerBundler({ resolveNpm: true })
});

// The LLM writes a program that uses tools as functions
// — no repeated round-trips, all logic runs in one Dynamic Worker execution
const result = await agent.run(
  "Check inventory for SKU-123, get Seattle weather, send a reorder email if stock < 10"
);
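
For illustration, here is the shape of the program the model might emit for that prompt. The tool signatures match the `tools` object above; the stub bodies are stand-ins added here so the sketch is self-contained, whereas in Code Mode the real implementations are exposed to the generated program as ordinary async functions:

```typescript
// Stubs standing in for the real tools, so this sketch runs on its own.
type WeatherData = { city: string; tempC: number };
const fetchWeather = async (city: string): Promise<WeatherData> => ({ city, tempC: 11 });
const lookupInventory = async (_sku: string): Promise<number> => 7;
const sentEmails: string[] = [];
const sendEmail = async (_to: string, body: string): Promise<void> => {
  sentEmails.push(body); // stub: record instead of sending
};

// The kind of program the LLM generates: independent lookups run in
// parallel via Promise.all, and the conditional branch is evaluated in
// code rather than in a separate inference round-trip.
async function generatedProgram() {
  const [stock, weather] = await Promise.all([
    lookupInventory('SKU-123'),
    fetchWeather('Seattle'),
  ]);
  if (stock < 10) {
    await sendEmail(
      'ops@example.com',
      `Reorder SKU-123: stock is ${stock} (Seattle: ${weather.tempC}°C)`
    );
  }
  return { stock, reordered: stock < 10 };
}
```

One inference call produces this program; the parallelism and branching then cost no further tokens.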

The @cloudflare/worker-bundler package handles the critical step of resolving npm dependencies at runtime. When the LLM generates code that imports a library, the bundler fetches and bundles it before execution — meaning the LLM can use standard npm packages without those packages being pre-installed in the parent Worker.

@cloudflare/shell: Virtual Filesystem for Agents

The @cloudflare/shell package gives Dynamic Workers a virtual filesystem — a POSIX-compatible file abstraction that AI agents can use to create, read, update, and delete files as part of agentic tasks, without any real filesystem access. Files exist only in memory for the duration of the Dynamic Worker execution and are discarded when it terminates.

This matters because many LLM coding agents are trained on workflows that involve filesystem operations — creating scratch files, reading CSV data, writing outputs. Without a virtual filesystem, agents either need specialised prompting to avoid file operations or an external object store (like R2) for every ephemeral scratch file. The virtual filesystem handles the common case natively within the sandbox.

// @cloudflare/shell — virtual filesystem in a Dynamic Worker
import { Shell } from '@cloudflare/shell';

// Agent-generated code running inside a Dynamic Worker:
const shell = new Shell();

// Write a scratch file (in-memory only)
await shell.writeFile('/tmp/data.csv', csvContent);

// Standard POSIX-like operations
const lines = (await shell.readFile('/tmp/data.csv')).split('\n');
const filtered = lines.filter(l => l.includes('2026-03'));

await shell.writeFile('/tmp/filtered.csv', filtered.join('\n'));

// Results piped back to parent via typed interface — scratch files never persist
return { output: await shell.readFile('/tmp/filtered.csv') };

Combined with @cloudflare/worker-bundler's npm resolution, agents can now run typical data-processing scripts — parse CSVs, transform JSON, call external APIs, write intermediate results to scratch files — entirely within a single sandboxed Dynamic Worker execution, with no external persistence required.
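
A self-contained sketch of that pattern follows; a tiny in-memory stand-in replaces `Shell` here so the example runs anywhere, matching the `readFile`/`writeFile` surface shown above:

```typescript
// In-memory stand-in for @cloudflare/shell's virtual filesystem, matching
// the readFile/writeFile calls used earlier, so this sketch is self-contained.
class MemShell {
  private files = new Map<string, string>();
  async writeFile(path: string, data: string): Promise<void> {
    this.files.set(path, data);
  }
  async readFile(path: string): Promise<string> {
    const data = this.files.get(path);
    if (data === undefined) throw new Error(`ENOENT: ${path}`);
    return data;
  }
}

// A typical agent data-processing task: stage a CSV as a scratch file,
// filter it, stage the result, and aggregate, with no real filesystem.
async function sumMarchSales(shell: MemShell, csv: string): Promise<number> {
  await shell.writeFile('/tmp/sales.csv', csv);
  const rows = (await shell.readFile('/tmp/sales.csv')).split('\n');
  const march = rows.filter(r => r.startsWith('2026-03'));
  await shell.writeFile('/tmp/march.csv', march.join('\n'));
  return march
    .map(r => Number(r.split(',')[1]))
    .reduce((a, b) => a + b, 0);
}
```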

RFC 9457 Error Payloads and AI Security for Apps GA

Two additional March 2026 announcements ship alongside Dynamic Workers. First: Cloudflare now returns RFC 9457-compliant structured error payloads to AI agents. Previously, when a Worker or Cloudflare service returned an error, agents received a heavyweight HTML error page — useless for programmatic parsing. RFC 9457 defines a JSON application/problem+json format with standardised fields:

// RFC 9457 structured error — what agents now receive from Cloudflare
// Before (HTML — unusable by agents):
// <html><body>Error 1015: You are being rate limited.</body></html>

// After (RFC 9457 JSON — structured, actionable):
{
  "type": "https://errors.cloudflare.com/1015",
  "title": "Rate Limited",
  "status": 429,
  "detail": "Request rate exceeded: 100 req/min limit on this Worker binding",
  "instance": "/api/data-transform",
  "retry_after": 12,           // seconds until retry is safe
  "cf_ray": "8a3f2b1c-LHR"    // Cloudflare Ray ID for debugging
}
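
A sketch of how agent retry logic can consume this payload; the core fields are standard RFC 9457 members, while treating `retry_after` and `cf_ray` as Cloudflare extension members is an assumption based on the example above (the RFC explicitly permits such extensions):

```typescript
// Shape of an RFC 9457 problem document, plus the two extension members
// shown in the payload above (the specific names are an assumption here).
interface ProblemDetails {
  type: string;
  title: string;
  status: number;
  detail?: string;
  instance?: string;
  retry_after?: number; // seconds until retry is safe
  cf_ray?: string;      // Cloudflare Ray ID for debugging
}

// Decide whether to retry, and after how many milliseconds, from the
// structured error alone; returns null when the error is not retryable.
function retryDelayMs(problem: ProblemDetails): number | null {
  if (problem.status === 429 || problem.status === 503) {
    return (problem.retry_after ?? 1) * 1000; // honour the server's hint
  }
  if (problem.status >= 500) return 5_000;    // transient server error
  return null;                                // other 4xx: don't retry
}
```

No HTML scraping, no regexes over error text: the agent branches on `status` and honours `retry_after` directly.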

Second: Cloudflare AI Security for Apps is now generally available. It provides a security layer that scans traffic to AI-powered applications for prompt injection, jailbreak attempts, and data exfiltration patterns — regardless of the model or hosting provider behind the application. It deploys as a Cloudflare Workers middleware layer with a single configuration change, requiring no changes to the underlying AI application.

5 Key Takeaways for Developers

  1. Enable Dynamic Workers if you're building AI agents on Cloudflare. The open beta is available now for all paid Workers plans. V8 isolate sandboxing is production-grade — this is not an experimental feature.

  2. Replace MCP server tool-calling with Code Mode for 80%+ token savings. The economics are compelling — for agents making dozens of tool calls per session, the per-session cost drops dramatically. Measure your current token usage per agent task before and after to quantify the saving.

  3. Use @cloudflare/shell for scratch-file workflows. Agents trained on general coding tasks expect filesystem access. The virtual filesystem satisfies that expectation without any real persistence risk — all scratch data is discarded at execution end.

  4. Restrict Dynamic Worker bindings to the minimum necessary. The parent Worker controls what the sandbox can access. Pass only the KV namespaces, R2 buckets, and external fetch origins the agent task genuinely requires. Over-permissive sandboxes defeat the isolation model.

  5. Parse RFC 9457 error payloads in your agent retry logic. Cloudflare now returns structured JSON errors with retry_after and error type URLs. Update your agent error-handling to consume application/problem+json instead of scraping HTML error text.