Event-Driven APIs: AsyncAPI on NATS JetStream [2026]
Bottom Line
AsyncAPI gives event systems a contract surface; JetStream gives them durable transport, replay, and consumer control. Together, they turn pub/sub from an integration convenience into a governable production API layer.
Key Takeaways
- AsyncAPI 3.1.0 is the current spec and remains a backward-compatible step beyond 3.0.0.
- NATS server v2.14.0 extends JetStream with fast-ingest batch publishing and a consumer reset API.
- Official JetStream async publish benchmarks show 403,828 msgs/sec with 128 B messages on file storage.
- Pull consumers with AckExplicit are the safest default for horizontally scaled worker fleets.
Synchronous REST still dominates public API design, but internal systems increasingly live on events: order pipelines, audit trails, fraud signals, ML features, and agent workflows. The gap has been documentation and control. AsyncAPI 3.1.0 now gives event-driven systems a mature contract model, while NATS server v2.14.0 and JetStream provide the durability, replay, and consumer semantics needed to run those contracts in production rather than in architecture diagrams.
The Lead
Bottom Line
Use AsyncAPI to make events reviewable and JetStream to make them durable. The combination works best when subjects stay simple, consumers stay pull-based, and replay is treated as a first-class operational capability.
Most teams adopting events hit the same ceiling: producers move fast, consumers multiply, and the system becomes harder to reason about than the monolith it replaced. OpenAPI solved part of that problem for HTTP by turning interfaces into contracts. Async systems need the same discipline, but with more moving parts: subject naming, retention, replay rules, delivery semantics, backpressure, and failure recovery.
Why event APIs break down without a contract
- Subject names drift because nobody owns a shared taxonomy.
- Payloads evolve informally, so downstream breakage is detected in production.
- Replay semantics are undocumented, even when the broker supports them.
- Consumers inherit hidden operational assumptions around ack timing and delivery mode.
That is the practical case for pairing AsyncAPI with JetStream. AsyncAPI defines what can be published or consumed; JetStream defines how those messages survive failures, scale across workers, and get replayed when reality diverges from design.
Architecture & Implementation
Model the event surface first
Start by treating subjects as stable API resources. In AsyncAPI, document the server, channel address, payload schema, and operation direction. Keep the subject hierarchy shallow enough to reason about ownership, but expressive enough to support filtering and routing.

    asyncapi: 3.1.0
    info:
      title: Orders Event API
      version: 1.0.0
    servers:
      production:
        host: nats.example.internal:4222
        protocol: nats
    channels:
      ordersCreated:
        address: orders.created
        messages:
          orderCreated:
            payload:
              type: object
              properties:
                orderId:
                  type: string
                occurredAt:
                  type: string
    operations:
      publishOrderCreated:
        action: send
        channel:
          $ref: '#/channels/ordersCreated'
        messages:
          # Operation messages must point at messages defined on the referenced channel.
          - $ref: '#/channels/ordersCreated/messages/orderCreated'

The important design move is not the YAML. It is the boundary it creates. Once a subject and payload live in a versioned contract, reviewers can ask the right questions before code ships: who owns this event, how long is it retained, can it be replayed, and what happens if consumers fall behind?
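One way to make that boundary enforceable is to check payloads against the contract before they are published. The following is a minimal sketch, assuming the `orderCreated` schema above and treating both fields as required (the contract as written does not declare `required`). `ORDER_CREATED_SCHEMA` and `validate_payload` are hypothetical names for illustration, not part of any AsyncAPI tooling.

```python
import json

# Hypothetical in-code mirror of the orderCreated payload schema.
# Assumption: both fields are required; the contract itself does not say so.
ORDER_CREATED_SCHEMA = {
    "orderId": str,
    "occurredAt": str,
}


def validate_payload(raw: bytes, schema: dict) -> list[str]:
    """Return a list of problems; an empty list means the payload conforms."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["payload must be a JSON object"]
    problems = []
    for field, expected in schema.items():
        if field not in data:
            problems.append(f"missing field: {field}")
        elif not isinstance(data[field], expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    return problems


good = json.dumps({"orderId": "o-123", "occurredAt": "2026-05-21T10:00:00Z"}).encode()
bad = json.dumps({"orderId": 123}).encode()

print(validate_payload(good, ORDER_CREATED_SCHEMA))
print(validate_payload(bad, ORDER_CREATED_SCHEMA))
```

In practice a team would more likely generate this check from the AsyncAPI document than maintain it by hand; the point is only that a versioned contract gives producers something concrete to test against.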
Map contracts onto streams deliberately
JetStream does not store subjects directly; it stores streams that bind subject sets to persistence, replication, retention, and limits. That means the key implementation choice is not merely naming a subject, but deciding which subjects belong to the same operational domain.
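
    # Start a server with JetStream enabled.
    nats-server -js

    # One stream for the ORDERS domain: file-backed, limits-based retention, one-year max age.
    nats stream add ORDERS --subjects "ORDERS.*" --ack --max-msgs=-1 --max-bytes=-1 --max-age=1y --storage file --retention limits --max-msg-size=-1 --discard=old

    # Two durable pull consumers, each filtered to a single subject, with explicit acks.
    nats consumer add ORDERS NEW --filter ORDERS.received --ack explicit --pull --deliver all --max-deliver=-1 --sample 100
    nats consumer add ORDERS DISPATCH --filter ORDERS.processed --ack explicit --pull --deliver all --max-deliver=-1 --sample 100
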
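NATS subjects are case-sensitive, so a stream bound to ORDERS.* would not capture the orders.created subject from the contract above; settle on one casing convention and use it in both places. The CLI pattern still matters because it encodes the production defaults most teams actually need: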
- File storage when replay matters more than maximum ephemeral speed.
- Limits retention when streams are event logs rather than one-message-per-task queues.
- Explicit acks so redelivery is observable and intentional.
- Pull consumers so worker pools control backpressure instead of receiving unbounded pushes.
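Deciding which subjects a stream or consumer filter will capture depends on how NATS wildcards behave: `*` matches exactly one token and `>` matches one or more trailing tokens. The sketch below is a simplified stand-alone model of those rules for reasoning about bindings; `subject_matches` is a hypothetical helper, not a function from any NATS client library.

```python
def subject_matches(pattern: str, subject: str) -> bool:
    """NATS-style match: '*' is exactly one token, '>' is one or more trailing tokens."""
    p_tokens = pattern.split(".")
    s_tokens = subject.split(".")
    for i, p in enumerate(p_tokens):
        if p == ">":
            # '>' must be the last pattern token and needs at least one token left to match.
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens):
            return False
        if p != "*" and p != s_tokens[i]:
            return False
    return len(p_tokens) == len(s_tokens)


for candidate in ("ORDERS.received", "ORDERS.eu.received", "orders.created"):
    print(candidate, subject_matches("ORDERS.*", candidate))
```

A single-token binding such as `ORDERS.*` is deliberately narrow. If the taxonomy later grows a region or tenant token, the stream needs either a wider binding such as `ORDERS.>` or an explicit list of subjects.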
NATS documentation also recommends thinking carefully about replication. For most production streams, Replicas=3 is the practical high-availability target, while Replicas=2 tends to add complexity without the same fault tolerance payoff.
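The reasoning behind that recommendation can be checked with a little arithmetic. This is a sketch assuming majority-quorum replication, where a replicated stream needs more than half of its replicas available to make progress; the function names are illustrative.

```python
def quorum(replicas: int) -> int:
    """Smallest majority of a replica group."""
    return replicas // 2 + 1


def tolerated_failures(replicas: int) -> int:
    """How many replicas can be lost while a majority is still reachable."""
    return replicas - quorum(replicas)


for r in (1, 2, 3, 5):
    print(f"replicas={r} quorum={quorum(r)} tolerated_failures={tolerated_failures(r)}")
```

Under that assumption, two replicas need both nodes for a majority, so they tolerate no more failures than one replica does, while three replicas tolerate the loss of one node.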
Choose consumer semantics on purpose
JetStream gives teams enough consumer modes to be dangerous. The cleanest default for business workflows is a durable pull consumer with AckExplicit. It is easier to scale horizontally, easier to reason about under partial failure, and easier to benchmark honestly.
- Choose pull consumers for worker fleets, batch processors, and services that need predictable backpressure.
- Choose push consumers for simpler fan-out or low-friction delivery into subscribers that can absorb flow continuously.
- Use work-queue retention only when the stream should act like a task queue rather than a replayable event log.
- Separate integration events from commands; they age differently and deserve different retention policies.
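To make the pull-plus-explicit-ack behavior concrete, here is a toy in-memory model. It is a sketch for reasoning about delivery semantics only: `ToyPullConsumer`, its fields, and the 30-second `ack_wait` are assumptions for illustration, and none of this is the NATS client API.

```python
from dataclasses import dataclass, field


@dataclass
class ToyPullConsumer:
    """In-memory model of pull delivery with explicit acks. Not the NATS client API."""
    messages: list[str]
    ack_wait: float = 30.0
    next_index: int = 0
    pending: dict[int, float] = field(default_factory=dict)    # seq -> redelivery deadline
    deliveries: dict[int, int] = field(default_factory=dict)   # seq -> delivery count

    def fetch(self, batch: int, now: float) -> list[tuple[int, str]]:
        """Hand out up to `batch` messages: expired unacked ones first, then new ones."""
        out: list[int] = []
        for seq in sorted(s for s, deadline in self.pending.items() if deadline <= now):
            if len(out) == batch:
                break
            out.append(seq)
        while len(out) < batch and self.next_index < len(self.messages):
            out.append(self.next_index + 1)  # sequences start at 1
            self.next_index += 1
        for seq in out:
            self.pending[seq] = now + self.ack_wait
            self.deliveries[seq] = self.deliveries.get(seq, 0) + 1
        return [(seq, self.messages[seq - 1]) for seq in out]

    def ack(self, seq: int) -> None:
        """Explicit ack: only now does the message stop being eligible for redelivery."""
        self.pending.pop(seq, None)


consumer = ToyPullConsumer(["a", "b", "c"])
first = consumer.fetch(2, now=0.0)    # worker asks for two messages
consumer.ack(1)                       # ...but only acks the first
second = consumer.fetch(2, now=10.0)  # nothing has expired yet, so only new work arrives
consumer.ack(3)
third = consumer.fetch(2, now=31.0)   # message 2 passed its ack deadline and comes back
print(first, second, third, consumer.deliveries)
```

The worker decides how much to take and when, which is where the backpressure comes from, and the delivery count for an unacked message rises visibly, which is what makes redelivery observable.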
Build replay into the design, not the postmortem
Replay is where JetStream changes the economics of event-driven APIs. Instead of treating failures as one-off incidents, you can model recovery paths directly:
- Bootstrap a new service by replaying retained historical subjects.
- Rebuild a derived read model after a schema or logic bug.
- Backfill analytics without asking upstream teams for special exports.
- Test consumer correctness against real traffic patterns instead of synthetic fixtures.
That only works if the contract says what replay means. AsyncAPI should describe the message shape; your platform conventions should describe replay windows, ordering assumptions, idempotency keys, and whether events are immutable facts or mutable state notifications.
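Replay is only safe when handlers tolerate seeing the same event more than once. The sketch below shows one common approach, deduplicating on an idempotency key, assuming every event carries a unique `event_id`. The `OrderEvent` fields and the `RevenueReadModel` class are hypothetical and are not taken from the contract above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderEvent:
    event_id: str       # hypothetical idempotency key carried by every event
    order_id: str
    amount_cents: int


class RevenueReadModel:
    """Derived view that stays correct when the same events are delivered again."""

    def __init__(self) -> None:
        self.total_cents = 0
        self._seen: set[str] = set()

    def apply(self, event: OrderEvent) -> bool:
        """Apply an event once; return False if it was a duplicate."""
        if event.event_id in self._seen:
            return False
        self._seen.add(event.event_id)
        self.total_cents += event.amount_cents
        return True


log = [
    OrderEvent("e1", "o-1", 1500),
    OrderEvent("e2", "o-2", 2500),
]

model = RevenueReadModel()
for event in log + log:  # simulate a full replay on top of the original delivery
    model.apply(event)

rebuilt = RevenueReadModel()
for event in log:        # rebuild from scratch using retained history only
    rebuilt.apply(event)

print(model.total_cents, rebuilt.total_cents)
```

A real service would keep the seen-key set in durable storage alongside the read model; the in-memory set here only illustrates the invariant that replaying the log never changes the result.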
Benchmarks & Metrics
Broker benchmarks are easy to weaponize and easy to misread. The only useful benchmark is one tied to a concrete workload. NATS publishes representative JetStream numbers through nats bench; they are valuable because they show how performance changes when you introduce persistence instead of measuring broker speed in a vacuum.
| Workload | Configuration | Official Result | What It Means |
|---|---|---|---|
| Async publish | 100,000 messages, 128 B, file storage, replicas=1 | 403,828 msgs/sec, about 49 MiB/sec, ~2.48 µs/msg | JetStream can sustain durable ingest rates that are already high enough for many internal APIs. |
| Synchronous publish | Same message shape and storage profile | 31,018 msgs/sec, about 3.79 MiB/sec | Round-trip confirmation is safer but much more expensive per message. |
| Consume | 4 clients, 100,000 messages, 128 B | 382,837 msgs/sec, about 46.73 MiB/sec | Well-tuned pull consumers can keep up with durable publish rates. |
| Core NATS baseline | Pub-sub transport, no JetStream persistence | Official examples show sub-millisecond latencies with millions of messages per second possible | Use core transport numbers as an upper bound, not a planning number for persisted workloads. |
The bigger lesson is architectural, not numerical. The step from core NATS to JetStream is the step from transport to platform capability. You trade some raw speed for features that usually matter more in production: replay, consumer state, retention, and recovery.
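The derived figures in the table follow directly from the message rate and the 128 B message size, which makes it easy to sanity-check any benchmark output you collect yourself. A small sketch of that arithmetic, with illustrative function names:

```python
MESSAGE_BYTES = 128
MIB = 1024 * 1024


def throughput_mib_per_sec(msgs_per_sec: float, message_bytes: int = MESSAGE_BYTES) -> float:
    """Payload throughput implied by a message rate."""
    return msgs_per_sec * message_bytes / MIB


def micros_per_message(msgs_per_sec: float) -> float:
    """Average time budget per message at a given rate."""
    return 1_000_000 / msgs_per_sec


async_rate, sync_rate, consume_rate = 403_828, 31_018, 382_837

print(f"async:   {throughput_mib_per_sec(async_rate):.2f} MiB/s, "
      f"{micros_per_message(async_rate):.2f} us/msg")
print(f"sync:    {throughput_mib_per_sec(sync_rate):.2f} MiB/s")
print(f"consume: {throughput_mib_per_sec(consume_rate):.2f} MiB/s")
print(f"async vs sync publish: {async_rate / sync_rate:.1f}x")
```

The last line is the number worth remembering: in these figures, waiting for a confirmation on every publish costs roughly an order of magnitude in throughput.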
Metrics worth watching before you tune anything
- Consumer lag: the fastest signal that workers are under-provisioned or blocked downstream.
- Redelivery count: reveals retry storms, slow handlers, or broken idempotency.
- Ack latency: a better operational metric than raw message rate for business pipelines.
- Stream depth and age: tells you whether retention is serving replay or quietly hiding stuck consumers.
- JetStream advisories on subjects like $JS.EVENT.ADVISORY.>: useful for platform-wide operational hooks.
Measure those first, then benchmark the exact producer and consumer profile you intend to run. For most teams, the bottleneck is not the broker; it is the application logic attached to the event.
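As a rough illustration of how the first three metrics can be derived, the sketch below computes lag and ack-latency percentiles from a snapshot of counters. `ConsumerSnapshot` and its field names are hypothetical, and the lag formula is a simplification that assumes an unfiltered consumer reading the whole stream.

```python
import math
from dataclasses import dataclass


@dataclass
class ConsumerSnapshot:
    """Hypothetical snapshot of the numbers a monitoring job would collect."""
    stream_last_seq: int           # newest sequence stored in the stream
    ack_floor_seq: int             # everything at or below this sequence is acked
    redelivered: int               # messages delivered more than once
    ack_latencies_ms: list[float]  # time from delivery to ack, per message


def consumer_lag(s: ConsumerSnapshot) -> int:
    """Messages stored but not yet fully acknowledged (unfiltered consumer assumed)."""
    return max(0, s.stream_last_seq - s.ack_floor_seq)


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile; enough for dashboards, not for statistics papers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


snap = ConsumerSnapshot(
    stream_last_seq=10_500,
    ack_floor_seq=10_000,
    redelivered=12,
    ack_latencies_ms=[10, 12, 11, 13, 250, 12, 11, 10, 14, 12],
)

print("lag:", consumer_lag(snap))
print("p50 ack latency ms:", percentile(snap.ack_latencies_ms, 50))
print("p95 ack latency ms:", percentile(snap.ack_latencies_ms, 95))
```

In this sample the median looks healthy while the tail does not, which is the usual reason to track ack latency as a percentile instead of an average.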
Strategic Impact
The real payoff of AsyncAPI plus JetStream is not that it makes pub/sub prettier. It changes how platform teams govern asynchronous systems.
What changes for engineering organizations
- Events stop being tribal knowledge and become reviewable interface assets.
- Replay becomes a standard recovery tool instead of a custom incident response script.
- Consumer onboarding gets faster because schemas, subjects, and expectations are explicit.
- Platform teams can standardize naming, retention classes, and consumer templates across domains.
- Security teams gain a cleaner place to review payload exposure and data handling boundaries.
That strategic shift matters because event-driven systems fail socially before they fail technically. A fast broker does not help if every team interprets the same event differently. AsyncAPI narrows that ambiguity. JetStream narrows the operational blast radius when one consumer is slow, broken, or newly introduced.
Where this pattern fits best
- Internal platform APIs where multiple services publish and consume the same business facts.
- Operational pipelines that need deterministic replay after defects or outages.
- Hybrid request-plus-event architectures where REST handles commands and JetStream carries facts.
- AI and analytics systems that need retained event history to rebuild features or audit decisions.
Where it fits less well is equally important. If the workload is purely ephemeral, tightly synchronous, or has only one producer and one consumer with no replay need, core NATS or even plain HTTP may be enough. JetStream earns its keep when persistence and consumer state are first-order concerns.
Road Ahead
As of May 21, 2026, the trajectory is clear. AsyncAPI is stable enough to act as a real contract system for events, and JetStream keeps adding the operational features teams asked for after the first wave of event adoption.
What is newly relevant in 2026
- AsyncAPI 3.1.0 is a backward-compatible minor release, which lowers upgrade friction for teams already on the 3.x model.
- NATS server v2.14.0 adds fast-ingest batch publishing, repeating and cron-style schedules, and a consumer reset API.
- JetStream also adds domain-aware ack and flow-control subjects behind the new jsackfc_v2 behavior, which matters for more complex topologies.
A practical adoption sequence
- Document one business domain in AsyncAPI before trying to standardize the whole company.
- Back only the important subjects with JetStream streams and explicit retention classes.
- Default to durable pull consumers and require idempotent handlers.
- Publish replay policy, ownership, and schema evolution rules as part of platform governance.
- Benchmark the real handler path, not just the broker path.
The architectural mistake to avoid is treating AsyncAPI as documentation only and JetStream as infrastructure only. The value appears when they are designed together: contracts that describe the event surface and runtime primitives that preserve, replay, and observe that surface under failure. That is what turns an event bus into an API platform.