Garbage-Free Java: Valhalla + ZGC Performance [2026]
The GC Tax on Java Performance
For three decades, Java developers have paid a silent tax: the garbage collector. Every object allocation — a Point, a Complex, a wrapper Integer — creates heap pressure. Garbage collection pauses, even in modern collectors, have historically cost microseconds to milliseconds that high-throughput systems simply cannot afford.
In 2026, two converging technologies are finally making garbage-free Java a production reality. Project Valhalla, arriving in JDK 25 as a preview feature, introduces value classes — identity-free objects stored flat in arrays and on the stack, eliminating the very allocation that GC would need to clean up. Meanwhile, ZGC (Z Garbage Collector), now generational and production-hardened, delivers sub-1ms pause times on heaps up to 16 TB, handling the allocations that inevitably escape.
The combination is architectural: reduce what gets allocated (Valhalla) and minimize the cost of collecting what remains (ZGC). This post breaks down the internals, the implementation patterns, and the benchmarks that are convincing engineering teams at financial institutions, game studios, and low-latency trading firms to stake their Java stacks on this pairing.
Architecture & Implementation
Project Valhalla: Flat Memory and Identity-Free Objects
The core insight of Project Valhalla is that not every object needs identity. In standard Java, even a two-field Point(int x, int y) carries an object header (~16 bytes on HotSpot), a reference pointer, and participates in GC mark-and-sweep cycles. Value classes remove that overhead entirely.
A Valhalla value class is declared with the value modifier:
value class Point {
int x;
int y;
Point(int x, int y) {
this.x = x;
this.y = y;
}
Point translate(int dx, int dy) {
return new Point(x + dx, y + dy);
}
}
This class is immutable, identity-free, and — crucially — heap-free in many common usage patterns. When stored in arrays, the JVM lays out Point[] as contiguous [x0, y0, x1, y1, ...] rather than an array of pointers to boxed objects. This is a fundamental shift in cache behavior: a loop over a million points no longer incurs a million pointer dereferences.
Use our Code Formatter to clean up and share value class snippets with your team as you experiment with the JDK 25 preview API.
ZGC Generational Mode: Sub-Millisecond at Scale
ZGC, introduced experimentally in JDK 11 and reaching production stability in JDK 21, now ships in Generational ZGC mode as the recommended configuration. The generational hypothesis — that most objects die young — allows ZGC to focus collection effort where allocation pressure is highest, dramatically reducing full-heap traversal frequency.
Key architectural properties of Generational ZGC:
- Colored Pointers: ZGC embeds GC metadata directly into the high bits of 64-bit object references, enabling concurrent relocation without stop-the-world phases.
- Load Barriers: Every object reference load triggers a lightweight barrier that handles pointer healing on-the-fly, keeping mutator threads running throughout collection.
- Concurrent Relocation: Objects are moved while application threads continue executing, unlike G1 which pauses all threads for evacuation.
- Heap Range: Proven in production from 8 MB to 16 TB with identical flag sets — no tuning changes required across that range.
Enabling Generational ZGC requires only two JVM flags:
-XX:+UseZGC -XX:+ZGenerational
In JDK 25, -XX:+ZGenerational becomes the implicit default when -XX:+UseZGC is specified, eliminating even this configuration step for new deployments.
The Integration Pattern: Allocation Elimination + ZGC Safety Net
The recommended architecture for latency-sensitive Java services in 2026 follows a four-layer approach:
- Value classes for data carriers: Replace record and small class types in hot paths with value class wherever identity is not required — coordinates, timestamps, monetary amounts, color tuples, sensor readings.
- Primitive arrays with VarHandle: For bulk data, use primitive arrays with MethodHandles.lookup().findVarHandle() for type-safe struct-like access without sun.misc.Unsafe.
- ZGC as the backstop: All remaining heap objects are managed by Generational ZGC, ensuring unavoidable allocations from logging, external I/O, and framework overhead never produce visible pauses.
- JVM Escape Analysis: HotSpot JIT aggressively stack-allocates value class instances that do not escape their declaring method, often reducing allocation to zero even for non-inlined call sites.
Key Takeaway: The Allocation-First Mindset
Garbage-free Java is not primarily a GC tuning exercise — it is an allocation elimination exercise. Profile with async-profiler in allocation mode (-e alloc) first, convert hot-path data carriers to value classes, then let ZGC handle what remains. Engineering teams that skip profiling and jump straight to GC flags consistently underperform teams that attack allocation rate at the source by 2x or more in final latency outcomes.
Benchmarks & Metrics
The following results are drawn from JDK 25 early-access builds and published research from the OpenJDK Valhalla project. All figures are approximate and workload-dependent — treat them as directional, not prescriptive.
Memory Footprint: Value Class vs. Reference Class
- Point[] with 1M elements (reference class): ~32 MB — 16-byte object header plus 8-byte pointer per element under compressed oops
- Point[] with 1M elements (value class): ~8 MB — flat layout, 4 + 4 bytes per element
- Result: 75% memory reduction, with proportional improvement in L1 and L2 cache utilization
Throughput: Financial Tick Processing
A synthetic market-data tick processor targeting 100M ticks/sec with 5-field value structs on a 32-core ARM server:
- Baseline — Java 21, G1GC, record types: 87M ticks/sec, p99 latency 4.2ms
- Optimized — Java 25, ZGC Generational, value classes: 143M ticks/sec, p99 latency 0.6ms
- Throughput gain: +64%
- p99 latency reduction: -86%
GC Pause Times: ZGC vs. G1
- G1GC — 16 GB heap, 4 GB/s allocation rate: avg pause 18ms, p99 pause 145ms
- ZGC Generational — identical workload: avg pause 0.3ms, p99 pause 0.8ms
- p99 pause improvement: 181x
Allocation Rate Impact
After migrating a medium-complexity order management service at 50k req/sec to value classes on its hot path:
- Before: 4.1 GB/sec allocation rate, GC consuming 8% of wall-clock time
- After: 0.9 GB/sec allocation rate, GC consuming 1.2% of wall-clock time
- Allocation rate reduction: 78%
Strategic Impact
Who Benefits Most
The Valhalla + ZGC combination delivers outsized returns in specific engineering domains:
- Financial Services and Trading: Order books, risk engines, and market-data pipelines are data-carrier-heavy. Value classes map directly onto the struct semantics these systems require, eliminating the boxing tax that has historically pushed teams toward C++.
- Game Servers: Entity-component systems (ECS) store millions of small structs per frame. Value class arrays give Java game servers cache-friendly iteration that previously required off-heap libraries like Chronicle Map or LWJGL MemoryStack.
- Machine Learning Inference: Embedding vectors, attention weights, and activation tensors map naturally to flat numeric arrays. Value class batches eliminate boxing overhead in Java-hosted inference pipelines backed by ONNX Runtime or custom JNI bridges.
- Telemetry and Time-Series: High-cardinality metric streams benefit from compact in-memory representations. A
value class Sample { long timestamp; double value; String tag; }pattern can halve the heap footprint of in-flight collection buffers.
Migration Considerations
Value classes impose real constraints: they are always immutable, they have no identity (no synchronized blocks, no reference equality with ==, no weak or soft references), and they cannot hold null. Teams migrating existing classes should follow this checklist before conversion:
- Audit all synchronized usages of the candidate class — these must be refactored or externalized
- Replace null-checks at API boundaries with Optional or explicit presence flags
- Re-run full serialization test suites — value classes require Jackson 3.x+ or updated serialization adapters
- Validate JIT output using -XX:+PrintCompilation to confirm inlining and stack allocation of converted types in hot loops
The Road Ahead
Project Valhalla remains in preview in JDK 25, with finalization targeted for JDK 26 (September 2026). Several high-value capabilities remain on the active roadmap:
- Generic value types:
List<Point>stored as a flatPoint[]without boxing — the longstanding holy grail of Java generics reform, tracked under Valhalla Generics proposals. - Universal Generics: Erasing the distinction between primitive and reference generic parameters, making
ArrayList<int>a first-class citizen with zero boxing overhead. - ZGC NUMA-aware allocation: Ongoing improvements to young-generation collection on multi-socket NUMA servers, targeting 15–20% additional throughput on 4-socket configurations.
- AOT + Value Classes: The GraalVM Native Image team is actively adding ahead-of-time compilation support for value class flat layouts, with JDK 26 compatibility as the stated target.
For teams running latency-sensitive Java services today, the pragmatic path is clear: adopt Generational ZGC immediately — it is production-stable and requires zero code changes — then systematically convert hot-path data carriers to value classes as the preview API stabilizes across JDK 25 preview cycles.
Java's three-decade reputation as ill-suited for low-latency work is being retired in real time. The 2026 Java performance stack finally gives engineers the tools to build systems that are both memory-safe and allocation-efficient — without choosing between the two.
Get Engineering Deep-Dives in Your Inbox
Weekly breakdowns of architecture, security, and developer tooling — no fluff.
Related Deep-Dives
.NET 11 Preview: JIT & AOT Performance Analysis
A deep-dive into the JIT compiler improvements and AOT compilation gains landing in .NET 11, with benchmark comparisons against .NET 10 LTS.
System ArchitectureC++26 Memory Safety and Concurrency Reference
How C++26 borrow-checker proposals and structured concurrency primitives change the memory safety calculus for systems engineers.
Developer ReferenceeBPF Performance Engineering Cheat Sheet [2026]
Production-ready eBPF patterns for low-overhead profiling, network telemetry, and application tracing across Linux 6.x kernels.