CVE-2026-15902: GPU Orchestrator Logic Flaws [2026]
Bottom Line
As of May 6, 2026, CVE-2026-15902 does not appear in public CVE records, but the underlying attack class is real and already documented in NVIDIA KAI Scheduler CVE-2026-24176. The critical lesson is that a scheduler-level tenant-binding flaw can undermine GPU isolation before the workload ever reaches the device.
Key Takeaways
- No public MITRE/NVD record matched CVE-2026-15902 on May 6, 2026.
- NVIDIA fixed the adjacent cross-namespace flaw in KAI Scheduler v0.13.0.
- The verified weakness is improper authorization via cross-namespace pod references.
- Multi-tenant risk comes from control-plane trust breaks, not only GPU driver bugs.
- Admission checks must bind namespace, queue, claim, and reservation as one security unit.
The headline CVE here needs one correction before any serious analysis starts: as of May 6, 2026, there is no public MITRE or NVD record for CVE-2026-15902. That does not make the threat model hypothetical. A closely matching, vendor-confirmed case already exists in NVIDIA KAI Scheduler's April 2026 security bulletin, where CVE-2026-24176 describes improper authorization through cross-namespace pod references in a multi-tenant GPU scheduler.
CVE Summary Card
Bottom Line
The missing public record for CVE-2026-15902 is itself a reminder to verify identifiers before triage. The actionable security story is the verified NVIDIA case: broken namespace-to-resource authorization in a GPU orchestrator can let one tenant tamper with another tenant's scheduling state.
- Requested identifier: CVE-2026-15902
- Public status on May 6, 2026: no matching public CVE entry located in MITRE/NVD searches
- Verified adjacent case: CVE-2026-24176 in NVIDIA KAI Scheduler
- Vendor description: improper authorization through cross-namespace pod references
- Severity: 4.3 (Medium), CVSS vector AV:N/AC:L/PR:L/UI:N/S:U/C:N/I:L/A:N, per NVIDIA
- Affected versions: all versions prior to v0.13.0
- Remediation: update to v0.13.0 or later
Why spend time on a medium-severity scheduler bug? Because CVSS often underrates orchestrator flaws in shared AI platforms. A namespace-scoping error in a CPU-only batch scheduler is already dangerous. In a GPU control plane, it can become the first move in a larger chain involving unfair placement, claim hijacking, noisy-neighbor denial, or indirect data exposure via shared reservation state.
The official anchors for this analysis are NVIDIA's bulletin, the KAI Scheduler repository, and package metadata showing v0.13.0 published on Go Packages on March 2, 2026.
Vulnerable Code Anatomy
Where the trust boundary actually sits
Multi-tenant GPU orchestrators do much more than place pods. They map queue labels, pod groups, reservation objects, dynamic resource claims, admission mutations, and binder actions into a single state graph. The security boundary is not the GPU device node alone. It is the chain of references that says which tenant is allowed to consume, mutate, or wait on a particular reservation.
KAI Scheduler's public docs and issue traces expose the moving parts clearly enough to reason about the flaw class:
- The scheduler supports GPU Sharing, Dynamic Resource Allocation, queues, and PodGroups.
- The control plane includes components such as admission, binder, scheduler, and podgrouper.
- Public issue traces reference binder paths such as pkg/binder/binding/resourcereservation/resource_reservation.go, which is where reservation state is coordinated.
That architecture is efficient, but it creates a classic logic-flaw hazard: one controller resolves a reference, another authorizes it, and a third performs the action. If any stage treats namespace, queue, owner, or claim identity as optional context instead of mandatory identity, the system can accept a cross-tenant edge that should never exist.
Illustrative vulnerable pattern
The following pseudocode is conceptual, not vendor source, but it captures the failure mode implied by the advisory:
func authorizeReservationRef(req PodSpec) bool {
    refName := req.annotations["reservationRef"]
    obj, found := reservations.Get(refName) // name-only, cluster-wide lookup
    if !found {
        return false
    }
    // Trusts a user-controlled queue label instead of the pod's namespace.
    return obj.queue == req.labels["kai.scheduler/queue"]
}

The mistake is subtle:
- The lookup is effectively cluster-wide or insufficiently scoped.
- The authorization decision trusts a user-controlled or weakly bound queue label.
- The object identity is not tied to namespace + queue + owner + claim as a single composite key.
Once you see the pattern, the exploit path is obvious even without device-level escape primitives. The attacker does not need to break CUDA memory isolation first. They only need to coerce the control plane into believing a foreign reservation belongs to their workload.
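For contrast, here is a minimal sketch of the same lookup done with a composite identity key. This is illustrative code, not KAI Scheduler source: the type names, the `kai.scheduler/queue` label, and the store shape are all assumptions chosen to mirror the pseudocode above.

```go
package main

import "fmt"

// ReservationKey binds every identity field into one comparable unit.
// A miss on any single field means the lookup, and the authorization, fails.
type ReservationKey struct {
	Namespace string
	Queue     string
	OwnerUID  string
	ClaimUID  string
}

type Reservation struct {
	Key ReservationKey
}

// PodRef carries the requesting workload's server-observed identity,
// not just a user-supplied name or label. (Hypothetical type.)
type PodRef struct {
	Namespace string
	Queue     string
	OwnerUID  string
	ClaimUID  string
}

// ReservationStore is indexed by the full composite key instead of name alone.
type ReservationStore struct {
	byKey map[ReservationKey]*Reservation
}

func (s *ReservationStore) Get(k ReservationKey) (*Reservation, bool) {
	r, ok := s.byKey[k]
	return r, ok
}

// authorizeReservationRef succeeds only when every identity field the
// requester presents matches a server-owned reservation record.
func authorizeReservationRef(s *ReservationStore, req PodRef) bool {
	key := ReservationKey{
		Namespace: req.Namespace,
		Queue:     req.Queue,
		OwnerUID:  req.OwnerUID,
		ClaimUID:  req.ClaimUID,
	}
	_, ok := s.Get(key) // a mismatch on any field is simply a miss: deny
	return ok
}

func main() {
	store := &ReservationStore{byKey: map[ReservationKey]*Reservation{}}
	key := ReservationKey{Namespace: "tenant-a", Queue: "team-a", OwnerUID: "uid-1", ClaimUID: "claim-1"}
	store.byKey[key] = &Reservation{Key: key}

	// Same tenant, full identity match: allowed.
	fmt.Println(authorizeReservationRef(store, PodRef{Namespace: "tenant-a", Queue: "team-a", OwnerUID: "uid-1", ClaimUID: "claim-1"}))
	// Foreign namespace that copied the victim's queue label: denied.
	fmt.Println(authorizeReservationRef(store, PodRef{Namespace: "tenant-b", Queue: "team-a", OwnerUID: "uid-1", ClaimUID: "claim-1"}))
}
```

The design point is that the attacker's copied queue label buys nothing: the namespace mismatch alone breaks the composite key, so the lookup misses and the reference is denied.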
What changed in the fixed line
NVIDIA's public guidance is simply “upgrade to v0.13.0 or later,” but later release notes also show a broader hardening direction. The v0.13.0 line includes changes such as blocking pods with shared DRA GPU claims that lack a queue label or have a mismatched queue label. That is exactly the kind of tightening you would expect after discovering that identity fields were too loosely coupled.
Attack Timeline
- March 2, 2026: package metadata shows KAI Scheduler v0.13.0 published, establishing the fixed version line.
- April 2026: NVIDIA publishes its KAI Scheduler security bulletin covering CVE-2026-24176 and CVE-2026-24177.
- April 21, 2026: NVIDIA's bulletin revision history lists its initial release on this date.
- May 6, 2026: searches for CVE-2026-15902 still do not surface a public MITRE or NVD record, so defenders should avoid pinning response workflows to that identifier alone.
The practical takeaway from the timeline is that release artifacts can precede or outlast clean public vulnerability indexing. If your AI platform depends on third-party schedulers, package versions and vendor bulletins often matter more than whether every downstream database has caught up.
Exploitation Walkthrough
Preconditions
- The attacker already has legitimate access to one tenant namespace.
- The cluster uses a GPU-aware scheduler with reservation or claim indirection.
- Authorization is checked on references, but not on the full tenant context behind those references.
Conceptual attack sequence
- The attacker studies accepted workload fields such as labels, annotations, PodGroup references, reservation names, or DRA claim handles.
- They identify a field path where the scheduler dereferences an object by name or loose selector rather than by strict namespace-owner binding.
- They submit a crafted workload that points at, collides with, or shadows another tenant's scheduler object.
- The admission layer accepts the object because the reference looks structurally valid.
- The binder or scheduler resolves the reference and updates scheduling state as if the attacker's workload were entitled to it.
- The result is one of three outcomes: unauthorized placement, reservation theft, or state tampering that starves or blocks another tenant.
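Steps 3 through 5 can be condensed into a toy reproduction of the loose lookup. Everything here is hypothetical, modeled on the earlier illustrative pseudocode rather than on real scheduler source: a cluster-wide index keyed by name alone, and an authorization check that trusts the pod's own queue label.

```go
package main

import "fmt"

// reservation is a toy scheduler object owned by tenant-a.
type reservation struct {
	name, namespace, queue string
}

// byName is the hazardous convenience: a cluster-wide index keyed by name
// alone, with no namespace scoping.
var byName = map[string]reservation{
	"resv-victim": {name: "resv-victim", namespace: "tenant-a", queue: "team-a"},
}

type pod struct {
	namespace string
	labels    map[string]string
	annos     map[string]string
}

// looseAuthorize resolves the reference by name and then compares against a
// user-controlled label, never consulting the pod's namespace.
func looseAuthorize(p pod) bool {
	r, ok := byName[p.annos["reservationRef"]]
	return ok && r.queue == p.labels["kai.scheduler/queue"]
}

func main() {
	attacker := pod{
		namespace: "tenant-b",                                          // foreign tenant
		labels:    map[string]string{"kai.scheduler/queue": "team-a"},  // copied victim label
		annos:     map[string]string{"reservationRef": "resv-victim"},  // foreign reference
	}
	// The check passes even though the pod lives in the wrong namespace.
	fmt.Println(looseAuthorize(attacker))
}
```

Nothing in the crafted pod is malformed, which is why a purely structural admission check waves it through: the only thing wrong with it is ownership, and ownership is exactly what the loose check never verifies.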
Why this matters even without VRAM reads
Security teams often prioritize GPU bugs only when they promise direct memory disclosure. That is too narrow. A scheduling flaw can still cause serious damage:
- Availability impact: an attacker can delay, starve, or deadlock expensive training and inference jobs.
- Integrity impact: tenant B's job may run under an altered reservation topology or resource envelope.
- Confidentiality side effects: once placement guarantees break, downstream assumptions about data locality, dedicated nodes, or trusted peers may also fail.
In other words, the orchestrator is part of the trusted computing base. If it lies about ownership, the rest of the platform inherits that lie.
Hardening Guide
Immediate remediation
- Upgrade KAI Scheduler to v0.13.0 or later wherever NVIDIA's bulletin applies.
- Audit workloads for queue-label drift, especially where queue identity is set by user manifests rather than admission policy.
- Review RBAC on scheduler CRDs, reservation objects, and any API used by binder or pod-group controllers.
Control-plane defenses that actually matter
- Use composite authorization keys. Every reservation lookup should bind namespace + queue + owner UID + claim UID.
- Deny cross-namespace references by default. Make exceptions explicit, typed, and logged.
- Move trust from labels to server-owned fields. User-editable labels are useful selectors, not identity roots.
- Re-validate at action time. Authorization at admission is not enough if the binder can act on stale or reinterpreted references later.
- Write negative tests. Most orchestration teams test that valid references work. Fewer test that near-valid foreign references fail.
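The negative-test idea from the last bullet can be made concrete with a small table-driven check: submit near-valid foreign references and require a hard deny. The `authorize` function below is a stand-in under the composite-key assumption, not real scheduler code.

```go
package main

import "fmt"

// refCase describes one reference attempt and the verdict we require.
type refCase struct {
	podNS, podQueue string
	wantAllow       bool
}

// authorize is a stand-in policy: the reference is allowed only when both
// the namespace and the queue of the pod match the reservation's own fields.
func authorize(podNS, podQueue, resvNS, resvQueue string) bool {
	return podNS == resvNS && podQueue == resvQueue
}

func main() {
	// The reservation under test belongs to tenant-a / team-a.
	cases := []refCase{
		{podNS: "tenant-a", podQueue: "team-a", wantAllow: true},  // valid same-tenant reference
		{podNS: "tenant-b", podQueue: "team-a", wantAllow: false}, // foreign namespace, copied queue label
		{podNS: "tenant-a", podQueue: "team-b", wantAllow: false}, // own namespace, wrong queue
		{podNS: "tenant-b", podQueue: "team-b", wantAllow: false}, // fully foreign reference
	}
	for _, c := range cases {
		got := authorize(c.podNS, c.podQueue, "tenant-a", "team-a")
		if got != c.wantAllow {
			panic(fmt.Sprintf("ns=%s queue=%s: got allow=%v, want %v", c.podNS, c.podQueue, got, c.wantAllow))
		}
	}
	fmt.Println("all near-valid foreign references were denied")
}
```

The valuable rows are the middle ones: most suites only contain the first row, and it is the "almost right" foreign references that exercise the authorization edge the advisory describes.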
Operational hygiene for AI platforms
- Separate workload namespaces from scheduler system namespaces, following KAI's own installation guidance.
- Tag audit events when reservation creation, claim binding, and GPU assignment cross controller boundaries.
- Mask real tenant identifiers in shared incident artifacts before sending traces across teams; TechBytes' Data Masking Tool is useful when support logs contain namespace, queue, or dataset names.
- Continuously diff effective authorization state against desired tenancy policy, not just Kubernetes YAML.
Architectural Lessons
1. GPU security starts above the GPU
The industry still frames AI infrastructure risk around drivers, runtimes, and device isolation. Those layers matter, but modern GPU platforms are orchestrator-heavy systems. A scheduler that misbinds tenant identity can nullify perfect device isolation by sending the wrong work, claim, or reservation into the wrong execution path.
2. Reference graphs are attack surfaces
Kubernetes-native AI platforms are built from graphs: pods reference groups, groups reference queues, queues imply quotas, claims point to devices, binders create reservations, and controllers reconcile all of it asynchronously. Every edge in that graph is an authorization decision. If even one edge uses convenience semantics such as name-only lookup, inferred namespace, or mutable label trust, attackers get room to maneuver.
3. Medium CVSS can still mean high blast radius
CVE-2026-24176 is not scored like a critical RCE, but blast radius is environment-dependent. In a premium multi-tenant GPU cluster, a single unauthorized reservation mutation can disrupt jobs worth far more than a typical “medium” label suggests.
4. Release notes are security documents
One of the most useful signals in this case is not the CVE text itself but the surrounding release behavior: a fixed version line, stronger queue-label validation, and continued tightening around shared GPU claims. For platform teams, that is a reminder to read scheduler changelogs the same way they read kernel advisories.
The durable lesson is simple. Whether the identifier eventually lands as CVE-2026-15902, stays private, or turns out to be a mistaken reference, the exploit class is already here in public: multi-tenant GPU orchestrators fail dangerously when tenant identity is reconstructed from weak references instead of enforced as a first-class security boundary.
Frequently Asked Questions
Is CVE-2026-15902 a real public CVE as of May 6, 2026?
No. Searches of MITRE and NVD on that date surfaced no matching record; the verified adjacent case is CVE-2026-24176 in NVIDIA KAI Scheduler.
What was actually vulnerable in NVIDIA KAI Scheduler?
Improper authorization through cross-namespace pod references, affecting all versions prior to v0.13.0 and fixed in v0.13.0.
Why are scheduler logic flaws dangerous in multi-tenant GPU clusters?
Because the orchestrator is part of the trusted computing base: a misbound tenant identity enables unauthorized placement, reservation theft, or state tampering before any device-level isolation is ever tested.
How should teams harden GPU orchestrators against this class of bug?
Bind namespace, queue, owner UID, and claim identity together; reject cross-namespace edges by default; and add end-to-end negative tests that submit foreign references and expect a hard deny.