AI Code Supply Chain Attacks: Ghost Packages [2026]
Bottom Line
The core risk is no longer that one model invents one bad dependency. It is that multiple frontier models can repeatedly invent the same package names, giving attackers a stable target for registry abuse.
Key Takeaways
- A May 16, 2026 study found 127 package names hallucinated identically across 5 frontier models.
- The 2024 benchmark measured hallucinated packages at 5.2% for commercial models and 21.7% for open models.
- PyPI disclosed a July 8, 2025 alert tied to an LLM recommending a non-existent project.
- Lockfiles, hash pinning, script suppression, and PR dependency review cut most of the blast radius.
- Treat AI-generated dependency names as untrusted input, not as implementation detail.
AI coding assistants have changed the shape of software supply chain risk. The classic problem was typo-driven package confusion; the new problem is model-driven package invention. When a model suggests a dependency that does not exist, and an attacker races to publish it, the exploit path looks deceptively ordinary: install, resolve, run scripts, leak secrets. What makes this dangerous in 2026 is repeatability. The same ghost package names can appear across multiple models, which turns hallucination into attacker reconnaissance.
- 127 shared hallucinated package names were observed across five frontier models in a May 16, 2026 re-evaluation.
- The original large study analyzed 576,000 generated samples and found hallucination rates of 5.2% for commercial models and 21.7% for open-source models.
- PyPI disclosed that on 2025-07-08 admins were alerted by a user whose LLM recommended a project that did not exist.
- The exploit usually lands during dependency installation, before application code is even reviewed or run.
CVE Summary Card
Bottom Line
This is not one bug in one package manager. It is a cross-ecosystem failure mode where AI-generated dependency names become attacker-controlled execution points.
- Identifier: No single CVE assigned as of May 21, 2026.
- Incident class: Supply chain attack via AI-generated code and hallucinated dependencies, often called slopsquatting.
- Affected surface: AI-assisted development, coding agents, CI pipelines, package installation hooks, developer laptops.
- Ecosystems named in public research: PyPI and npm.
- Verified public signals: PyPI reported an alert on 2025-07-08 involving an LLM-recommended non-existent project; PyPI later removed 1,500+ projects from a spam campaign tied to 250+ new accounts.
- Why it matters: The May 2026 replication found 127 hallucinated names shared across Claude Sonnet 4.6, Claude Haiku 4.5, GPT-5.4-mini, Gemini 2.5 Pro, and DeepSeek V3.2.
The important framing point is that this is an incident pattern, not a single implementation defect. That matters operationally. You do not patch one package and move on. You redesign the trust boundary around AI output.
Vulnerable Code Anatomy
The vulnerable pattern starts with a normal prompt: build a login flow, add JWT support, parse PDFs, send analytics. The model returns code plus a dependency that sounds plausible enough to pass a fast review. That dependency becomes the weakest link because package managers are optimized to fetch and execute software, not to ask whether the software should exist in the first place.
What the AI-generated mistake looks like
# AI-generated example for illustration only
# The package name below is intentionally fictitious.
# Shell command suggested alongside the code:
#   pip install ghost-auth-utils
from ghost_auth_utils import verify_token

def handle_request(token):
    return verify_token(token)

Nothing here looks obviously malicious. The problem is semantic, not syntactic. The model invented ghost-auth-utils; the developer trusts the suggestion; an attacker can register that exact name and attach a payload to installation or import-time behavior.
Where execution happens
- Install hooks: npm packages can execute preinstall, install, and postinstall scripts unless ignore-scripts is enabled.
- Transitive resolution: A hallucinated direct dependency can pull in additional attacker-controlled code through ordinary dependency graphs.
- CI secrets exposure: The blast radius is larger in pipelines because environment variables, repository tokens, cloud credentials, and signing material are already present.
- Agent autonomy: Coding agents compress the gap between suggestion and execution; the same system can propose, install, test, and commit the dependency.
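The install-hook point above can be checked mechanically before a package is ever installed. This sketch, which assumes you already have the candidate package's `package.json` text in hand, lists the three lifecycle hooks named above; the function name `install_time_scripts` and the sample manifest are made up for illustration.

```python
import json

# The three npm lifecycle hooks that run during installation.
LIFECYCLE_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(package_json_text: str) -> dict[str, str]:
    """Return the lifecycle scripts a package would run at install time."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts", {})
    return {hook: scripts[hook] for hook in LIFECYCLE_HOOKS if hook in scripts}


# Fictitious manifest for illustration only.
sample = json.dumps({
    "name": "ghost-auth-utils",
    "version": "1.0.0",
    "scripts": {"postinstall": "node setup.js", "test": "jest"},
})

print(install_time_scripts(sample))
```

An empty result does not make a package safe, since import-time code can carry the same payload. A non-empty result on an unfamiliar package is a strong reason to stop and review.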
Why the false package looks believable
- It often combines familiar words from real libraries.
- It matches the naming style of the target ecosystem.
- It appears in otherwise correct code, which borrows credibility from surrounding output.
- Repeated hallucinations train teams to normalize the pattern instead of rejecting it.
Attack Timeline
This timeline shows how the issue moved from research curiosity to operational supply chain concern.
- 2024-06-12: The first large academic study was submitted to arXiv, framing package hallucinations as a novel package confusion threat.
- 2025-03-02: The paper's revised version noted results across 16 popular code-generation models and 205,474 unique hallucinated package names.
- 2025-07-08: PyPI says administrators were alerted by a user whose large language model recommended installing a project that did not exist.
- 2025-07-15: PyPI disclosed it had blocked inbox.ru registrations after a spam campaign created 250+ accounts and published 1,500+ projects, which PyPI removed.
- 2026-05-16: A re-evaluation of five frontier models found hallucination rates had narrowed to roughly 4.62% to 6.10%, but also identified 127 package names hallucinated identically across all five models.
The timeline matters because the headline is not 'models got worse.' The headline is subtler: individual rates improved, but the remaining failures became more predictable across models. That predictability is exactly what an attacker needs.
Exploitation Walkthrough
1. Observe stable hallucinations
An attacker probes common coding prompts across models and records package names that recur. Shared names are ideal because they are more likely to surface in real developer workflows over time.
2. Publish a plausible package
The attacker registers the name on a public registry with benign-looking metadata, a short README, and a version history that appears normal. The goal is not sophistication. It is reducing the chance that the package looks obviously fake.
3. Attach an install-time or early-runtime payload
The payload does not need a flashy exploit chain. It only needs one useful primitive:
- Read environment variables.
- Capture tokens from CI runners.
- Drop a backdoor into build artifacts.
- Modify generated files before tests execute.
- Beacon to external infrastructure for follow-on instructions.
4. Wait for AI-assisted installation
The victim path is usually mundane: a developer copies the command, an agent resolves the dependency automatically, or a CI job runs the install after a generated patch is merged.
5. Pivot through trust already present
Once code runs inside CI, the attacker inherits the conveniences the team built for itself: package publishing tokens, cloud roles, signing keys, deployment credentials, and access to private repos.
This is why the attack is so effective. The adversary does not break your cryptography. They borrow your automation.
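# Safer pattern: treat AI dependency names as untrusted input
# Example policy sketch only; the allowlist entries and installer are hypothetical
class SecurityError(Exception):
    """Raised when a dependency fails the policy gate."""

approved_allowlist = {'requests', 'pyjwt'}
candidate = 'ghost-auth-utils'
if candidate not in approved_allowlist:
    raise SecurityError('Unapproved dependency suggested by AI')
install_with_locked_manifest(candidate)  # hypothetical helper; runs only for approved names

The difference between compromise and containment is often whether the install path was policy-gated before execution.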
Hardening Guide
The controls below are practical because they target the exact choke points attackers need.
Gate dependency introduction
- Require every new dependency to enter through a reviewed manifest change, never through an ad hoc shell command in CI.
- Use GitHub's dependency review action so pull requests show dependency diffs before merge.
- Maintain an allowlist for high-sensitivity services and production agents.
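A reviewed manifest change is easier to enforce when the pipeline can name exactly which dependencies a pull request introduces. The sketch below assumes plain `requirements.txt` files on the base and head branches and handles only simple name-and-specifier lines; the function names and the pinned versions in the sample are illustrative.

```python
import re


def requirement_names(requirements_text: str) -> set[str]:
    """Extract normalized project names from simple requirements lines."""
    names = set()
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blanks and option lines such as -r or --hash
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            # PEP 503 normalization: runs of -, _, . become a single hyphen
            names.add(re.sub(r"[-_.]+", "-", match.group(0)).lower())
    return names


def newly_introduced(base_text: str, head_text: str) -> set[str]:
    """Names present in the head manifest but not in the base manifest."""
    return requirement_names(head_text) - requirement_names(base_text)


base = "requests==2.32.3\nPyJWT==2.9.0\n"
head = "requests==2.32.3\nPyJWT==2.9.0\nghost-auth-utils==0.1.0\n"

print(sorted(newly_introduced(base, head)))
```

A CI job could fail, or require an extra approval, whenever this set is non-empty and any name in it is missing from the allowlist.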
Freeze what gets installed
- In Node projects, prefer npm ci in automation so the install is locked to package-lock.json and fails on manifest drift.
- In Python deployments, use --require-hashes so each requirement is verified against an expected digest.
- Reject builds that introduce packages outside the locked graph.
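pip enforces `--require-hashes` at install time, but a cheap pre-flight check can reject a manifest before any network access happens. This is a rough sketch rather than a full requirements parser: it assumes one requirement per logical line with backslash continuations, and the digest in the sample is a placeholder, not a real hash.

```python
def unhashed_requirements(requirements_text: str) -> list[str]:
    """Return requirement entries that lack an exact pin or a --hash option."""
    # Join backslash continuations so each requirement is one logical line.
    logical = requirements_text.replace("\\\n", " ")
    problems = []
    for raw in logical.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):
            continue  # skip blanks and global option lines
        if "==" not in line or "--hash=" not in line:
            problems.append(line.split()[0])
    return problems


placeholder_digest = "0" * 64  # placeholder only, not a real sha256 digest
sample = (
    f"requests==2.32.3 \\\n    --hash=sha256:{placeholder_digest}\n"
    "ghost-auth-utils>=0.1\n"
)

print(unhashed_requirements(sample))
```

Here the second entry is flagged because it is neither pinned with `==` nor accompanied by a hash, which is the shape an ad hoc AI-suggested addition usually takes.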
Reduce install-time execution
- Use ignore-scripts where packages do not genuinely need lifecycle hooks.
- Split build jobs so package resolution happens in a low-privilege environment, while signing and deployment happen later in isolated stages.
- Do not expose production secrets to test jobs that only need read-only repository access.
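One way to approximate a low-privilege resolution step is to build the install environment from an allowlist of variables instead of inheriting everything the runner holds. The sketch below assumes the four keys in `SAFE_ENV_KEYS` are enough for your toolchain, which may not be true everywhere; the token values are fake.

```python
# Variables the install step is allowed to see; an assumption to adjust per toolchain.
SAFE_ENV_KEYS = ("PATH", "HOME", "LANG", "TMPDIR")


def minimal_install_env(parent_env: dict[str, str]) -> dict[str, str]:
    """Copy only allowlisted variables so secrets never reach install hooks."""
    return {key: parent_env[key] for key in SAFE_ENV_KEYS if key in parent_env}


# Simulated CI environment with fake credentials.
ci_env = {
    "PATH": "/usr/bin",
    "HOME": "/home/runner",
    "NPM_TOKEN": "example-not-a-real-token",
    "AWS_SECRET_ACCESS_KEY": "example-not-a-real-key",
}

env = minimal_install_env(ci_env)
print(sorted(env))

# In a real job the result would be passed to the installer, for example:
# subprocess.run(["npm", "ci", "--ignore-scripts"], env=env, check=True)
```

An allowlist is used rather than a pattern-based denylist because unknown secret names fail closed. This only covers environment variables; credential files on disk still need a separate runner or container boundary.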
Strengthen package publishing and provenance
- Prefer Trusted Publishing on PyPI and npm so release pipelines use short-lived OIDC-based credentials instead of long-lived tokens.
- Enable npm provenance and verify attestations where your toolchain supports it.
- Use artifact attestations for internal releases so downstream systems can verify where builds came from.
Harden human and agent workflows
- Treat all AI-suggested package names as untrusted until validated against the registry, maintainer history, and project provenance.
- Disable autonomous install-and-run loops for coding agents on sensitive repositories.
- Redact logs, prompts, and generated patches before sharing them externally; if you need a quick sanitization step, TechBytes' Data Masking Tool is a good fit for security-heavy workflows.
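The registry validation step in the first bullet above can start with a simple existence check. This sketch queries PyPI's public JSON endpoint (`https://pypi.org/pypi/<name>/json`), which returns 404 for projects that do not exist; the function name and the injectable `opener` parameter are illustrative choices made so the logic can be tested offline.

```python
import urllib.error
import urllib.request

PYPI_JSON_URL = "https://pypi.org/pypi/{name}/json"


def exists_on_pypi(name: str, opener=urllib.request.urlopen) -> bool:
    """Return True if PyPI serves metadata for the project, False on a 404."""
    url = PYPI_JSON_URL.format(name=name)
    try:
        with opener(url, timeout=10) as response:
            return response.status == 200
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise  # other HTTP errors are not evidence either way
```

A `False` result means the model invented the name and the suggestion should be rejected. A `True` result is necessary but not sufficient: an attacker may already have registered the hallucinated name, so maintainer history and provenance still need review.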
Architectural Lessons
The deeper lesson is not 'AI makes mistakes.' Engineers already know that. The lesson is that AI output now crosses security boundaries that used to be separated by friction.
Lesson 1: Dependency names are data, not truth
Historically, engineers treated import lines as developer intent. In AI-assisted systems, an import can be model output with no provenance. Architecture needs to classify dependency suggestions as untrusted input until verified.
Lesson 2: Shared model failure creates attacker economies of scale
A typo is individual. A stable hallucination across five models is industrial. The May 2026 finding of 127 shared hallucinated names is the most important number in this story because it suggests attackers can pre-position packages for a whole market of tools, not just one assistant.
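Defenders can use the same observation to build a watchlist. As a minimal sketch, assuming you have already collected the hallucinated names each model produced for your own prompts, the shared set is a plain intersection; the model labels and package names below are invented placeholders, not data from the May 2026 study.

```python
def shared_hallucinations(per_model: dict[str, set[str]]) -> set[str]:
    """Return the names that every model hallucinated."""
    if not per_model:
        return set()
    return set.intersection(*per_model.values())


# Invented placeholder data for illustration only.
observed = {
    "model-a": {"ghost-auth-utils", "ghost-pdf-tools", "ghost-metrics-kit"},
    "model-b": {"ghost-auth-utils", "ghost-pdf-tools"},
    "model-c": {"ghost-auth-utils", "ghost-metrics-kit"},
}

print(shared_hallucinations(observed))
```

Names in the shared set are the ones most worth blocking at the proxy or registry mirror, since they are the likeliest to be pre-registered by an attacker.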
Lesson 3: The real asset is your automation context
The package itself is only the delivery vehicle. The prize is the environment around it:
- CI variables
- release credentials
- cloud identities
- signing workflows
- private source trees
If those are broadly available during dependency installation, a minor package event becomes a platform compromise.
Lesson 4: AI security controls belong in your SDLC, not in a prompt style guide
NIST's SSDF and the newer SP 800-218A community profile for generative AI point in the right direction: operational controls, defined trust boundaries, secure defaults, and repeatable review mechanisms. Prompt advice helps. Pipeline policy matters more.
The ghost in the machine is not mystical. It is a missing trust check. Once AI-generated code can introduce dependencies, the secure architecture answer is straightforward: validate package existence, verify provenance, freeze resolution, limit install-time execution, and isolate credentials. Teams that do that will treat package hallucinations as noisy failures. Teams that do not will keep turning ordinary prompts into supply chain events.