TL;DR
The configuration files that tell an AI coding tool “work this way on this project” (.cursorrules, CLAUDE.md) are now everyday development artifacts. But who placed those instructions, and with what provenance, is not checked at the moment the AI reads and obeys them. In May 2026, Socket disclosed TrapDoor, a credential-stealing campaign spanning npm, PyPI, and Crates.io. Beyond the usual theft of keys and cloud credentials, its distinctive technique is to plant invisible directives, hidden with zero-width Unicode, in the instruction files meant for AI assistants, so that the AI performs what the directives call a “security scan” and carries secrets out. (The packages, PRs, and attacker documents are detailed below.) We analyze this as a structure in which the provenance and authorization of the instructions an AI obeys are not independently verified before execution, and we consider how pre-execution attestation divides the work with detection. It connects to Briefs 037, 018, 024, and 028.
Incident overview
- The campaign: Tracked by Socket as TrapDoor. More than 34 packages and 384+ versions/artifacts spanning npm, PyPI, and Crates.io. It targets developers in the crypto, DeFi, Solana, and AI communities, stealing SSH keys, various wallets, AWS credentials, GitHub tokens, browser data, environment variables, and more.
- Disclosure: 2026-05-24, Socket Research Team. The earliest observed package is PyPI’s eth-security-auditor@0.1.0 (2026-05-22 20:20 UTC). Packages were published and updated in waves across the weekend from multiple accounts.
- Per-ecosystem execution paths: npm = a postinstall hook (shared payload trap-core.js, validates AWS/GitHub tokens via API, and moves laterally with SSH keys); PyPI = on import, fetches JavaScript from the attacker’s GitHub Pages and runs it via node -e; Crates.io = a build.rs that, at build time, XOR-encrypts keystores and exfiltrates them to GitHub Gists.
- The distinctive technique (the focus of this Brief): AI-targeted injection into .cursorrules / CLAUDE.md. These are legitimate files that give AI coding tools project-specific instructions, but the attacker plants directives hidden with zero-width Unicode characters that drive the AI assistant to perform “security scan” and similar tasks, steering it to discover and exfiltrate secrets. The attacker’s GitHub Pages site also functions as HTML that prompts AI assistants to run a scan.
- Slipped into the legitimate contribution flow: The attacker account ddjidd564 submitted PRs adding .cursorrules / CLAUDE.md, under innocuous framing like “dev standards and build verification,” to real projects including browser-use, langchain, langflow, llama_index, MetaGPT, and OpenHands. GitHub warned that those files contained hidden / bidirectional Unicode.
- Attacker documents: The same repository holds AUDIT-MATRIX.md (self-described “Universal AI Agent Extraction Framework”), which describes a “disguise layer” that makes credential theft look like innocuous tasks such as “security audit” and “wallet safety check.” Campaign marker P-2024-001.
- The crux: There is no layer to independently verify, before the AI reads and executes them, on whose authorization and with what provenance the instruction files the AI obeys were placed.
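The hidden-character technique described in the list above can be checked for mechanically. The following is a minimal sketch, not Socket's or GitHub's detection logic: it assumes that every code point in Unicode general category Cf (format characters, which include zero-width and bidirectional controls) is worth flagging in an instruction file. The function name find_hidden_characters is hypothetical.

```python
import unicodedata


def find_hidden_characters(text):
    """Return (line, column, code point, name) for each format character.

    Assumption: any character in Unicode category "Cf" is suspicious in an
    AI instruction file. This covers zero-width spaces and joiners as well
    as bidirectional controls, but it can also flag legitimate uses such
    as a leading byte-order mark or emoji joiners.
    """
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if unicodedata.category(ch) == "Cf":
                findings.append(
                    (line_no, col, f"U+{ord(ch):04X}",
                     unicodedata.name(ch, "UNKNOWN"))
                )
    return findings
```

A non-empty result says only that invisible characters are present. It says nothing about who placed the file, which is the provenance question this Brief focuses on.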
Timeline
- 2026-05-22: The earliest observed package eth-security-auditor@0.1.0 is published to PyPI (20:20 UTC). From then, it spreads in waves to npm, PyPI, and Crates.io from multiple accounts.
- 2026-05-24: Socket discloses TrapDoor, linking it as a single campaign from the overlap of cross-registry infrastructure and behavior.
- 2026-05 (same period): The attacker ddjidd564 submits PRs adding .cursorrules / CLAUDE.md to real AI / developer-tooling projects. GitHub warns about hidden Unicode.
- Ongoing: Socket classifies all identified packages as malicious, reports them to each registry, and continues to monitor related packages and infrastructure.
Note: The attacker document AUDIT-MATRIX.md itself states it is “a design document that is only partially implemented,” so not everything described is necessarily live behavior. This text treats it only within the range consistent with the observed npm payload behavior, and asserts no unverified claims.
The attack path: how the AI assistant comes to execute TrapDoor’s instructions
This incident stems from a structure in which the provenance and authorization of the instruction files an AI obeys are not independently verified before execution. The path by which the failure propagates into exfiltration of secrets is as follows.
- Placing instructions with no provenance: The attacker places .cursorrules / CLAUDE.md in a repository, either at the install time of a malicious package or via a PR to a legitimate project. The contents include directives hidden with zero-width Unicode and, on screen, look like innocuous development guidelines.
- Trusted as “project-specific instructions”: The AI coding assistant loads these files as the project’s legitimate instructions. It mistakes “being bundled in the repository” for a guarantee that the instructions originate from a legitimate author and legitimate authorization.
- Executing the disguised task: The hidden directives make the AI execute credential theft disguised as innocuous tasks like “security scan,” “wallet safety check,” or “cloud config verification” (corresponding to the attacker document’s “disguise layer”).
- Discovering and sending out secrets: The execution searches for and collects the secrets the development environment can reach (SSH keys, cloud/GitHub credentials, wallets, environment variables) and sends them to the attacker infrastructure (GitHub Pages / Gists, etc.). The npm payload validates the AWS/GitHub tokens via API and moves laterally with SSH keys.
- Detection and disablement: Once the malicious packages / PRs are detected, registry removal and PR rejection kick in. But this acts after the instructions could already be read and executed by the AI — an after-the-fact measure.
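The first step of this path, an instruction file arriving through a package or a PR, can at least be surfaced before anything is merged. The sketch below is a hypothetical review policy, not a feature of GitHub or any AI tool: it assumes the caller supplies the list of changed paths and a flag saying whether the author is a maintainer. The two filenames come from the incident; the function names and the three policy labels are invented for illustration.

```python
from pathlib import PurePosixPath

# Filenames taken from the incident; a real watch list would likely be longer.
INSTRUCTION_FILENAMES = {".cursorrules", "claude.md"}


def instruction_files_touched(changed_paths):
    """Return the changed paths whose filename is an AI instruction file."""
    return sorted(
        path for path in changed_paths
        if PurePosixPath(path).name.lower() in INSTRUCTION_FILENAMES
    )


def review_requirement(changed_paths, author_is_maintainer):
    """Map a change set to a (hypothetical) review policy label."""
    if not instruction_files_touched(changed_paths):
        return "normal-review"
    if author_is_maintainer:
        return "maintainer-authored"
    # An outside contributor adding or editing AI instructions is the
    # pattern seen in the PRs described above.
    return "block-until-owner-approval"
```

This narrows where a reviewer should look, but it remains a detection-side control: it does not prove who authorized the instructions.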
Structural analysis
This incident belongs to the code-provenance category of Pillar 01 (Verifiable Origin). The central failure primitive is that the provenance and authorization of the instruction files an AI assistant obeys (.cursorrules / CLAUDE.md) are not independently verified before the AI reads and executes them. “Being bundled in the repository” and “looking like project-specific instructions” are no guarantee that the instructions originate from a legitimate author and legitimate authorization. Concealment via zero-width Unicode makes what is shown on screen diverge from what the AI actually reads, so even a human review misses it. We note agent-infrastructure (the AI-assistant configuration as infrastructure) and ai-decision-integrity (the AI’s judgment steered by tampered instructions) as secondary categories.
The carriers of the supply-chain contamination (npm postinstall, PyPI import-time execution, Crates.io build.rs) are conventional. What is new in this campaign is where the carriers deliver: “the instruction files an AI assistant reads.” As in Brief 037 (AI coding agents auto-executed bundled config without verification), the mistake is “bundled == authorized,” but here the thing executed is not config but natural-language instructions to the AI, and even when those instructions are tampered with, the AI obeys them as legitimate guidelines. It is the supply-chain version of Brief 018 (rewriting a repository’s CLAUDE.md to try to hijack a defending AI’s instructions), and it connects the concealment technique of Brief 024 (invisible Unicode making what a human sees diverge from the AI’s input) to the provenance problem of instruction files. Like Brief 028 (a package spoofing an internal scope exploited the build environment’s provenance assumptions), it exploits provenance assumptions. What this incident shows is that the AI development environment itself has become a target layer of supply-chain contamination, with a direct line from the absence of provenance verification to real harm.
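The conventional carriers can be audited separately from the AI-instruction problem. As a small illustration for the npm case only, the sketch below lists the lifecycle scripts that npm runs automatically during installation (preinstall, install, postinstall) from the text of a package.json. The function name is hypothetical, and the presence of such a script is not in itself evidence of malice.

```python
import json

# Lifecycle scripts that npm runs automatically when a package is installed.
INSTALL_HOOKS = ("preinstall", "install", "postinstall")


def install_time_scripts(package_json_text):
    """Return the install-time lifecycle scripts declared in a package.json."""
    manifest = json.loads(package_json_text)
    scripts = manifest.get("scripts") or {}
    return {name: scripts[name] for name in INSTALL_HOOKS if name in scripts}
```

Many legitimate packages declare install hooks, so the output is a list of things to read, not a verdict.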
The gap between detection and proof
In this incident, the detection-and-remediation chain functioned — vendor research (Socket’s cross-registry detection, an average of just under 6 minutes to detect new versions), reporting to and removal from the registries, and GitHub’s warning about hidden Unicode plus PR rejection — and the techniques were made visible from the outside. This is a typical success of detection, and this Brief does not negate the role of the detection layer. Detection is indispensable for publishing the techniques, removing malicious packages, and scrutinizing the contribution flow.
At the same time, detection provides no material to independently establish, at the moment the AI reads and executes the instructions, whether the instruction file is legitimately authorized and originates from a legitimate author. A registry scan asks only “is this package malicious,” and a PR review asks only “is this change reasonable.” Neither can tell, on provenance grounds and before the AI executes, that an instruction file carries directives hidden with zero-width Unicode. Removal and rejection, too, are after-the-fact measures that act once the instructions could already have been read. This is a structurally independent gap, outside the reach of the detection layer.
Pre-execution attestation closes this gap by inserting one step — proof of the instructions’ provenance and authorization — into the path by which the AI assistant reads an instruction file and executes the task. Even when what is shown diverges from what is real, by binding the instructions / artifacts to their issuer (the legitimate author / distributor) and verifying provenance via a docHash, instructions tampered with via zero-width Unicode, or slipped into a legitimate project without authorization, can be distinguished before execution as “lacking legitimate provenance / authorization.” Detecting the surface plausibility of the instructions (the detection-style “does this content look reasonable”) and attesting the instructions’ provenance/authorization beforehand (the “do these instructions have a legitimate issuer / authorization”) are not substitutes but complements. For the idea of independently verifying provenance before execution, see “Proof-as-Auth: Sign in without ever sending your key” (Lemma, 2026-05); for the detection-and-attestation thesis, see “The last layer left for cyber defense in the age of AI” (Lemma, 2026-05).
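To make the docHash idea concrete, here is a minimal sketch under stated assumptions. The Brief does not specify how a docHash is computed, so this assumes SHA-256 over the raw bytes of the file. It also uses an HMAC with a shared key as a stand-in for the issuer's signature, only because the Python standard library has no public-key signatures; a real design would use an asymmetric scheme. All function and field names are hypothetical.

```python
import hashlib
import hmac


def doc_hash(raw: bytes) -> str:
    """Assumed docHash: SHA-256 over the exact bytes the AI would read."""
    return hashlib.sha256(raw).hexdigest()


def _message(issuer: str, path: str, digest: str) -> bytes:
    return f"{issuer}\n{path}\n{digest}".encode("utf-8")


def sign_manifest(path: str, raw: bytes, issuer: str, key: bytes) -> dict:
    """Bind path and docHash to an issuer (HMAC stands in for a signature)."""
    digest = doc_hash(raw)
    tag = hmac.new(key, _message(issuer, path, digest), hashlib.sha256).hexdigest()
    return {"issuer": issuer, "path": path, "docHash": digest, "tag": tag}


def verify_before_load(path: str, raw: bytes, manifest: dict, trusted_keys: dict) -> bool:
    """Return True only if the file matches a manifest from a trusted issuer."""
    issuer = manifest.get("issuer", "")
    key = trusted_keys.get(issuer)
    if key is None or manifest.get("path") != path:
        return False
    digest = manifest.get("docHash", "")
    expected = hmac.new(key, _message(issuer, path, digest), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, manifest.get("tag", "")):
        return False
    # Hashing raw bytes means invisible characters change the digest even
    # when the rendered text looks identical.
    return hmac.compare_digest(doc_hash(raw), digest)
```

Because the hash is taken over bytes instead of rendered text, a file with an inserted zero-width character fails verification even though it looks unchanged on screen, and a file with no manifest from a trusted issuer fails regardless of its content.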
Response and industry trends
- Vendors and platforms: Socket classified all identified packages as malicious, reported them to each registry, and continues to monitor related infrastructure. GitHub warned that the PRs’ .cursorrules / CLAUDE.md contained hidden / bidirectional Unicode, putting AI-instruction injection via the contribution flow on the detection radar.
- The AI development environment question: AI instruction files like .cursorrules / CLAUDE.md were re-recognized as deserving a trust boundary equal to code. The open challenge is a mechanism by which an AI assistant verifies provenance, authorization, and the presence of hidden characters before it reads bundled instructions.
- Cross-industry question: Supply-chain attacks that start from “package install” are spreading across the entire development workflow: AI assistant configuration, shell environments, Git hooks, SSH, browser profiles, cloud credentials, and wallets. The debate over AI dev-tool trust design is shifting away from treating the surface plausibility of instructions as the endpoint of trust, and toward verifying the provenance and authorization of the instructions an AI obeys before execution (provenance / pre-execution attestation).
Lemma’s analysis
Against the gap this incident exposed (the provenance and authorization of the instruction files an AI obeys are decoupled from the AI’s execution), Lemma proposes a design that requires, before the AI assistant executes instructions / tasks, an independently verifiable cryptographic proof that the instructions are legitimately authorized and carry legitimate provenance.
- Binding instruction provenance: Bind the instruction files / artifacts to be executed to their issuer (the legitimate author / distributor) and verify provenance via a docHash. Make the divergence between what is shown and what is real — caused by zero-width Unicode — detectable before execution.
- Pre-action authorization proof (proof-as-auth): Before the AI performs instruction-driven tasks (searching for secrets, sending them externally, destructive operations), prove with a signature that “this task is authorized, with this scope, to this party.” Do not make “being bundled in the repository” the endpoint of authorization.
- Scoped authority: Minimize the authority given to the AI assistant per task, and do not let the collection / sending of secrets beyond the scope of authorization succeed without proof. Distinguish legitimate tasks from tasks driven by tampered instructions via the evidence trail.
- Selective disclosure: Disclose only the minimum — that “this task meets the authorization schema” — without letting internal keys or credentials leave the environment.
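The pre-action authorization and scoped-authority points above can be illustrated with a small sketch. This is not Lemma's implementation or the Trust402 design, neither of which is specified in this Brief. It assumes a grant that names a task, the directory roots the task may read, and an audience, and it again uses an HMAC as a stand-in for a real signature. All names are hypothetical.

```python
import hashlib
import hmac
import json
from pathlib import PurePosixPath


def _tag(body: dict, key: bytes) -> str:
    payload = json.dumps(body, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def issue_grant(task, allowed_roots, audience, key):
    """Create a grant: this task, with this scope, for this party."""
    body = {"task": task, "allowed_roots": sorted(allowed_roots), "audience": audience}
    return {"body": body, "tag": _tag(body, key)}


def may_read(grant, key, requested_path):
    """Allow a read only with an intact grant and a path inside its scope."""
    if not hmac.compare_digest(_tag(grant["body"], key), grant["tag"]):
        return False  # grant was altered or not issued with this key
    target = PurePosixPath(requested_path)
    if ".." in target.parts:
        return False  # refuse traversal instead of trying to resolve it
    for root in grant["body"]["allowed_roots"]:
        root_parts = PurePosixPath(root).parts
        if target.parts[:len(root_parts)] == root_parts:
            return True
    return False
```

Under this sketch, a hidden directive that asks the assistant to read SSH keys fails the scope check even if the assistant follows it, because no grant covers that path.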
In this way, a proof fixed at the moment of execution functions as an independently verifiable trail of whether “these instructions are legitimately authorized and carry legitimate provenance,” before the AI executes them. Detection (after-the-fact removal of malicious packages, PR rejection) works on remediation after discovery; attestation (provenance / authorization verification before execution) works on the independent verification of AI instructions — each complementary to the other. For the design and its scope of application, see Pillar 01 — Verifiable Origin and Trust402.
Sources
- Socket (research, primary): “TrapDoor Crypto Stealer Supply Chain Attack Hits 34 Packages and Hundreds of Versions Across npm, PyPI, and Crates.io” (2026-05-24; per-ecosystem execution paths, AI injection, attacker documents, IOCs) — https://socket.dev/blog/trapdoor-crypto-stealer-npm-pypi-crates
- The Hacker News: “TrapDoor Supply Chain Attack Spreads Credential-Stealing Malware via npm, PyPI, and CratesIO” (2026-05; overview, targets) — https://thehackernews.com/2026/05/trapdoor-supply-chain-attack-spreads.html
- Phoenix Security: “TrapDoor Supply Chain Campaign: Cross-Ecosystem Credential Theft and AI Assistant Poisoning via npm, PyPI, and Crates.io” (2026-05; synthesis of the AI-assistant poisoning) — https://phoenix.security/trapdoor-supply-chain-ai-poisoning-npm-pypi-crates/
About Brief distribution
The Lemma Critical Brief is a threat-intelligence brief published by Lemma. This material is a structured analysis of public information; it is not an audit, diagnosis, or recommendation for any specific organization. If you use it as a reference for decision-making, please consult your Lemma Critical contact directly.
(c) 2026 FRAME00, INC. — Built for decisions that matter.