exploitarium: Anonymous Researcher 'bikini' Publicly Dropped Many Zero-Day PoCs Found via AI-Automated Fuzzing, and Recipients Cannot Verify the Provenance of the Disclosures

TL;DR

A researcher using the pseudonym “bikini” dropped zero-day PoCs targeting libssh2, curl, PHP, FFmpeg, Firefox, Ghidra, nmap, OpenVPN, VLC, RustDesk, and more into a GitHub repository, exploitarium, without notifying any vendor (public as of this Brief’s writing, with 3,000+ stars and about 26 PoC folders). In the repo’s README the researcher states that “the fuzzing was automated by AI (GPT-5.3-Codex-Spark), but the PoCs were hand-typed except for RustDesk, and this is good-faith open disclosure.” Among the findings, a pre-authentication remote code execution in libssh2 (CVE-2026-55200) is listed in NVD and independently verified as high-risk, and the findings were assigned CVE-2026-58049–58058. Others are of low impact, so defenders must separate the real high-risk findings from the low-value ones. The gap here: detection in the form of CVE assignment worked, yet a recipient cannot cryptographically verify who made the discovery behind each report, by what method, and whether it is real. As AI collapses the cost of fuzzing, and with it the cost of finding vulnerabilities, a mixed flood of reports pressures triage by sheer volume. The term for this is not new: the case is a concrete example of the already-discussed “Vulnpocalypse.” Detection and pre-execution proof are complements, not substitutes.

Incident overview

Publisher: The pseudonym “bikini.” The GitHub repository exploitarium. Public as of this Brief’s writing (3,000+ stars, 900+ forks, ~26 PoC folders, updated the same day). The Register reported it was “removed by GitHub,” but it is currently accessible (whether it was restored after removal is to be tracked). The researcher claims a degree and multiple fuzzing papers, and frames it as good-faith open disclosure
What was published (per the actual repo): About 26 self-contained PoC folders. Targets include 7-Zip, AnyDesk, c-ares, curl, Docker, FFmpeg, Firefox, Floci, Flowise, Ghidra, Gitea (act runner container options), ImageMagick, libarchive, libssh2 (two), Lunar/Modrinth, MyBB, Next.js, nghttp2, nmap, objdump, OpenVPN, PHP, RustDesk, SystemInformer, VLC, and more. Splunk is not included (this differs from The Register’s and others’ lists). The count was reported as “15 / 22 / 100+,” but the repo measures at about 26 PoCs
Manner of release: Dropped en masse without notifying any vendor and without seeking payment or credit (an anonymous mass release that does not follow responsible, coordinated disclosure)
Discovery method (the researcher’s README, primary): The fuzzing workflow was automated by AI (GPT-5.3-Codex-Spark). But the PoCs were hand-typed except for RustDesk (AI-assisted only for RustDesk), and only the READMEs are AI-generated. That is, not “AI generated the vulnerabilities” but AI-automated fuzzing plus human-written PoCs
Targets assigned CVEs and independently verified as high-risk:
- CVE-2026-55200 (libssh2): pre-authentication remote code execution (CVSS: NIST 8.3 High / VulnCheck 9.2 Critical). ssh2_transport_read() fails to enforce an upper bound on packet_length, so an SSH packet with an excessive value causes an out-of-bounds (heap) write and enables RCE. All versions through 1.11.1 are affected, and because libssh2 sits beneath curl, Git, PHP, and more, the blast radius is broad. The fix commit is merged to mainline, with a release in preparation. NVD’s exploitation status is PoC-available (SSVC: poc)
- The findings were assigned CVE-2026-58049–58058 (cves.md)
A provenance error to note (The Register’s conflation): Some reporting treated the Gitea authentication bypass CVE-2026-20896 (default REVERSE_PROXY_TRUSTED_PROXIES = *, takeover via the X-WEBAUTH-USER header, CVSS 9.8, fixed in Gitea 1.26.3) as coming from exploitarium. But the repo’s Gitea entry is gitea-act-runner-container-options-poc (act runner container options), which is different from CVE-2026-20896. CVE-2026-20896 was fixed via Gitea’s own separate security release and is not part of exploitarium
Core: Because the reports carry no proof of provenance and authenticity that a recipient can verify, recipients cannot cryptographically distinguish real high-risk findings from low-value ones, incurring the cost of routing every item to manual independent verification. When AI pushes the volume of findings past human verification capacity, this structure breaks
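
The libssh2 flaw listed above comes down to a length field read from the wire without an upper bound. The following is a minimal Python sketch of the defensive pattern only; it is not libssh2's code or its actual fix. The names read_packet_length, PacketLengthError, and MAX_PACKET_LENGTH are hypothetical, and the cap of 35000 is borrowed from the total packet size that RFC 4253 section 6.1 says implementations must be able to handle.

```python
import struct

# Hypothetical cap for illustration: RFC 4253 section 6.1 requires
# implementations to handle total packet sizes of up to 35000 bytes.
MAX_PACKET_LENGTH = 35000


class PacketLengthError(ValueError):
    """Raised when a length field falls outside the accepted range."""


def read_packet_length(header: bytes) -> int:
    """Parse the 4-byte big-endian packet_length and enforce bounds
    before any buffer is sized or filled from it."""
    if len(header) < 4:
        raise PacketLengthError("need at least 4 header bytes")
    (packet_length,) = struct.unpack(">I", header[:4])
    # Loose lower bound: at least the 1-byte padding_length field.
    if packet_length < 1:
        raise PacketLengthError("packet_length too small")
    # Upper bound: the kind of check the Brief describes as missing.
    if packet_length > MAX_PACKET_LENGTH:
        raise PacketLengthError(f"packet_length {packet_length} exceeds cap")
    return packet_length
```

Python cannot reproduce the out-of-bounds heap write itself; the sketch only shows the order of operations, in which the attacker-controlled length is validated before anything is allocated or copied.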

Timeline

Early 2026-06: Gitea fixes CVE-2026-20896 in its own security release (1.26.3 / 1.26.4, 6/20) — a path separate from exploitarium
Around 2026-06-23: bikini publishes PoCs to exploitarium without notifying vendors (a third-party detection-rule archive cites this as the release date)
2026-06-29: The Register reports on it, noting at least two high-risk vulnerabilities (though its handling of Gitea appears conflated; see Incident overview)
2026-06-30: A Japanese outlet (Codebook) publishes a translation and analysis
2026-07-01 (as of this Brief’s writing): exploitarium is public and still being updated (the researcher announces “roughly one new PoC a day” going forward). Whether and when GitHub temporarily removed it is to be tracked

Note: This case initially centered on reporting about a “deleted” repository, but a check against primary sources finds the repo public and directly inspectable. Independently corroborated: the existence and contents of exploitarium, the no-vendor-notification mass release, the researcher’s README statements (AI-automated fuzzing plus hand-typed PoCs; GPT-5.3-Codex-Spark), libssh2 CVE-2026-55200 (listed in NVD; pre-auth RCE; PoC available), and the assignment of CVE-2026-58049–58058 to the findings. Not independently confirmed, disputed, or corrected: the total count (the reported “15 / 22 / 100+” conflicts with the repo’s ~26), the list of dropped items (Splunk is not included, and many targets were missing from reported lists), and the treatment of the Gitea auth bypass CVE-2026-20896 as coming from exploitarium (a conflation with a separate release). Sources also disagree on exploitation: The Register reports “active exploitation” on the basis of Ethan Andrews’s observation, while NVD marks libssh2 CVE-2026-55200 as SSVC: poc (PoC stage, exploitation not observed). Refer to primary sources (the repo / NVD / each project’s official channels) for the latest on named entities, CVEs, and figures. This Brief avoids overstating any of these points.

Vector (a mass drop with no verifiable proof)

Mass discovery via AI-automated fuzzing: bikini efficiently finds many vulnerability candidates across products and OSS via a strict harness driven by AI (GPT-5.3-Codex-Spark) — the discovery is automated, though the researcher says the PoCs themselves were hand-written
Anonymous, unnotified mass release: about 26 PoCs are dropped into exploitarium without vendor notification, payment, or a request for credit
The provenance of discovery is not verifiable by the recipient: the README self-reports the method, but nothing lets a recipient cryptographically verify who made each finding, by what method, and whether it is real. High-value and low-value items (genuine findings and noise) are mixed together
Recipients are forced to verify everything: defenders and each project have no choice but to check the authenticity of the reports themselves, routing ~26 items into manual independent triage
Some are confirmed high-risk: independent verification confirms libssh2 (CVE-2026-55200) and others as high-risk, and the findings receive CVE-2026-58049–58058, while low-impact items are mixed in
Exploitable vulnerabilities spread as public information: exploitable items circulate publicly, with PoCs, before patches are available, opening a pre-patch window

As AI collapses the cost of fuzzing, and with it the cost of finding vulnerabilities, it can generate this “mixed flood of reports” in bulk, pressuring defenders’ triage bandwidth by volume. The term for this is not new: the case is a concrete example of what is already discussed as a Vulnpocalypse (Axios used the term in a May 2026 AI-vulnerability-discovery context).

Structural analysis

This case belongs to the code-provenance category under Pillar 01 (Verifiable Origin). But the object of provenance extends beyond code-as-artifact to the provenance and authenticity of the vulnerability disclosure (the report / claim) itself. The central failure primitive is that a vulnerability report is not bound, in a form the recipient can verify, to the provenance of its discovery: whose finding it was, by what method, and whether it is real. Detection in the form of a CVE assignment exists, but a CVE record does not let a recipient independently check that “this report came from a genuine discovery.” Even when the researcher self-reports the method in a README, that is not a proof a recipient can cryptographically verify. When AI collapses the cost of discovery, a mixed flood of reports pressures defenders’ verification bandwidth by volume.

This case connects with a series of Briefs in which AI’s offensive capability asymmetrically outpaces defenders’ capacity. Brief No.018 (HackerBot / Claw, an early AI-vs-AI exchange in which attacker automation broke the defensive premise) and Brief No.009 (GTG-1002, where an AI autonomously ran most of the attack chain) show the same asymmetry of AI offensive capability as this case. The fact that a pseudonymous mass drop leaves the discoverer’s identity unverifiable to the recipient connects with the provenance / impersonation lineage of Brief No.082 (xz-utils, a backdoor introduced by impersonating a trusted developer) and Brief No.015 (compromise of internal repositories / extensions via GitHub). In all of them, the recipient could not independently verify the “origin and discovery provenance” of a claim or artifact, yet was forced to trust or process it.

As secondary categories, we note agent-runaway (AI’s runaway scaling of discovery pressuring defenders’ triage by volume) and identity-auth (the discoverer’s identity is a pseudonym, unverifiable to the recipient). Detection such as CVE assignment and independent verification matters as a precondition, but only once a recipient can verify in advance whether a report “carries a verifiable discovery provenance” can limited verification bandwidth be concentrated on the real high-risk findings.

The gap between detection and proof

That libssh2’s pre-auth RCE (CVE-2026-55200) is listed in NVD and independently verified as high-risk, that the findings were assigned CVE-2026-58049–58058, and that affected projects began landing fixes (libssh2 has merged its fix to mainline) is indispensable for grasping and remediating the harm, and this Brief does not deny that role. The detection, assignment, and patch process worked, at least for some of the targets.

At the same time, a CVE assignment and after-the-fact independent verification do not let a recipient establish cryptographically, before the fact, who made the discovery behind a report and whether it is real. In this case, because the published reports carried no discovery provenance a recipient could verify, defenders had no choice but to independently verify the whole mixed set of ~26 items themselves. What was missing was a layer that lets a recipient check beforehand whether a report carries a verifiable discovery provenance, on a separate track from the after-the-fact CVE check. When AI pushes the volume of reports past human verification capacity, verifying every item by hand does not scale.

Pre-execution attestation binds a vulnerability report to the provenance of its “discoverer, method, and target identity,” in a form a recipient can independently verify. If a proven report can be mechanically distinguished from an unproven anonymous drop, defenders can concentrate independent-verification bandwidth on the real high-risk findings and separate signal from the mix. Defenders’ triage can hold up in a world where AI collapses the cost of discovery only if the after-the-fact check (the detection-style “this did / did not exist”) and the pre-execution attestation of a report’s discovery provenance (“this report came from a verifiable discovery”) are layered together rather than treated as alternatives. Detection and pre-execution proof are complements, not substitutes.
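
The binding described above can be sketched as a canonical digest of the provenance fields plus a signature that anyone holding the public key can check. The sketch below uses a Lamport one-time signature built on SHA-256 so that it runs on the Python standard library alone. That choice is a stand-in made for self-containment, not Lemma's design, and a real deployment would more likely use an established scheme such as Ed25519. The report field names and all function names are hypothetical.

```python
import hashlib
import json
import secrets


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def canonical_digest(report: dict) -> bytes:
    """Hash a report with sorted keys so signer and verifier agree on bytes."""
    encoded = json.dumps(report, sort_keys=True, separators=(",", ":")).encode()
    return sha256(encoded)


def keygen():
    """Lamport key pair: 256 pairs of secrets; the public key is their hashes."""
    sk = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    pk = [(sha256(a), sha256(b)) for a, b in sk]
    return sk, pk


def bits(digest: bytes):
    """Yield the 256 bits of a 32-byte digest, most significant bit first."""
    for byte in digest:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def sign(sk, report: dict):
    """Reveal one secret per digest bit. A key must sign only one report."""
    return [sk[i][bit] for i, bit in enumerate(bits(canonical_digest(report)))]


def verify(pk, report: dict, signature) -> bool:
    """Recipient-side check: needs only the public key and the report."""
    if len(signature) != 256:
        return False
    pairs = zip(signature, bits(canonical_digest(report)))
    return all(
        secrets.compare_digest(sha256(sig), pk[i][bit])
        for i, (sig, bit) in enumerate(pairs)
    )
```

With a key pair from keygen(), verify(pk, report, sign(sk, report)) accepts the original report and rejects one whose fields were changed afterward. A signature of this kind shows which key asserted the fields; it does not by itself show that the asserted method or finding is true, so it narrows what needs independent verification rather than replacing it.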

For the thesis that after-the-fact detection is not proof, see “The last layer left for cyber defense in the age of AI” (Lemma, 2026-05); for design that verifies independently before the action, see “Proof-as-Auth: sign in without ever sending your key” (Lemma, 2026-05).

Response and industry trends

Affected products: libssh2 merged a fix to mainline and is preparing a release; users of the broad set of components that sit atop libssh2 (curl, Git, PHP, and more) need to check patch status. Gitea’s CVE-2026-20896 (a separate auth bypass, not from this case) is fixed in 1.26.3. See the repo and each project’s official channels for the fix status of each target
The disclosure-process point: A no-vendor-notification mass release does not follow coordinated disclosure, so exploitable information spreads before patches are available. The researcher claims good-faith open disclosure, but the cost of verifying authenticity remains on the recipient
The adoption problem (the case’s central gap): those who would voluntarily attach a provenance proof are the responsible researchers who already practice coordinated disclosure; someone dropping anonymously and indiscriminately has no incentive to attach one. The actor who most needs the proof is the one least likely to use it. So the prescription resolves into one (or both) of two directions:
- A (enforced, supply-side): build a verifiable discovery provenance into the preconditions for CVE acceptance (advocate a CVE Board / CNA rule change). Enforceable, but with high political hurdles to standardization
- B (operational, receiver-side): institutionally lower the triage priority of reports lacking a provenance proof. No enforcement, but immediately implementable
Cross-industry point: As AI automates vulnerability discovery, the volume of reports outpaces human verification capacity, and defenders must triage amid a mix of genuine findings and noise. This is not the problem of one drop but a cross-ecosystem operational issue: vulnerability disclosures carry no authenticity a recipient can verify
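
Direction B above can be sketched as a sort key in a receiving team's intake queue. This is an illustration under stated assumptions, not a scheme the Brief specifies: the Report record, its field names, and the ordering rule (verified provenance first, then claimed severity) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Report:
    """Hypothetical intake record; field names are illustrative."""
    target: str
    claimed_severity: float  # a CVSS-like score claimed by the reporter
    provenance_verified: bool  # result of a recipient-side proof check


def triage_order(reports):
    """Verified reports first, then by claimed severity, highest first.

    Unverified reports are deprioritized, not discarded, so a genuine
    anonymous finding is still reviewed once capacity allows.
    """
    return sorted(
        reports,
        key=lambda r: (not r.provenance_verified, -r.claimed_severity),
    )
```

Under this rule an unverified report claiming 9.8 is reviewed after a verified report claiming 5.0. That is the trade-off of direction B: a real high-risk finding in an unproven drop waits longer, so a team may want an additional rule for such cases.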

Lemma’s analysis

Against the gap this case exposed (a recipient cannot independently verify in advance whether a vulnerability report carries a verifiable discovery provenance), Lemma proposes the following design. Note that this is less an individual incident than a concrete example of an era in which AI makes fuzzing, and therefore discovery, cheap, and it should be read with its unconfirmed facts and the self-reported nature of the researcher’s README in mind.

Provenance proof of the report’s discovery: Bind a vulnerability disclosure to the provenance of its “discoverer, method, and target identity” as a proof a recipient can independently verify. Give a verifiable origin not only to artifacts but to “reports and claims”
Separating unproven drops: Mechanically distinguish a proven report from an unproven anonymous mass drop, so defenders can concentrate independent-verification bandwidth on the real high-risk findings. This makes direction B (receiver-side triage deprioritization) operable on a cryptographic basis
Wiring into acceptance requirements: If direction A (making provenance a precondition for CVE acceptance) is taken, Lemma’s provenance proof can serve as the verifiable primitive a CNA can require
Selective disclosure: Without demanding full disclosure of the discovery method or internal details, prove with minimal disclosure only that “this report carries a verifiable discovery provenance”
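
Selective disclosure as described above can be approximated with salted hash commitments: the reporter commits to every provenance field, publishes only the commitments, and later opens just the fields it chooses. This is a simple stand-in for illustration, not Lemma's mechanism, which this Brief does not specify. The field names and both functions are hypothetical.

```python
import hashlib
import secrets


def commit_fields(fields: dict):
    """Return (commitments, openings). Commitments are safe to publish;
    openings stay with the reporter until a field is disclosed."""
    commitments, openings = {}, {}
    for name, value in fields.items():
        # A random fixed-length salt keeps short values from being guessed.
        salt = secrets.token_bytes(16)
        commitments[name] = hashlib.sha256(salt + value.encode()).hexdigest()
        openings[name] = (salt, value)
    return commitments, openings


def check_opening(commitments: dict, name: str, salt: bytes, value: str) -> bool:
    """Recipient-side check that a disclosed field matches its commitment."""
    expected = commitments.get(name)
    if expected is None:
        return False
    actual = hashlib.sha256(salt + value.encode()).hexdigest()
    return secrets.compare_digest(actual, expected)
```

A reporter could publish the commitments for discoverer, method, and target, then open only target to a vendor while keeping the method private. For the commitments to count as provenance, they would themselves need to be signed by the reporter's key; that step is outside this sketch.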

Detection (CVE assignment, after-the-fact independent verification) works toward confirming the reality of an individual vulnerability, and pre-execution attestation (a recipient’s independent verification of a report’s discovery provenance) works toward separating signal from the mix; the two are complementary. In a world where AI breaks the cost of discovery, defenders survive not through faster detection but by making the authenticity of reports verifiable by the recipient. For the design and its scope, see Pillar 01 — Verifiable Origin and Seal.

Sources

exploitarium (primary — the repo itself): github.com/bikini/exploitarium (the PoC list, cves.md, and the README’s AI self-report) — https://github.com/bikini/exploitarium
The Register (research / reporting): “Anonymous researcher drops 0-day ‘exploitarium’ repo” (2026-06-29; its Gitea handling appears conflated) — https://www.theregister.com/security/2026/06/29/anonymous-researcher-drops-0-day-exploitarium-repo/5263961
NVD: CVE-2026-55200 (libssh2 pre-auth RCE; CVSS 8.3–9.2; SSVC: poc) — https://nvd.nist.gov/vuln/detail/CVE-2026-55200
oss-sec: CVE-2026-55200 (libssh2) thread — https://seclists.org/oss-sec/2026/q2/1010
Gitea: “Release of 1.26.3 and 1.26.4” (CVE-2026-20896; separate from exploitarium) — https://blog.gitea.com/release-of-1.26.3-and-1.26.4/
SC Media: “Anonymous researcher dumps zero-day exploits for multiple software products” — https://www.scworld.com/brief/anonymous-researcher-dumps-zero-day-exploits-for-multiple-software-products
Codebook (Japanese analysis): “An anonymous researcher published several zero-days in a GitHub repository” (2026-06-30) — https://codebook.machinarecord.com/threatreport/silobreaker-cyber-alert/46394/
(Reference — existing Briefs) No.018 HackerBot / Claw (AI vs. AI) — https://lemma.frame00.com/critical/briefs/018-hackerbot-claw-ai-vs-ai/ ; No.009 GTG-1002 — https://lemma.frame00.com/critical/briefs/009-gtg1002-ai-orchestrated-espionage/

About distribution

This material is a structured analysis of public information; it is not an audit, diagnosis, or recommendation for any specific organization.