RAG Content Provenance
Anchor RAG documents to verifiable provenance at ingest — docHash, CID, issuer signature. Citation authenticity becomes cryptographically traceable.
Three voices from the front line.
- IT / business DX
“We need to lock a document's provenance at the moment it's ingested into RAG”
- Governance
“We want to verify later that a document is untampered and who issued it”
- AI engineering
“We want document versioning and tamper detection operated as one”
Hand over the source, or just the facts?
Change what reaches the AI, and the leakage risk goes with it.
- doc_path: /shared/work-rules.pdf
- uploaded_by: user-123
- content: body…
- version: untracked
- signed: none

- subject: did:lemma:doc-policy-v3
- issuer: did:lemma:docs.internal
- sourceHash: 0x4f8a…
- lineageChain: [upload, index, embed]
- integrity: poseidon-merkle
- ZK verified: ✓ VALID
When an internal document is ingested into RAG, the original is encrypted, and its fingerprint (sourceHash), issuer signature, and valid version are inscribed on the index side. The AI retrieves not the original itself but only facts that carry provenance. A cited sentence can be traced through the inscribed fingerprint to the version it came from, and once the document is revised, citations of the old version are structurally detected.
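The ingest-and-trace flow described above can be illustrated with a minimal sketch. It is not Lemma's implementation: it assumes SHA-256 as a stand-in for the fingerprint (the page names poseidon-merkle), an HMAC with a demo key as a stand-in for the issuer signature, and it omits encryption of the original and the ZK proof entirely. `ProvenanceIndex`, `ingest`, `check_citation`, and `ISSUER_KEY` are hypothetical names chosen for illustration.

```python
import hashlib
import hmac
from dataclasses import dataclass, field

# Hypothetical demo key. A real issuer would sign with an asymmetric key pair.
ISSUER_KEY = b"demo-issuer-key"


def fingerprint(content: bytes) -> str:
    """SHA-256 stand-in for the sourceHash fingerprint."""
    return hashlib.sha256(content).hexdigest()


def sign(issuer: str, source_hash: str, version: int) -> str:
    """HMAC stand-in for an issuer signature over (issuer, hash, version)."""
    msg = f"{issuer}|{source_hash}|{version}".encode()
    return hmac.new(ISSUER_KEY, msg, hashlib.sha256).hexdigest()


@dataclass(frozen=True)
class ProvenanceRecord:
    subject: str
    issuer: str
    source_hash: str
    version: int
    signature: str


@dataclass
class ProvenanceIndex:
    records: dict = field(default_factory=dict)  # source_hash -> record
    latest: dict = field(default_factory=dict)   # subject -> newest source_hash

    def ingest(self, subject: str, issuer: str, content: bytes) -> ProvenanceRecord:
        """Record fingerprint, signature, and version at ingest time."""
        source_hash = fingerprint(content)
        prev = self.latest.get(subject)
        if prev == source_hash:
            return self.records[prev]  # identical content: nothing new to record
        version = self.records[prev].version + 1 if prev else 1
        record = ProvenanceRecord(
            subject, issuer, source_hash, version, sign(issuer, source_hash, version)
        )
        self.records[source_hash] = record
        self.latest[subject] = source_hash
        return record

    def check_citation(self, source_hash: str) -> str:
        """Trace a cited fingerprint back to its version and report its status."""
        record = self.records.get(source_hash)
        if record is None:
            return "unknown"
        expected = sign(record.issuer, record.source_hash, record.version)
        if not hmac.compare_digest(expected, record.signature):
            return "invalid-signature"
        if self.latest[record.subject] != source_hash:
            return "stale"  # the document was revised after this version
        return "current"


index = ProvenanceIndex()
v1 = index.ingest("did:lemma:doc-policy-v3", "did:lemma:docs.internal", b"Work rules, first edition")
status_before = index.check_citation(v1.source_hash)
v2 = index.ingest("did:lemma:doc-policy-v3", "did:lemma:docs.internal", b"Work rules, revised")
status_after = index.check_citation(v1.source_hash)
```

In this sketch a citation carries only the fingerprint, never the document body. The old fingerprint still resolves to version 1 after the revision, but it no longer matches the newest fingerprint for the subject, so the check reports it as stale.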
See the technical details ↗
Choose on three criteria.
Only work that needs all three at once — pass without exposing, independent verification, tamper-proof — is Lemma's domain.
| Method | Pass without exposing | Independent verification | Tamper-proof |
|---|---|---|---|
| Access control only | △ | ✗ | ✗ |
| Masking / anonymization | △ | ✗ | ✗ |
| Encryption only | ✓ | ✗ | ✗ |
| Lemma (ZK proof), the only one with all 3 | ✓ | ✓ | ✓ |
How it works
Tell us how your RAG pipeline is wired today, which document classes flow through it, and where citation accuracy is hurting most. We'll explore together whether Lemma's provenance layer could fit. No source documents or index internals required.
The bigger picture
The bigger picture this use case belongs to.
We map use scenarios across industries and workflows along four axes.
See use scenarios for Verifiable Origin in Solutions →
TRY LEMMA
Run it yourself.
No sales call needed — start hands-on with Lemma's products.