RAG Source Attestation

Cited ≠ verified

When an enterprise AI responds 'Based on Internal Policy Section X,' there is no guarantee the cited source matches the unaltered original. RAG indexes are rebuilt regularly, documents are versioned, and embeddings drift. Lemma proves that each citation in an AI response is cryptographically bound to the exact document version it claims to reference.

P2 · Verifiable AI · Legal tech · Enterprise knowledge management · Financial compliance · 8 min read

Problem

AI agents now routinely cite sources in their responses: "Pursuant to Internal Policy Article 3, Section 2," "According to National Tax Agency Notice X," "Article 7 of the contract states." Such citations make responses persuasive and are a prerequisite for operational use.

But recipients have no means to verify whether a citation is authentic. From the response text alone, three categories are indistinguishable:

  • Correctly cited from an existing original
  • Paraphrased from an existing original, with subtly different wording
  • Entirely fabricated citation (hallucination)

Additional structural problems:

  • There is no cryptographic proof mapping which part of the response text corresponds to which retrieved chunk
  • Whether the retrieved chunks match the originals is ambiguous unless guaranteed by a separate layer
  • As the RAG index is rebuilt over time, the chunks that past responses referenced disappear

In legal, tax, healthcare, financial, and consulting work, decisions based on incorrect citations escalate into serious liability issues. Retrospective verifiability of responses is not optional; it is a precondition for business continuity.

Scenario

A major tax advisory firm's AI agent provides tax guidance to corporate clients, referencing tax law, notices, and past cases. The AI responds with explicit citations: "According to National Tax Agency Notice X, Article 3, the expenditure is deductible as a business expense."

In August 2026, a corporate client proceeds with accounting treatment based on the AI's response. During a post-settlement tax audit, the auditor states: "That notice contains no such provision." The client demands an explanation from the tax firm.

The firm checks the AI response logs. All that remains is the response text and a source attribution, "From Notice X." Whether the chunks the AI retrieved at the time matched the original Notice X, and whether the response text accurately cited those chunks, cannot be verified after the fact. The RAG index has since been rebuilt to incorporate additional notices, so its historical state cannot be reproduced.

With Lemma, the following would have been sealed at the moment the response was generated:

  • AI agent identifier and response timestamp
  • Hash of the retrieved chunk set
  • Cryptographic binding between those chunks and the originals (already fixed by RAG Content Provenance)
  • Cryptographic mapping between citation spans in the response text and the corresponding chunks
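A minimal sketch of what such a sealed record could look like, using only the Python standard library. The names `SealedRecord`, `CitationSpan`, and `hash_chunk_set`, and the choice of SHA-256 over canonical JSON, are illustrative assumptions and not Lemma's actual schema.

```python
import hashlib
import json
from dataclasses import asdict, dataclass


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def hash_chunk_set(chunks: list[str]) -> str:
    # Hash each chunk, sort the digests, then hash the sorted list so the
    # result does not depend on retrieval order (an assumed design choice).
    leaves = sorted(sha256_hex(c.encode("utf-8")) for c in chunks)
    return sha256_hex("".join(leaves).encode("ascii"))


@dataclass(frozen=True)
class CitationSpan:
    start: int       # offset of the citation in the response text
    end: int         # exclusive end offset
    chunk_hash: str  # digest of the chunk this span cites


@dataclass(frozen=True)
class SealedRecord:
    agent_id: str
    timestamp: str        # ISO 8601, UTC
    chunk_set_hash: str
    doc_hash: str         # digest of the original, assumed fixed upstream
    spans: tuple[CitationSpan, ...]

    def digest(self) -> str:
        # Canonical JSON (sorted keys) keeps the digest deterministic.
        body = json.dumps(asdict(self), sort_keys=True)
        return sha256_hex(body.encode("utf-8"))
```

Changing any field, such as the agent identifier or a span offset, changes `digest()`, which is the property a sealing step would rely on.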

The client and the tax firm can independently verify: "The Notice X, Article 3 cited in the August 18, 2026 AI response corresponds to this passage in the original at that time, and the response text is a verbatim citation from it." Or conversely: "The citation in the log does not exist in the original at that time" — this, too, can be cryptographically established.
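The two outcomes described above can be illustrated with a plain hash-and-substring check. This is a simplified stand-in, assuming the verifier holds the archived original text and the sealed `doc_hash`; the function name and return labels are hypothetical, and a real deployment would verify a proof rather than read the original directly.

```python
import hashlib


def verify_citation(quoted: str, archived_original: str, sealed_doc_hash: str) -> str:
    """Classify a quoted passage against the original as it existed at response time."""
    actual = hashlib.sha256(archived_original.encode("utf-8")).hexdigest()
    if actual != sealed_doc_hash:
        # The text on hand is not the document version the response was bound to.
        return "original-mismatch"
    if quoted and quoted in archived_original:
        return "verbatim"
    # Covers both paraphrases and fabricated citations.
    return "not-in-original"
```

An exact substring test treats a paraphrase the same as a fabrication; distinguishing the two would need a similarity measure, which is outside this sketch.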

Accountability is determined by proof, not speculation.

Architecture

Lemma's four cryptographic layers correspond to the AI response lifecycle.

1. ENCRYPT — Sealing Citation Elements at Response Generation

At the moment the AI agent generates a response, the retrieved chunks, the generated response text, and the citation spans within the response are encrypted with AES-GCM. Correspondences between elements are recorded without ever exposing the originals in plaintext.
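As a rough illustration of this step, the sketch below encrypts one chunk with AES-GCM using the third-party `cryptography` package. Binding the citation span as associated data is an assumption made here to show how a correspondence can be authenticated without being encrypted; the helper names `seal` and `unseal` are hypothetical, and key management is omitted.

```python
import os

from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM


def seal(key: bytes, plaintext: bytes, context: bytes) -> tuple[bytes, bytes]:
    # A fresh 96-bit nonce per message; it must never repeat under the same key.
    nonce = os.urandom(12)
    # `context` is authenticated but not encrypted (associated data).
    return nonce, AESGCM(key).encrypt(nonce, plaintext, context)


def unseal(key: bytes, nonce: bytes, ciphertext: bytes, context: bytes) -> bytes:
    # Raises InvalidTag if the ciphertext or the context was altered.
    return AESGCM(key).decrypt(nonce, ciphertext, context)


if __name__ == "__main__":
    key = AESGCM.generate_key(bit_length=256)
    nonce, ct = seal(key, b"retrieved chunk text", b"span:17-27")
    print(unseal(key, nonce, ct, b"span:17-27"))
    try:
        unseal(key, nonce, ct, b"span:0-5")
    except InvalidTag:
        print("context mismatch detected")
```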

2. PROVE — Cryptographic Binding of Citations

A ZK circuit binds four elements into a single proof: (a) the AI agent identifier, (b) the hash of the retrieved chunks, (c) the original docHash (already fixed by RAG Content Provenance), and (d) the citation spans in the response text. "What the AI saw and where it cited from" is proven end-to-end.
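The statement such a circuit would attest can be written out as an ordinary check. The sketch below is not a zero-knowledge proof: it evaluates the relation in the clear, with a hypothetical split into public inputs and a private witness. The `response_hash` input and the substring rules are assumptions added for illustration.

```python
import hashlib


def h(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def relation_holds(public: dict, witness: dict) -> bool:
    """Return True if the witness is consistent with the public inputs."""
    chunks = witness["chunks"]
    original = witness["original"]
    response = witness["response"]

    # (a) The agent identifier is a public input the proof is tied to.
    if not public["agent_id"]:
        return False
    # (b) The witness chunks must hash to the public chunk hashes.
    if [h(c) for c in chunks] != public["chunk_hashes"]:
        return False
    # (c) The witness original must hash to the public docHash.
    if h(original) != public["doc_hash"]:
        return False
    if h(response) != public["response_hash"]:
        return False
    # Every retrieved chunk must occur in the original.
    if any(c not in original for c in chunks):
        return False
    # (d) Every citation span must quote its chunk verbatim.
    for start, end, idx in public["spans"]:
        if not (0 <= start < end <= len(response) and 0 <= idx < len(chunks)):
            return False
        if response[start:end] not in chunks[idx]:
            return False
    return True
```

In a real circuit only the public inputs and the proof would leave the prover; the chunks, original, and response text would stay private.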

3. DISCLOSE — Context-Specific Selective Disclosure

Disclosure scope is controlled by the recipient's authority. The client receives citation locations and proof-of-existence for the original; the regulator receives full chunks and retrieval history; a third-party auditor receives only a proof of citation integrity — all enforced with issuer signatures.
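The role-to-scope mapping described here can be sketched as a filter plus an integrity tag. The field names are hypothetical, and the HMAC below is only a standard-library stand-in for an issuer signature; an actual issuer signature would use an asymmetric scheme so that verifiers do not hold the signing key.

```python
import hashlib
import hmac
import json

# Which fields each recipient role may see (mirrors the scopes described above).
DISCLOSURE_POLICY = {
    "client": {"citation_locations", "existence_proof"},
    "regulator": {"chunks", "retrieval_history"},
    "auditor": {"integrity_proof"},
}


def _tag(issuer_key: bytes, role: str, view: dict) -> str:
    payload = json.dumps({"role": role, "view": view}, sort_keys=True).encode("utf-8")
    return hmac.new(issuer_key, payload, hashlib.sha256).hexdigest()


def disclose(record: dict, role: str, issuer_key: bytes) -> dict:
    allowed = DISCLOSURE_POLICY[role]  # unknown roles raise KeyError
    view = {k: v for k, v in record.items() if k in allowed}
    return {"role": role, "view": view, "tag": _tag(issuer_key, role, view)}


def verify_disclosure(disclosure: dict, issuer_key: bytes) -> bool:
    expected = _tag(issuer_key, disclosure["role"], disclosure["view"])
    return hmac.compare_digest(expected, disclosure["tag"])
```

Because the role is part of the tagged payload, a view issued for one role cannot be relabeled as another without invalidating the tag.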

4. PROVENANCE — Permanent Record of the Response Itself

Response timestamp, AI agent identifier, cited source docHash, and response hash are anchored on-chain. Even if the RAG index is rebuilt or the AI agent is replaced, the citation integrity of past responses remains permanently verifiable.
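To make the anchoring idea concrete, here is a small hash-chained log that records the four fields named above. It is a local stand-in for an on-chain anchor, not a blockchain client; `AnchorLog` and its methods are hypothetical names.

```python
import hashlib
import json

GENESIS = "0" * 64


def _hash_body(body: dict) -> str:
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


class AnchorLog:
    """Append-only log in which each entry commits to the previous one."""

    def __init__(self) -> None:
        self.entries: list[dict] = []

    def anchor(self, timestamp: str, agent_id: str, doc_hash: str, response_hash: str) -> str:
        prev = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        body = {
            "timestamp": timestamp,
            "agent_id": agent_id,
            "doc_hash": doc_hash,
            "response_hash": response_hash,
            "prev": prev,
        }
        entry_hash = _hash_body(body)
        self.entries.append({**body, "entry_hash": entry_hash})
        return entry_hash

    def verify(self) -> bool:
        # Recompute every link; any edited field breaks the chain from that point.
        prev = GENESIS
        for entry in self.entries:
            body = {k: v for k, v in entry.items() if k != "entry_hash"}
            if body["prev"] != prev or _hash_body(body) != entry["entry_hash"]:
                return False
            prev = entry["entry_hash"]
        return True
```

Nothing in the log depends on the RAG index, which is why rebuilding the index would not affect the verifiability of earlier entries.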

┌──────────────────────────────────────────────────────────┐
│  AI Agent Response Generation                             │
│  Query → retrieve → chunk selection → response text       │
└───────────────────────┬──────────────────────────────────┘
                        │ Response + citation spans + retrieved chunks

┌──────────────────────────────────────────────────────────┐
│  ENCRYPT (AES-GCM)                                       │
│  • Encrypt retrieved chunk set                            │
│  • Seal citation spans in response text                   │
│  → Record element correspondences without plaintext       │
└───────────────────────┬──────────────────────────────────┘
                        │ Encrypted citation elements

┌──────────────────────────────────────────────────────────┐
│  PROVE (ZK Circuit)                                      │
│  Binding: (a) AI agent identifier (b) chunk hashes        │
│           (c) original docHash (fixed by provenance)       │
│           (d) citation spans in response text              │
│  → Proves end-to-end "what AI saw and where it cited"     │
└───────────────────────┬──────────────────────────────────┘
                        │ ZK proof + citation binding

┌──────────────────────────────────────────────────────────┐
│  DISCLOSE (Selective Disclosure)                          │
│  Client → citation location + proof-of-existence          │
│  Regulator → full chunks + retrieval history              │
│  Third-party auditor → citation integrity proof only      │
└───────────────────────┬──────────────────────────────────┘
                        │ Disclosed attributes

┌──────────────────────────────────────────────────────────┐
│  PROVENANCE (On-chain)                                   │
│  Response timestamp / AI identifier / source docHash      │
│  / response hash                                         │
│  → Citation integrity immutable after RAG rebuild/agent   │
│    replacement                                           │
└──────────────────────────────────────────────────────────┘

Proven Facts

Lemma cryptographically guarantees the following facts in RAG source attestation:

  • Response generation timestamp and AI agent identifier
  • Hash of the retrieved chunk set
  • Cryptographic binding between cited original (docHash) and citation spans in the response
  • Character-level consistency between citation text and original
  • Absence of fabricated citations (citations that do not exist in the original)
  • Citation integrity of past responses, immutable after RAG rebuilds
  • Citation authenticity independently verifiable by third parties without disclosing originals