Definition
Scope varies by jurisdiction. GDPR uses "personal data", CCPA uses "personal information", and Japan's APPI uses 個人情報 ("personal information"); the definitions do not line up exactly. The common floor includes name, address, contact details, identifier numbers (national ID, PPID, PAN, ...), biometrics, device identifiers, location, communication records, and transaction history. GDPR additionally defines special categories of personal data (sensitive PII: health, racial or ethnic origin, religion, biometric data, sexual orientation, and so on) and requires stronger protection for them.
Regulators place two obligations on the operator at once: minimize what is collected, and protect what is stored. The two are in tension, because strengthening storage protection does not remove the breach surface as long as the original data is held. The structural risk is that stored original PII remains both attack surface and regulatory exposure for as long as it is retained.
Adjacent vocabulary: PHI (Protected Health Information, HIPAA) for healthcare, PAN (Primary Account Number, PCI DSS) for payments. This entry treats PII as the parent concept; the adjacent regulations have their own scope and are covered in their own entries.
Lemma implementation
Take the operator out of the position of holding original PII, and let only the required attribute travel, in the form of a proof. The issuer signs the original; the holder discloses predicates ("over 18", "KYC passed", "Japan resident", ...) via selective disclosure; the verifier sees only the attribute. Three layers, and the original never moves.
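The issuer / holder / verifier flow can be sketched with salted-hash selective disclosure. This is a minimal illustration, not the entry's actual mechanism, which the text does not specify. It assumes the issuer has already evaluated each predicate and issues it as a boolean or string claim; the function names (`issue`, `present`, `verify`) and claim names are hypothetical. HMAC stands in for the issuer's signature only so the sketch runs on the standard library; a real deployment would use an asymmetric signature.

```python
import hashlib
import hmac
import json
import secrets
from typing import Optional

# Hypothetical demo key. HMAC is a stand-in for the issuer's signature;
# with a real asymmetric scheme the verifier would hold only a public key.
ISSUER_KEY = secrets.token_bytes(32)


def _digest(salt: str, name: str, value) -> str:
    """Salted hash of one claim; the salt prevents guessing hidden values."""
    payload = json.dumps([salt, name, value], separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def _sign(digests: list) -> str:
    return hmac.new(ISSUER_KEY, json.dumps(digests).encode("utf-8"),
                    hashlib.sha256).hexdigest()


def issue(claims: dict):
    """Issuer: sign the sorted claim digests; give the salts to the holder."""
    salts = {name: secrets.token_hex(16) for name in claims}
    digests = sorted(_digest(salts[n], n, v) for n, v in claims.items())
    credential = {"digests": digests, "signature": _sign(digests)}
    return credential, salts


def present(credential: dict, salts: dict, claims: dict, reveal: list) -> dict:
    """Holder: disclose only the chosen claims, each with its salt."""
    disclosed = [[salts[n], n, claims[n]] for n in reveal]
    return {"credential": credential, "disclosed": disclosed}


def verify(presentation: dict) -> Optional[dict]:
    """Verifier: check the signature, then match disclosures to signed digests."""
    cred = presentation["credential"]
    if not hmac.compare_digest(_sign(cred["digests"]), cred["signature"]):
        return None
    accepted = {}
    for salt, name, value in presentation["disclosed"]:
        if _digest(salt, name, value) not in cred["digests"]:
            return None
        accepted[name] = value
    return accepted


# Usage: the holder reveals one predicate; the rest stays hidden.
claims = {"over_18": True, "kyc_passed": True, "residency": "JP"}
credential, salts = issue(claims)
presentation = present(credential, salts, claims, reveal=["over_18"])
print(verify(presentation))
```

The verifier learns the disclosed attribute and a list of opaque digests. Undisclosed claims cannot be recovered from the digests without their salts, and a disclosure that was altered after issuance fails the digest check.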
The original stays with the issuer; it never reaches the verifier, the recipient, or the AI inference path. It sits encrypted with AES-GCM; the only values that touch the proof circuit are the docHash and the attribute commitment. When a breach happens, the surface that can leak is structurally smaller.
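As a rough illustration of what "docHash and attribute commitment" could look like, the sketch below derives both with SHA-256. Everything here is an assumption: the text does not say how the docHash or the commitment is computed, and proof circuits often use a circuit-friendly hash rather than SHA-256. The AES-GCM encryption of the original is left out because it needs a third-party library; the point shown is only that the values exposed to the circuit do not contain the original.

```python
import hashlib
import hmac
import secrets


def doc_hash(document: bytes) -> str:
    """Fingerprint of the original document (assumed meaning of docHash)."""
    return hashlib.sha256(document).hexdigest()


def commit(attribute: str, blinding: bytes) -> str:
    """Hash commitment: binds to the attribute and hides it while the
    fixed-length random blinding value stays secret."""
    return hashlib.sha256(blinding + attribute.encode("utf-8")).hexdigest()


def open_commitment(commitment: str, attribute: str, blinding: bytes) -> bool:
    """Check that a commitment opens to the claimed attribute."""
    return hmac.compare_digest(commitment, commit(attribute, blinding))


# The original stays with the issuer (encrypted at rest in the described design).
original = b"hypothetical identity document bytes"
blinding = secrets.token_bytes(32)

# Only these two values would be exposed to the circuit.
public_inputs = {
    "docHash": doc_hash(original),
    "commitment": commit("over_18=true", blinding),
}
print(sorted(public_inputs))
```

Opening the commitment requires both the attribute and the blinding value, so a leak of `public_inputs` alone reveals neither the document nor the attribute.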
The pattern fits anywhere a regulation simultaneously demands attribute verification and data minimization, such as KYC/AML, age checks, and residency proofs. It also shows up as an alternative path for EU AI Act high-risk inputs, GDPR Article 8, and Japan's "anonymized information" pattern under APPI.