Lemma Critical Brief · Category archive
Training Data Provenance
Structures in which the origin and permitted-use attributes of AI training data pass downstream without independent verification: public-API scraping of chat platforms, dataset distribution outside the licensed scope, and gaps in AI training-data audit layers.
1 Brief
Discord 2.05 Billion Message Scraping via Public API
How Public Channel Data Gets Redistributed as AI Training Datasets