ContextLeak: Auditing Leakage in Private In-Context Learning

ContextLeak is an auditing framework for empirically measuring information leakage in private in-context learning methods.

The project studies whether sensitive information contained in in-context examples can be leaked through model outputs, even when privacy-preserving mechanisms or heuristic defenses are applied. We use canary insertion and targeted adversarial queries to evaluate leakage across different mechanisms and model settings.

Keywords: Large language models, privacy auditing, in-context learning, differential privacy, trustworthy AI

Links: arXiv