An information theorist's tour of differential privacy
Sarwate, Calmon, Kosut et al.
Since being proposed in 2006, differential privacy has become a standard method for quantifying certain risks in publishing or sharing analyses of sensitive data. At its heart, differential privacy measures risk in terms of the differences between probability distributions, which is a central topic in information theory. A differentially private algorithm is a channel between the underlying data and the output of the analysis. Seen in this way, the guarantees made by differential privacy can be understood in terms of properties of this channel. In this article we examine a few of the key connections between information theory and the formulation/application of differential privacy, giving an "operational significance" for relevant information measures.
Since its introduction in 2006, differential privacy has become a standard method for quantifying certain risks in publishing or sharing analyses of sensitive data. At its core, differential privacy measures risk through the differences between probability distributions, a central theme in information theory. A differentially private algorithm acts as a channel between the underlying data and the output of the analysis. From this perspective, the guarantees provided by differential privacy can be understood through the properties of this channel. The paper examines several key connections between information theory and the formulation and application of differential privacy, giving an "operational significance" to the relevant information measures.
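To make the channel view concrete, here is a minimal sketch using binary randomized response as a toy mechanism. The choice of mechanism, the function names, and the single-bit "database" are illustrative assumptions and are not taken from the paper. The channel is written as a matrix W[x][y] = P(output = y | input = x), and the pure differential privacy level is read off as the largest log-likelihood ratio between any two rows.

```python
import math

def randomized_response_channel(epsilon):
    """Return the 2x2 channel matrix W[x][y] = P(output = y | input = x).

    Hypothetical toy mechanism: report the true bit with probability
    exp(epsilon) / (1 + exp(epsilon)), otherwise report the flipped bit.
    """
    p_truth = math.exp(epsilon) / (1.0 + math.exp(epsilon))
    return [[p_truth, 1.0 - p_truth],
            [1.0 - p_truth, p_truth]]

def max_log_likelihood_ratio(channel):
    """Smallest eps with W[x][y] <= exp(eps) * W[x'][y] for all x, x', y.

    Assumes every entry of the channel matrix is strictly positive.
    """
    worst = 0.0
    for row_a in channel:
        for row_b in channel:
            for p, q in zip(row_a, row_b):
                worst = max(worst, math.log(p / q))
    return worst

W = randomized_response_channel(1.0)
eps_hat = max_log_likelihood_ratio(W)  # recovers the design parameter 1.0 up to rounding
```

Treating the two input bits as neighboring datasets, the privacy guarantee is a property of the rows of W alone, which is the sense in which the mechanism can be analyzed as a channel.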
Privacy Protection Requirements: In the big data era, publishing useful data analysis results while protecting individual privacy has become a critical challenge
Lack of Theoretical Foundation: Existing privacy protection methods lack rigorous theoretical foundations and operational methods for quantifying risks
Cross-disciplinary Connections: Deep connections exist between differential privacy and information theory, but systematic theoretical analysis is lacking
Theoretical Framework Establishment: Systematically elucidates the connections between differential privacy and information theory, viewing differential privacy algorithms as channels
Hypothesis Testing Perspective: Reinterprets differential privacy definitions from a hypothesis testing viewpoint, providing operational understanding
f-Divergence Theory Application: Analyzes the relationship between f-divergences and differential privacy, in particular the hockey-stick divergence
Privacy Accounting Methods: Summarizes composition analysis methods based on the privacy loss distribution (PLD)
Mechanism Optimization Theory: Provides an information-theoretic framework and concrete algorithms for differential privacy mechanism optimization
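The hockey-stick divergence and PLD items above can be illustrated with a small numerical sketch. It relies on two standard facts: a mechanism is (epsilon, delta)-differentially private exactly when the hockey-stick divergence of order exp(epsilon) between its output distributions is at most delta for every ordered pair of neighboring inputs, and privacy losses add under composition, so their distributions convolve. The toy distributions P and Q, the function names, and the rounding used to merge loss values are assumptions made for illustration, not the paper's algorithms.

```python
import math
from collections import defaultdict

def hockey_stick(p, q, gamma):
    """E_gamma(P || Q) = sum_y max(P(y) - gamma * Q(y), 0) for discrete P, Q."""
    return sum(max(pi - gamma * qi, 0.0) for pi, qi in zip(p, q))

def privacy_loss_distribution(p, q):
    """Distribution of L = log(P(y) / Q(y)) when y ~ P.

    Assumes Q(y) > 0 wherever P(y) > 0, so the loss is always finite.
    Losses are rounded so that equal values share one dictionary key.
    """
    pld = defaultdict(float)
    for pi, qi in zip(p, q):
        if pi > 0.0:
            pld[round(math.log(pi / qi), 12)] += pi
    return dict(pld)

def compose(pld_a, pld_b):
    """Compose two mechanisms: losses add, so the two PLDs convolve."""
    out = defaultdict(float)
    for loss_a, prob_a in pld_a.items():
        for loss_b, prob_b in pld_b.items():
            out[round(loss_a + loss_b, 12)] += prob_a * prob_b
    return dict(out)

def delta_from_pld(pld, epsilon):
    """delta(epsilon) = E[max(1 - exp(epsilon - L), 0)] over the privacy loss L."""
    return sum(prob * (1.0 - math.exp(epsilon - loss))
               for loss, prob in pld.items() if loss > epsilon)

# Hypothetical output distributions on two neighboring inputs
# (binary randomized response with parameter epsilon = 1).
e = math.e
P = [e / (1 + e), 1 / (1 + e)]
Q = [1 / (1 + e), e / (1 + e)]

pld_one = privacy_loss_distribution(P, Q)
pld_two = compose(pld_one, pld_one)       # two independent uses of the mechanism
delta_two = delta_from_pld(pld_two, 1.0)  # delta at epsilon = 1 after composition

# Cross-check against the hockey-stick divergence of the product distributions.
P2 = [a * b for a in P for b in P]
Q2 = [a * b for a in Q for b in Q]
delta_two_direct = hockey_stick(P2, Q2, math.exp(1.0))
```

Here delta_two and delta_two_direct agree up to rounding. The PLD route scales to many compositions because it convolves one-dimensional loss distributions instead of enumerating the product output space. A full accountant would also need the PLD with P and Q swapped and a treatment of infinite losses, both omitted in this sketch.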
This paper provides a comprehensive information-theoretic perspective on differential privacy and represents an important theoretical contribution to the field. By viewing differential privacy algorithms as channels, the authors successfully apply information-theoretic tools to analyze and optimize privacy mechanisms, providing valuable insights for both theoretical research and practical applications.