Anonymization: The imperfect science of using data while preserving privacy
Abstract
Information about us, our actions, and our preferences is created at scale through surveys or scientific studies or as a result of our interaction with digital devices such as smartphones and fitness trackers. The ability to safely share and analyze such data is key for scientific and societal progress. Anonymization is considered by scientists and policy-makers as one of the main ways to share data while minimizing privacy risks. In this review, we offer a pragmatic perspective on the modern literature on privacy attacks and anonymization techniques. We discuss traditional de-identification techniques and their strong limitations in the age of big data. We then turn our attention to modern approaches to share anonymous aggregate data, such as data query systems, synthetic data, and differential privacy. We find that, although no perfect solution exists, applying modern techniques while auditing their guarantees against attacks is the best approach to safely use and share data today.