Randomization Methods to Ensure Data Privacy

Machanavajjhala, Ashwin; Gehrke, Johannes

doi:10.1007/978-0-387-39940-9_301

Synonyms

Perturbation techniques

Definition

Many organizations, e.g., government statistical offices and search engine companies, collect potentially sensitive information about individuals, either to publish the data for research or in return for useful services. While some data collectors, such as census bureaus, are legally required not to breach the privacy of individuals, others may not be trusted to uphold privacy. Hence, if U denotes the original data containing sensitive information about a set of individuals, an untrusted data collector or researcher should only have access to an anonymized version of the data, U*, that does not disclose the sensitive information about those individuals. A randomized anonymization algorithm R is said to be a privacy-preserving randomization method if, for every table T and every output T* = R(T), the privacy of all the sensitive information of each individual in the original data is provably...
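
As a minimal illustration of a randomized algorithm R that maps a table T to an output T* = R(T), the following Python sketch applies Warner's randomized response (listed under Recommended Reading) to a column of binary sensitive values. The function names, the retention probability p, and the synthetic data are assumptions made for illustration; they are not taken from the entry.

```python
import random


def randomized_response(bits, p, rng):
    """Return a perturbed copy T* of the sensitive bits T.

    Each bit is kept with probability p and flipped otherwise,
    independently of every other record.
    """
    return [b if rng.random() < p else 1 - b for b in bits]


def estimate_fraction(perturbed, p):
    """Estimate the fraction of 1s in the original data from T* alone."""
    if p == 0.5:
        raise ValueError("p = 0.5 makes the output independent of the input")
    observed = sum(perturbed) / len(perturbed)
    # E[observed] = p*pi + (1 - p)*(1 - pi), solved for the true fraction pi.
    return (observed - (1 - p)) / (2 * p - 1)


# Hypothetical data: 30% of 10,000 individuals hold the sensitive value 1.
rng = random.Random(0)
original = [1] * 3000 + [0] * 7000
perturbed = randomized_response(original, p=0.75, rng=rng)
estimate = estimate_fraction(perturbed, p=0.75)
print(f"estimated fraction of 1s: {estimate:.3f}")
```

In this sketch an observer of a single perturbed value cannot be sure of the original one, since either input could have produced it (with likelihood ratio p / (1 - p), here 3), while the aggregate fraction can still be estimated from T*. Values of p closer to 0.5 hide more about each individual but make the estimate noisier.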

Recommended Reading

Adam N.R. and Wortmann J.C. Security-control methods for statistical databases: a comparative study. ACM Comput. Surv., 21(4):515–556, 1989.
Agrawal R. and Srikant R. Privacy preserving data mining. In Proc. ACM SIGMOD Int. Conf. on Management of Data, 2000, pp. 439–450.
Agrawal S. and Haritsa J.R. A framework for high-accuracy privacy-preserving mining. In Proc. 21st Int. Conf. on Data Engineering, 2005, pp. 193–204.
Barak B., Chaudhuri K., Dwork C., Kale S., McSherry F., and Talwar K. Privacy, accuracy and consistency too: a holistic solution to contingency table release. In Proc. 26th ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, 2007.
Blum A., Dwork C., McSherry F., and Nissim K. Practical privacy: the SuLQ framework. In Proc. 24th ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, 2005, pp. 128–138.
Dwork C., McSherry F., Nissim K., and Smith A. Calibrating noise to sensitivity in private data analysis. In Proc. 3rd Theory of Cryptography Conf., 2006, pp. 265–284.
Evfimievski A., Gehrke J., and Srikant R. Limiting privacy breaches in privacy preserving data mining. In Proc. 22nd ACM SIGACT-SIGMOD-SIGART Symp. on Principles of Database Systems, 2003, pp. 211–222.
Evfimievski A., Srikant R., Gehrke J., and Agrawal R. Privacy preserving data mining of association rules. In Proc. 8th ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining, 2002, pp. 217–228.
Huang Z., Du W., and Chen B. Deriving private information from randomized data. In Proc. 23rd ACM SIGMOD Conf. on Management of Data, 2004.
Kargupta H., Datta S., Wang Q., and Sivakumar K. On the privacy preserving properties of random data perturbation techniques. In Proc. 2003 IEEE Int. Conf. on Data Mining, 2003, pp. 99–106.
Kifer D. and Gehrke J. Injecting utility into anonymized datasets. In Proc. ACM SIGMOD Int. Conf. on Management of Data, 2006.
Machanavajjhala A., Kifer D., Abowd J., Gehrke J., and Vilhuber L. Privacy: from theory to practice on the map. In Proc. 24th Int. Conf. on Data Engineering, 2008.
On The Map (Version 2) http://lehdmap2.dsd.census.gov/.
Rastogi V., Suciu D., and Hong S. The Boundary Between Privacy and Utility in Data Publishing. Tech. rep., University of Washington, 2007.
Reiter J. Estimating risks of identification disclosure for microdata. J. Am. Stat. Assoc., 100:1103–1113, 2005.
Rubin D.B. Discussion: statistical disclosure limitation. J. Off. Stat., 9(2):461–468, 1993.
Warner S.L. Randomized response: a survey technique for eliminating evasive answer bias. J. Am. Stat. Assoc., 60(309):63–69, 1965.

Author information

Authors and Affiliations

Cornell University, Ithaca, NY, USA
Ashwin Machanavajjhala & Johannes Gehrke

Editor information

Editors and Affiliations

College of Computing, Georgia Institute of Technology, 266 Ferst Drive, 30332-0765, Atlanta, GA, USA
LING LIU (Professor)
Database Research Group David R. Cheriton School of Computer Science, University of Waterloo, 200 University Avenue West, N2L 3G1, Waterloo, ON, Canada
M. TAMER ÖZSU (Professor and Director, University Research Chair)

About this entry

Cite this entry

Machanavajjhala, A., Gehrke, J. (2009). Randomization Methods to Ensure Data Privacy. In: LIU, L., ÖZSU, M.T. (eds) Encyclopedia of Database Systems. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-39940-9_301

DOI: https://doi.org/10.1007/978-0-387-39940-9_301
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-35544-3
Online ISBN: 978-0-387-39940-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
