A Neural-Network Clustering-Based Algorithm for Privacy Preserving Data Mining

  • Conference paper
Grid and Distributed Computing, Control and Automation (GDC 2010, CA 2010)

Abstract

The increasing use of fast and efficient data mining algorithms on huge collections of personal data, facilitated by the exponential growth of technology, in particular electronic data storage media and processing power, has raised serious ethical, philosophical and legal issues related to privacy protection. To cope with these concerns, several privacy preserving methodologies have been proposed. They fall into two categories: methodologies that aim at protecting the sensitive data and those that aim at protecting the mining results. In our work, we focus on sensitive data protection and compare existing techniques according to the anonymity degree they achieve, the information loss they incur and their performance characteristics. The ℓ-diversity principle is combined with k-anonymity concepts, so that background information cannot be exploited to successfully attack the privacy of the data subjects to whom the data refer. Based on Kohonen Self-Organizing Feature Maps (SOMs), we first organize data sets in subspaces according to their information-theoretical distance to each other, then create the most relevant classes, paying special attention to rare sensitive attribute values, and finally generalize attribute values to the minimum extent required, so that both the data disclosure probability and the information loss are kept as small as possible. Furthermore, we propose information-theoretical measures for assessing the anonymity degree achieved and empirical tests to demonstrate it.
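To make the interplay of k-anonymity and ℓ-diversity concrete, the following is a minimal Python sketch. It is not the authors' SOM-based algorithm, which the abstract does not specify in detail. It only illustrates the two properties on a toy table: records are generalized, grouped into equivalence classes by their quasi-identifiers, and each class is checked for size (k-anonymity) and for variety of sensitive values (distinct ℓ-diversity). The Shannon entropy of the sensitive attribute within a class is included as one possible information-theoretical indicator. All function names, the attribute names (`age`, `zip`, `disease`) and the generalization rules are assumptions made for illustration.

```python
from collections import Counter, defaultdict
from math import log2


def generalize(record, age_width=10, zip_prefix=3):
    """Hypothetical generalization: bucket the age, mask the ZIP code suffix."""
    low = (record["age"] // age_width) * age_width
    masked = record["zip"][:zip_prefix] + "*" * (len(record["zip"]) - zip_prefix)
    return {**record, "age": f"{low}-{low + age_width - 1}", "zip": masked}


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[q] for q in quasi_identifiers)].append(rec)
    return dict(groups)


def is_k_anonymous(groups, k):
    """Every equivalence class must contain at least k records."""
    return all(len(g) >= k for g in groups.values())


def is_distinct_l_diverse(groups, sensitive, l):
    """Every class must contain at least l distinct sensitive values."""
    return all(len({r[sensitive] for r in g}) >= l for g in groups.values())


def sensitive_entropy(group, sensitive):
    """Shannon entropy (in bits) of the sensitive attribute within one class."""
    counts = Counter(r[sensitive] for r in group)
    n = len(group)
    return -sum(c / n * log2(c / n) for c in counts.values())


# Toy data (invented for this illustration only).
records = [
    {"age": 23, "zip": "13053", "disease": "flu"},
    {"age": 27, "zip": "13068", "disease": "cancer"},
    {"age": 34, "zip": "14850", "disease": "flu"},
    {"age": 38, "zip": "14853", "disease": "flu"},
]
groups = equivalence_classes([generalize(r) for r in records], ["age", "zip"])

print(is_k_anonymous(groups, 2))                    # True
print(is_distinct_l_diverse(groups, "disease", 2))  # False
```

In this toy table the generalized data is 2-anonymous, yet the second class contains only one sensitive value, so anyone who knows a person falls into that class learns the diagnosis. This is the kind of disclosure that adding an ℓ-diversity requirement on top of k-anonymity is meant to prevent.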

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Tsiafoulis, S., Zorkadis, V.C., Karras, D.A. (2010). A Neural-Network Clustering-Based Algorithm for Privacy Preserving Data Mining. In: Kim, Th., Yau, S.S., Gervasi, O., Kang, BH., Stoica, A., Ślęzak, D. (eds) Grid and Distributed Computing, Control and Automation. GDC 2010, CA 2010. Communications in Computer and Information Science, vol 121. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-17625-8_27

  • DOI: https://doi.org/10.1007/978-3-642-17625-8_27

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-17624-1

  • Online ISBN: 978-3-642-17625-8

  • eBook Packages: Computer Science (R0)
