SPARK-Based Partitioning Algorithm for k-Anonymization of Large RDFs

Temuujin, Odsuren; Jeon, Minhyuk; Seo, Kwangwon; Ahn, Jinhyun; Im, Dong-Hyuk

doi:10.1007/978-981-32-9244-4_41

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 590)

Abstract

Privacy protection for Resource Description Framework (RDF) data is important because RDF (i.e., linked data) is widely used as a data publishing format in many areas, including government open data, individual health care, and social relationships. Because published data can include private information belonging to individuals or companies and can expose that information to third parties, several anonymization models have been developed to preserve privacy in practice. Among them, k-anonymity has gained particular attention in research, and several RDF anonymization models have recently been proposed. However, current approaches focus on the anonymization model and on metrics for measuring information loss, and do not consider large-scale RDF data. In this paper, we propose an efficient anonymization method for large-scale RDF data. We develop a greedy partitioning algorithm for RDF anonymization that runs on SPARK, a leading platform for big data processing. The results of experiments on synthetic datasets demonstrate that our proposed method requires less running time than previous methods.
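
To make the idea of greedy partitioning for k-anonymity concrete, the following is a minimal single-machine sketch in Python. It is not the authors' SPARK implementation, which this page does not describe in detail. It assumes that each RDF subject has already been flattened into a record with numeric quasi-identifier values, and it uses a Mondrian-style median split as a stand-in for the greedy step. The names `value_range`, `greedy_partition`, and `generalize` are hypothetical.

```python
from typing import Dict, List, Sequence

# Assumption: one record per RDF subject, with numeric quasi-identifier values.
Record = Dict[str, float]


def value_range(records: List[Record], attr: str) -> float:
    """Width of the value range of one attribute within a group."""
    values = [r[attr] for r in records]
    return max(values) - min(values)


def greedy_partition(records: List[Record], qi: Sequence[str], k: int) -> List[List[Record]]:
    """Recursively split at the median of the widest quasi-identifier.

    A group is split only if both halves keep at least k records,
    so every returned group satisfies the k-anonymity size constraint
    (provided the input itself has at least k records).
    """
    if len(records) < 2 * k:
        return [records]  # too small to split into two groups of size >= k
    attr = max(qi, key=lambda a: value_range(records, a))
    if value_range(records, attr) == 0:
        return [records]  # all records already agree on every quasi-identifier
    ordered = sorted(records, key=lambda r: r[attr])
    mid = len(ordered) // 2
    return (greedy_partition(ordered[:mid], qi, k)
            + greedy_partition(ordered[mid:], qi, k))


def generalize(group: List[Record], qi: Sequence[str]) -> List[dict]:
    """Replace each quasi-identifier value with the group's (min, max) range."""
    ranges = {a: (min(r[a] for r in group), max(r[a] for r in group)) for a in qi}
    return [{**r, **ranges} for r in group]
```

Calling `greedy_partition(records, ["age", "zip"], k=2)` returns groups of at least two records each, and `generalize` then makes the records within a group indistinguishable on the chosen attributes. In a distributed setting the per-group splits are independent of one another, which is what makes this style of partitioning a natural candidate for parallel execution.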

Acknowledgement

This work was supported by the MSIT (Ministry of Science and ICT), Korea, under the ITRC (Information Technology Research Center) support program (IITP-2018-01417) supervised by the IITP (Institute for Information & Communications Technology Promotion) and IITP grant funded by the Korea government (MSIP) (No. R0113-15-0005, Development of a Unified Data Engineering Technology for Largescale Transaction Processing and Real-Time Complex Analytics) and Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (No. NRF-2018R1D1A1B07048380).

Author information

Authors and Affiliations

Department of Computer Engineering, Hoseo University, Asan, South Korea
Odsuren Temuujin, Minhyuk Jeon, Kwangwon Seo & Dong-Hyuk Im
Department of Management Information Systems, Jeju National University, Jeju, South Korea
Jinhyun Ahn

Corresponding author

Correspondence to Dong-Hyuk Im.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul, Korea (Republic of)
James J. Park
Department of Computer Science, St. Francis Xavier University, Antigonish, NS, Canada
Laurence T. Yang
Department of Multimedia Engineering, Dongguk University, Seoul, Korea (Republic of)
Young-Sik Jeong
School of Computer Science, Shaanxi Normal University, Xi'an, China
Fei Hao

About this paper

Cite this paper

Temuujin, O., Jeon, M., Seo, K., Ahn, J., Im, DH. (2020). SPARK-Based Partitioning Algorithm for k-Anonymization of Large RDFs. In: Park, J., Yang, L., Jeong, YS., Hao, F. (eds) Advanced Multimedia and Ubiquitous Engineering. MUE FutureTech 2019. Lecture Notes in Electrical Engineering, vol 590. Springer, Singapore. https://doi.org/10.1007/978-981-32-9244-4_41

DOI: https://doi.org/10.1007/978-981-32-9244-4_41
Published: 22 August 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-32-9243-7
Online ISBN: 978-981-32-9244-4
eBook Packages: Engineering, Engineering (R0)
