Abstract
In this paper we study the problem of anonymity in multi-instance (MI) micro-data publication. We show that the classical k-anonymity approach is insufficient or inappropriate for MI databases, and we extend it to a more general setting, MI k-anonymity. We show that the MI k-anonymity problem is NP-hard and that the attack model for MI databases differs from that of single-instance databases. We also observe that MI k-anonymity is not a strong privacy guarantee when anonymity sets are highly unbalanced with respect to instance counts. To address this, we introduce a new anonymity principle, called p-certainty, which is unique to the MI case. A clustering algorithm that enforces the p-certainty principle is developed and experimentally evaluated.
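The imbalance observation can be illustrated with a minimal Python sketch. The abstract does not give formal definitions, so everything here is an assumption made for illustration: an anonymity set is represented as a list of individual ids with one entry per published instance, MI k-anonymity is read as "at least k distinct individuals per anonymity set", and the instance share computed below is only a rough proxy for imbalance, not the paper's p-certainty measure. The function names are hypothetical.

```python
from collections import Counter


def is_mi_k_anonymous(anonymity_set, k):
    """Assumed reading of MI k-anonymity: the set covers at least k distinct individuals."""
    return len(set(anonymity_set)) >= k


def instance_shares(anonymity_set):
    """Fraction of the set's instances contributed by each individual (illustrative proxy)."""
    counts = Counter(anonymity_set)
    total = sum(counts.values())
    return {person: n / total for person, n in counts.items()}


# Two hypothetical anonymity sets, each with three individuals.
balanced = ["a", "b", "c", "a", "b", "c"]  # two instances per individual
skewed = ["a"] * 8 + ["b", "c"]            # individual "a" owns 8 of 10 instances

for name, group in [("balanced", balanced), ("skewed", skewed)]:
    shares = instance_shares(group)
    print(name, is_mi_k_anonymous(group, 3), max(shares.values()))
```

Under these assumptions both sets pass the k = 3 check, yet in the skewed set a randomly chosen instance belongs to individual "a" with high probability, which is the kind of situation the abstract describes as a weak guarantee.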
Copyright information
© 2013 Springer International Publishing Switzerland
About this paper
Cite this paper
Abul, O. (2013). Anonymity in Multi-Instance Micro-Data Publication. In: Gelenbe, E., Lent, R. (eds) Information Sciences and Systems 2013. Lecture Notes in Electrical Engineering, vol 264. Springer, Cham. https://doi.org/10.1007/978-3-319-01604-7_32
DOI: https://doi.org/10.1007/978-3-319-01604-7_32
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-01603-0
Online ISBN: 978-3-319-01604-7
eBook Packages: Computer Science (R0)