A k-Anonymised Federated Learning Framework with Decision Trees

  • Conference paper
Data Privacy Management, Cryptocurrencies and Blockchain Technology (DPM 2021, CBT 2021)

Abstract

We propose a privacy-preserving framework that combines Mondrian k-anonymity with decision trees in a Federated Learning (FL) setting for horizontally partitioned data. Data heterogeneity in FL makes the data non-IID (non-independent and identically distributed). We use a novel approach to create non-IID partitions of the data by solving an optimization problem. In this work, each device trains a decision tree classifier and shares the root node of its tree with the aggregator. The aggregator merges the trees by choosing the most common split attribute and grows the branches based on the split values of the chosen attribute. This recursive process stops when all the nodes to be merged are leaf nodes. After merging, the aggregator sends the merged decision tree back to the distributed devices. In this way, we aim to build a joint machine learning model from the data of multiple devices while offering k-anonymity to the participants.
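The merging step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Node` class, the `merge` and `predict` functions, the majority vote at the leaves, and the rule that trees not using the chosen attribute are carried unchanged into every branch are all assumptions made here for illustration, and only categorical attributes are handled.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Node:
    """Hypothetical categorical tree node: a leaf carries `label`,
    an internal node carries `attribute` and one child per split value."""
    attribute: Optional[str] = None
    children: Dict[str, "Node"] = field(default_factory=dict)
    label: Optional[str] = None

    def is_leaf(self) -> bool:
        return self.attribute is None


def merge(nodes: List[Node]) -> Node:
    """Recursively merge the nodes received from the devices."""
    # Stop when all nodes to be merged are leaves.
    # Resolving the leaves by majority vote is an assumption of this sketch.
    if all(n.is_leaf() for n in nodes):
        return Node(label=Counter(n.label for n in nodes).most_common(1)[0][0])

    # Choose the most common split attribute among the internal nodes.
    counts = Counter(n.attribute for n in nodes if not n.is_leaf())
    chosen = counts.most_common(1)[0][0]
    matching = [n for n in nodes if n.attribute == chosen]
    # Assumption: nodes that do not split on `chosen` (including leaves)
    # are passed unchanged into every branch so that they still contribute.
    others = [n for n in nodes if n.attribute != chosen]

    # Grow one branch per split value of the chosen attribute.
    merged = Node(attribute=chosen)
    for value in sorted({v for n in matching for v in n.children}):
        branch = [n.children[value] for n in matching if value in n.children]
        merged.children[value] = merge(branch + others)
    return merged


def predict(node: Node, record: Dict[str, str]) -> Optional[str]:
    """Follow the record's attribute values down to a leaf."""
    while not node.is_leaf():
        node = node.children[record[node.attribute]]
    return node.label


# Three toy device trees; attribute names and labels are invented.
by_age = lambda: Node("age", {"young": Node(label="no"), "old": Node(label="yes")})
by_income = Node("income", {"low": Node(label="no"), "high": Node(label="yes")})
merged = merge([by_age(), by_age(), by_income])
```

In this toy run two of the three devices split on `age`, so the merged root splits on `age` and the `income` tree is pushed down into both branches. The recursion terminates because every call replaces at least one node by one of its children.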

This study was partially funded by the Wallenberg AI, Autonomous Systems and Software Program (WASP), which is funded by the Knut and Alice Wallenberg Foundation.

Notes

  1. https://github.com/Nuclearstar/K-Anonymity

Author information

Corresponding author

Correspondence to Saloni Kwatra.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Kwatra, S., Torra, V. (2022). A k-Anonymised Federated Learning Framework with Decision Trees. In: Garcia-Alfaro, J., Muñoz-Tapia, J.L., Navarro-Arribas, G., Soriano, M. (eds) Data Privacy Management, Cryptocurrencies and Blockchain Technology. DPM 2021, CBT 2021. Lecture Notes in Computer Science, vol 13140. Springer, Cham. https://doi.org/10.1007/978-3-030-93944-1_7

  • DOI: https://doi.org/10.1007/978-3-030-93944-1_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-93943-4

  • Online ISBN: 978-3-030-93944-1

  • eBook Packages: Computer Science, Computer Science (R0)