Abstract
Federated learning (FL) is a new machine learning paradigm in which many clients collaboratively train a shared model without uploading their local data to the server. Non-IID data across clients is a major challenge for FL systems, because the distributed machine learning framework that FL inherits was designed for IID data across clients. Clustered FL addresses the non-IID challenge by clustering clients within the FL process. However, existing clustered FL methods struggle to handle client-wise outliers, which may come from minority clients with abnormal behaviour patterns or from malicious clients. This paper proposes a novel Federated learning framework with Robust Clustering (FedRoC) to tackle client-wise outliers in the FL system. Specifically, we develop a robust federated aggregation operator based on a bootstrap median-of-means mechanism, which yields a higher breakdown point and therefore tolerates a larger proportion of outliers. We formulate the proposed FL framework as a bi-level optimization problem and solve it with a stochastic expectation-maximization method that updates in an alternating manner, handling the EM steps and distributed computing simultaneously. Experiments on three benchmark datasets demonstrate the effectiveness of the proposed method, which outperforms the baseline methods on the evaluation criteria.
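To give a feel for why a bootstrap median-of-means aggregator tolerates outlying clients, the following is a minimal sketch and not the FedRoC operator itself. It assumes each client update is a flat vector, draws bootstrap blocks of clients with replacement, averages each block, and takes the coordinate-wise median of the block means. The function name `bootstrap_mom_aggregate`, its parameters, and the toy data are all hypothetical.

```python
import numpy as np


def bootstrap_mom_aggregate(updates, n_blocks=51, block_size=2, seed=0):
    """Coordinate-wise median of bootstrap block means (illustrative only)."""
    updates = np.asarray(updates, dtype=float)
    rng = np.random.default_rng(seed)
    n_clients = updates.shape[0]
    # Each row of idx is one bootstrap block: client indices drawn with replacement.
    idx = rng.integers(0, n_clients, size=(n_blocks, block_size))
    # Average the updates inside each block -> shape (n_blocks, dim).
    block_means = updates[idx].mean(axis=1)
    # As long as most blocks contain no outlier, the median lands on a clean block mean.
    return np.median(block_means, axis=0)


# Hypothetical toy data: 18 well-behaved clients near 1.0 and 2 outliers at 100.0.
rng = np.random.default_rng(42)
honest = 1.0 + 0.01 * rng.standard_normal((18, 4))
outliers = np.full((2, 4), 100.0)
updates = np.vstack([honest, outliers])

robust = bootstrap_mom_aggregate(updates)
plain = updates.mean(axis=0)
print("median-of-means:", robust)
print("plain average:  ", plain)
```

In this toy setting the plain average is pulled far from 1.0 by the two outliers, while the median of block means stays close to the well-behaved clients. Small blocks keep most blocks outlier-free, which is what lets the median ignore the contaminated ones.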
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Xie, M., Ma, J., Long, G., Zhang, C. (2023). Robust Clustered Federated Learning with Bootstrap Median-of-Means. In: Li, B., Yue, L., Tao, C., Han, X., Calvanese, D., Amagasa, T. (eds) Web and Big Data. APWeb-WAIM 2022. Lecture Notes in Computer Science, vol 13421. Springer, Cham. https://doi.org/10.1007/978-3-031-25158-0_19
DOI: https://doi.org/10.1007/978-3-031-25158-0_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-25157-3
Online ISBN: 978-3-031-25158-0
eBook Packages: Computer Science (R0)