Straggler Management

Zawad, Syed; Yan, Feng; Anwar, Ali

doi:10.1007/978-3-030-96896-0_11

Abstract

In this chapter, we elaborate on one of the most common challenges in Federated Learning: stragglers. The chapters “Local Training and Scalability of Federated Learning Systems” and “Introduction to Federated Learning Systems” touched on the problem briefly; here we examine it in more depth. We first introduce the problem and explain why it is important. We then discuss a study that shows the effect of stragglers in a practical setting. As an example, we describe TiFL, a framework that proposes to address the problem by grouping clients. Finally, we present empirical results showing how such systems may help mitigate the effect of stragglers.
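To make the idea concrete, the following is a minimal sketch of why stragglers matter in a synchronous round and how grouping can help. It is not the TiFL implementation: the abstract only states that TiFL uses grouping, so the latency-based tiers, the uniform choice of tier, and all names here (`round_time`, `make_tiers`, `select_from_tier`, the latency values) are assumptions made for illustration.

```python
import random
from statistics import mean


def round_time(latencies):
    """A synchronous round finishes only when the slowest selected client reports."""
    return max(latencies)


def make_tiers(client_latency, num_tiers):
    """Sort clients by (hypothetical) profiled latency and split them into
    contiguous groups. The last group may be smaller than the others."""
    ordered = sorted(client_latency, key=client_latency.get)
    size = -(-len(ordered) // num_tiers)  # ceiling division
    return [ordered[i:i + size] for i in range(0, len(ordered), size)]


def select_random(client_latency, k, rng):
    """Baseline: pick k clients uniformly from the whole population."""
    return rng.sample(sorted(client_latency), k)


def select_from_tier(tiers, k, rng):
    """Grouped selection (assumed policy): pick one tier uniformly, then k clients in it."""
    tier = rng.choice(tiers)
    return rng.sample(tier, min(k, len(tier)))


# Hypothetical population: 15 fast clients (1 s) and 5 stragglers (10 s).
latency = {f"c{i}": 1.0 for i in range(15)}
latency.update({f"c{i}": 10.0 for i in range(15, 20)})

rng = random.Random(0)
tiers = make_tiers(latency, num_tiers=4)

baseline = [round_time([latency[c] for c in select_random(latency, 5, rng)])
            for _ in range(1000)]
grouped = [round_time([latency[c] for c in select_from_tier(tiers, 5, rng)])
           for _ in range(1000)]

print("mean round time, random selection:", mean(baseline))
print("mean round time, tiered selection:", mean(grouped))
```

In this toy setup, a randomly selected cohort is likely to contain at least one slow client, so most rounds run at straggler speed, whereas a cohort drawn from a single tier is slow only when the slow tier is chosen. A real system would also need to ensure that slow tiers are still sampled often enough to avoid biasing the model, which this sketch does not address.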

Author information

Authors and Affiliations

University of Nevada, Reno, Reno, NV, USA
Syed Zawad & Feng Yan
IBM Research – Almaden, San Jose, CA, USA
Ali Anwar

Corresponding author

Correspondence to Syed Zawad.

Editor information

Editors and Affiliations

IBM Research – Almaden, San Jose, CA, USA
Heiko Ludwig
IBM Research – Almaden, San Jose, CA, USA
Nathalie Baracaldo

About this chapter

Cite this chapter

Zawad, S., Yan, F., Anwar, A. (2022). Straggler Management. In: Ludwig, H., Baracaldo, N. (eds) Federated Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-96896-0_11

DOI: https://doi.org/10.1007/978-3-030-96896-0_11
Published: 08 February 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-96895-3
Online ISBN: 978-3-030-96896-0
eBook Packages: Computer Science, Computer Science (R0)
