
What We Talk About When We Talk About Software Test Flakiness

  • Conference paper
  • In: Quality of Information and Communications Technology (QUATIC 2021)

Abstract

Software test flakiness is drawing increasing interest among both academic researchers and practitioners. In this work we report our findings from a scoping review of the white and grey literature, highlighting variations across key concepts related to flaky tests. Our study clearly indicates the need for a unifying definition, as well as for a more comprehensive analysis aimed at establishing a conceptual map that can better guide future research.

Work supported by the Italian MIUR PRIN 2017 Project: SISMA (Contract 201752ENYB), and partially by the Italian Research Group: INdAM-GNCS.


Notes

  1. A useful comparison between systematic reviews, scoping reviews and other review types is available from https://guides.temple.edu/systematicreviews.


Author information

Correspondence to Morena Barboni.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Barboni, M., Bertolino, A., De Angelis, G. (2021). What We Talk About When We Talk About Software Test Flakiness. In: Paiva, A.C.R., Cavalli, A.R., Ventura Martins, P., Pérez-Castillo, R. (eds) Quality of Information and Communications Technology. QUATIC 2021. Communications in Computer and Information Science, vol 1439. Springer, Cham. https://doi.org/10.1007/978-3-030-85347-1_3


  • DOI: https://doi.org/10.1007/978-3-030-85347-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85346-4

  • Online ISBN: 978-3-030-85347-1

  • eBook Packages: Computer Science, Computer Science (R0)
