
What We Talk About When We Talk About Software Test Flakiness

  • Conference paper
  • In: Quality of Information and Communications Technology (QUATIC 2021)

Abstract

Software test flakiness is drawing increasing interest among both academic researchers and practitioners. In this work we report our findings from a scoping review of the white and grey literature, highlighting variations across key concepts related to flaky tests. Our study clearly indicates the need for a unifying definition, as well as for a more comprehensive analysis aimed at establishing a conceptual map that can better guide future research.

Work supported by the Italian MIUR PRIN 2017 Project: SISMA (Contract 201752ENYB), and partially by the Italian Research Group: INdAM-GNCS.


Notes

  1. A useful comparison between systematic reviews, scoping reviews and other review types is available from https://guides.temple.edu/systematicreviews.


Author information

Correspondence to Morena Barboni.


Copyright information

© 2021 Springer Nature Switzerland AG

About this paper


Cite this paper

Barboni, M., Bertolino, A., De Angelis, G. (2021). What We Talk About When We Talk About Software Test Flakiness. In: Paiva, A.C.R., Cavalli, A.R., Ventura Martins, P., Pérez-Castillo, R. (eds) Quality of Information and Communications Technology. QUATIC 2021. Communications in Computer and Information Science, vol 1439. Springer, Cham. https://doi.org/10.1007/978-3-030-85347-1_3


  • DOI: https://doi.org/10.1007/978-3-030-85347-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85346-4

  • Online ISBN: 978-3-030-85347-1

  • eBook Packages: Computer Science, Computer Science (R0)
