Evaluating Three Approaches to Extracting Fault Data from Software Change Repositories

Hall, Tracy; Bowes, David; Liebchen, Gernot; Wernick, Paul

doi:10.1007/978-3-642-13792-1_10


Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 6156)

Included in the following conference series:

International Conference on Product Focused Software Process Improvement


Abstract

Software products can only be improved if we have a good understanding of the faults they typically contain. Code faults are a significant source of software product problems, which we currently do not understand sufficiently. Open source change repositories are potentially a rich and valuable source of fault data for both researchers and practitioners. Such fault data can be used to better understand current product problems so that we can predict and address future ones. However, extracting fault data from change repositories is difficult. In this paper we compare the performance of three approaches to extracting fault data from the change repository of the Barcode Open Source System. Our main finding is that we have most confidence in our manual evaluation of diffs to identify fault-fixing changes. We had less confidence in the ability of the two automatic approaches to separate fault-fixing from non-fault-fixing changes. We conclude that it is very difficult to reliably extract fault-fixing data from change repositories, especially using automatic tools, and that we need to be cautious when reporting or using such data.



Author information

Authors and Affiliations

Department of Information Systems & Computing, Brunel University, Uxbridge, Middlesex, UK
Tracy Hall & Gernot Liebchen
School of Computer Science, University of Hertfordshire, Hatfield, Hertfordshire, UK
David Bowes & Paul Wernick


Editor information

Editors and Affiliations

Software Development Group, IT University of Copenhagen, Rued Langgaards Vej 7, 2300, Copenhagen, Denmark
M. Ali Babar
VTT Technical Research Centre of Finland, Kaitoväylä 1, 90570, Oulu, Finland
Matias Vierimaa
Department of Information Processing Science, University of Oulu, P.O. Box 3000, 90014, Oulu, Finland
Markku Oivo


About this paper

Cite this paper

Hall, T., Bowes, D., Liebchen, G., Wernick, P. (2010). Evaluating Three Approaches to Extracting Fault Data from Software Change Repositories. In: Ali Babar, M., Vierimaa, M., Oivo, M. (eds) Product-Focused Software Process Improvement. PROFES 2010. Lecture Notes in Computer Science, vol 6156. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13792-1_10

DOI: https://doi.org/10.1007/978-3-642-13792-1_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-13791-4
Online ISBN: 978-3-642-13792-1
eBook Packages: Computer Science (R0)
