Learning a graph-based classifier for fault localization

Zhong, Hao; Mei, Hong

doi:10.1007/s11432-019-2720-1

Research Paper
Published: 09 May 2020

Volume 63, article number 162101, (2020)
Science China Information Sciences

Hao Zhong¹ &
Hong Mei¹

Abstract

Since software first emerged, locating software faults has been intensively researched, culminating in various approaches and tools that have been applied in real development. Despite the success of these developments, programmers still demand better tools, and some are reluctant to use any tool when locating faults. The state of the art can naturally be improved by learning how programmers locate faults. The rapid development of open-source software has accumulated many bug fixes. A bug fix is a specific type of commit containing a set of buggy files and their corresponding fixed files, which reveal how programmers repair bugs. In principle, an automatic model can learn fault locations from bug fixes, but prior attempts to realize this vision have been hindered by various technical challenges. For example, most bug fixes are not compilable after being checked out, which prevents most advanced static and dynamic tools from analyzing them. This paper proposes an approach called ClaFa that trains a graph-based fault classifier from bug fixes. ClaFa is built on a recent partial-code tool called Grapa, which enables the complete-code tool WALA to analyze partial programs. Once Grapa has built a program dependency graph from a bug fix, ClaFa compares the graph of the buggy code with the graph of the fixed code, locates the buggy nodes, and extracts various graph features of the buggy and clean nodes. Based on the extracted features, ClaFa trains a classifier that combines AdaBoost and decision tree learning. The trained classifier can predict whether a node of a program dependency graph is buggy or clean. We evaluate ClaFa on thousands of buggy files collected from four open-source projects: Aries, Mahout, Derby, and Cassandra. The f-scores that ClaFa achieves are approximately 80% on all projects.
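
The labeling and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. It is not ClaFa's implementation. It assumes that graph nodes are identified by plain string labels, that a node of the buggy graph counts as buggy when no node with the same label exists in the fixed graph, and that in-degree and out-degree stand in for the graph features, which the abstract does not enumerate. The function names `label_nodes`, `degree_features`, and `f_score` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]  # (source node, target node); hypothetical representation


def label_nodes(buggy_nodes: Set[str], fixed_nodes: Set[str]) -> Dict[str, bool]:
    """Label each node of the buggy graph: True (buggy) if it has no
    same-named counterpart in the fixed graph, False (clean) otherwise.
    Matching by identical label is an assumption made for this sketch."""
    return {node: node not in fixed_nodes for node in buggy_nodes}


def degree_features(nodes: Set[str], edges: List[Edge]) -> Dict[str, Tuple[int, int]]:
    """Return (in-degree, out-degree) per node as two stand-in graph features."""
    counts = {node: [0, 0] for node in nodes}
    for src, dst in edges:
        if src in counts:
            counts[src][1] += 1  # outgoing edge
        if dst in counts:
            counts[dst][0] += 1  # incoming edge
    return {node: (c[0], c[1]) for node, c in counts.items()}


def f_score(predicted: Dict[str, bool], actual: Dict[str, bool]) -> float:
    """Harmonic mean of precision and recall for the 'buggy' class."""
    tp = sum(1 for n in actual if predicted.get(n, False) and actual[n])
    fp = sum(1 for n in actual if predicted.get(n, False) and not actual[n])
    fn = sum(1 for n in actual if not predicted.get(n, False) and actual[n])
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy example: node "c" disappears in the fixed version, so it is labeled buggy.
buggy_graph_nodes = {"a", "b", "c", "d"}
fixed_graph_nodes = {"a", "b", "d", "e"}
buggy_graph_edges = [("a", "b"), ("b", "c"), ("c", "d")]

labels = label_nodes(buggy_graph_nodes, fixed_graph_nodes)
features = degree_features(buggy_graph_nodes, buggy_graph_edges)

# A made-up prediction that flags both "b" and "c" as buggy.
prediction = {"b": True, "c": True}
score = f_score(prediction, labels)
```

In a full pipeline, the per-node features and labels would form the training set for a boosted decision-tree classifier. That training step is omitted here because the abstract does not specify the feature set or the learner's configuration.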

Acknowledgements

This work was sponsored by the National Key R&D Program of China (Grant No. 2018YFC0830500), the National Natural Science Foundation of China (Grant No. 61572313), and the Science and Technology Commission of Shanghai Municipality (Grant No. 15DZ1100305). We thank the anonymous reviewers for their constructive comments.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
Hao Zhong & Hong Mei

Corresponding author

Correspondence to Hao Zhong.

About this article

Cite this article

Zhong, H., Mei, H. Learning a graph-based classifier for fault localization. Sci. China Inf. Sci. 63, 162101 (2020). https://doi.org/10.1007/s11432-019-2720-1

Received: 21 July 2019
Revised: 22 September 2019
Accepted: 21 November 2019
Published: 09 May 2020
DOI: https://doi.org/10.1007/s11432-019-2720-1
