
Tackling class overlap and imbalance problems in software defect prediction

Published in: Software Quality Journal

Abstract

Software defect prediction (SDP) is a promising way to save time and cost in the software testing phase and thereby improve software quality. Numerous machine learning approaches have proven effective in SDP. However, the imbalanced class distribution of SDP datasets is a problem for many conventional learning methods, and class overlap makes it harder for predictors to learn the defective class accurately. In this study, we propose a new SDP model that combines class overlap reduction with ensemble imbalance learning to improve defect prediction. First, the neighbor cleaning method is applied to remove the overlapping non-defective samples. The cleaned dataset is then randomly under-sampled several times to generate balanced subsets, on each of which a classifier is trained. Finally, these individual classifiers are combined through the AdaBoost mechanism to build the final prediction model. In the experiments, we investigated nine highly imbalanced datasets selected from a public software repository and confirmed that a high rate of class overlap exists in SDP data. We assessed the performance of the proposed model by comparing it with other state-of-the-art methods, including conventional SDP models, imbalance learning, and data cleaning methods. Test results and statistical analysis show that the proposed model yields more reasonable defect prediction results and performs best in terms of G-mean and AUC among all tested models.
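The pipeline outlined in the abstract — overlap reduction, repeated random under-sampling, AdaBoost-style combination, and G-mean/AUC evaluation — can be approximated as follows. This is a minimal sketch, not the authors' implementation: `ncl_clean` is a simplified ENN-style neighbor filter standing in for the full neighbor cleaning rule, `rus_adaboost` is a RUSBoost-style boosting loop approximating the proposed ensemble, and the synthetic dataset stands in for a real defect dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import NearestNeighbors
from sklearn.tree import DecisionTreeClassifier

def ncl_clean(X, y, k=3, majority=0):
    """Remove majority-class (non-defective) samples in the overlap
    region: those whose k nearest neighbors are mostly minority."""
    _, idx = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
    keep = np.ones(len(y), dtype=bool)
    for i in np.flatnonzero(y == majority):
        neighbors = y[idx[i, 1:]]              # skip the sample itself
        keep[i] = np.mean(neighbors != majority) <= 0.5
    return X[keep], y[keep]

def rus_adaboost(X, y, rounds=10, minority=1, seed=0):
    """AdaBoost-style loop: each round trains a weak learner on a
    randomly under-sampled balanced subset, then reweights samples."""
    rng = np.random.default_rng(seed)
    w = np.full(len(y), 1.0 / len(y))
    min_idx = np.flatnonzero(y == minority)
    maj_idx = np.flatnonzero(y != minority)
    models, alphas = [], []
    for _ in range(rounds):
        sub = np.concatenate(
            [min_idx, rng.choice(maj_idx, size=len(min_idx), replace=False)])
        clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X[sub], y[sub])
        miss = clf.predict(X) != y
        err = np.clip(w[miss].sum(), 1e-10, 1 - 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        models.append(clf)
        alphas.append(alpha)
        if err >= 0.5:                         # weak learner no better than chance
            break
        w *= np.exp(np.where(miss, alpha, -alpha))
        w /= w.sum()

    def predict(Xt):
        votes = sum(a * np.where(m.predict(Xt) == minority, 1.0, -1.0)
                    for m, a in zip(models, alphas))
        return np.where(votes > 0, minority, 1 - minority)

    return predict

# Synthetic imbalanced data standing in for a defect dataset (hypothetical).
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1],
                           n_informative=4, random_state=42)
Xtr, Xte, ytr, yte = train_test_split(X, y, stratify=y, random_state=42)
Xc, yc = ncl_clean(Xtr, ytr)              # step 1: overlap reduction
predict = rus_adaboost(Xc, yc)            # steps 2-3: RUS subsets + boosting
yp = predict(Xte)
tn, fp, fn, tp = confusion_matrix(yte, yp).ravel()
g_mean = np.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))
auc = roc_auc_score(yte, yp)
print(f"G-mean={g_mean:.3f}  AUC={auc:.3f}")
```

G-mean, the geometric mean of defective-class recall and non-defective-class recall, is the natural headline metric here: unlike accuracy, it cannot be inflated by simply predicting the majority class.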



Acknowledgments

This work was supported by the National Key Basic Research Program of China (973 Program, 2013CB329103 of 2013CB329100), the National Natural Science Foundation of China (Nos. 61672120, 61472053, 91118005), the Doctoral Program of Higher Education (20120191110027), and the Natural Science Foundation of Chongqing (Nos. CSTC2010BB2217, cstc2012jjA40017).

Author information

Correspondence to Lin Chen.

About this article

Cite this article

Chen, L., Fang, B., Shang, Z. et al. Tackling class overlap and imbalance problems in software defect prediction. Software Qual J 26, 97–125 (2018). https://doi.org/10.1007/s11219-016-9342-6

