Towards Benchmarking Feature Subset Selection Methods for Software Fault Prediction

Computational Intelligence and Quantitative Software Engineering

Part of the book series: Studies in Computational Intelligence (SCI, volume 617)

Abstract

Despite the general acceptance that software engineering datasets often contain noisy, irrelevant, or redundant variables, very few benchmark studies of feature subset selection (FSS) methods on real-life data from software projects have been conducted. This paper provides an empirical comparison of state-of-the-art FSS methods: information gain attribute ranking (IG); Relief (RLF); principal component analysis (PCA); correlation-based feature selection (CFS); consistency-based subset evaluation (CNS); wrapper subset evaluation (WRP); and an evolutionary computation method, genetic programming (GP), on five fault prediction datasets from the PROMISE data repository. For each FSS method-dataset combination, the area under the receiver operating characteristic curve (AUC), averaged over 10-fold cross-validation runs, was calculated before and after FSS. Two diverse learning algorithms, C4.5 and naïve Bayes (NB), were used to test the attribute sets given by each FSS method. Although the differences in AUC between the FSS methods are not statistically significant for either C4.5 or NB, the results show that FSS is generally beneficial, as it helps improve the classification accuracy of both learners. No single FSS method is best for all datasets, but a smaller group of methods (IG, RLF, and GP) consistently selects fewer attributes without degrading classification accuracy within statistically significant boundaries.
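
The evaluation procedure the abstract describes (apply an FSS method, then measure a learner's AUC over 10-fold cross-validation before and after selection) can be sketched as follows. This is a hedged illustration, not the authors' setup: it substitutes scikit-learn's `mutual_info_classif` for WEKA's information gain ranking, `GaussianNB` for NB, and a synthetic dataset for the PROMISE data.

```python
# Illustrative sketch only: scikit-learn stand-ins for the chapter's WEKA
# setup (mutual information approximates IG ranking; the data is synthetic,
# not a PROMISE fault prediction dataset).
from functools import partial

from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline

# Binary "faulty / fault-free" stand-in data: 5 informative attributes
# hidden among redundant and noisy ones.
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           n_redundant=10, random_state=0)

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)

# AUC of naive Bayes on the full attribute set, averaged over 10 folds.
auc_all = cross_val_score(GaussianNB(), X, y, cv=cv, scoring="roc_auc").mean()

# AUC after FSS: keep the 5 attributes ranked highest by mutual information.
score_fn = partial(mutual_info_classif, random_state=0)
fss_nb = make_pipeline(SelectKBest(score_fn, k=5), GaussianNB())
auc_fss = cross_val_score(fss_nb, X, y, cv=cv, scoring="roc_auc").mean()

print(f"AUC, all 20 attributes: {auc_all:.3f}")
print(f"AUC, top-5 attributes:  {auc_fss:.3f}")
```

In the chapter, the analogous comparison is run for seven FSS methods and two learners on each PROMISE dataset; selection occurring inside each training fold (as a pipeline step here) is what keeps the AUC estimate honest.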


Notes

  1. The requirement that the number of training data points be an exponential function of the feature dimension.

  2. Section 4 provides more details about AUC.
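
As a concrete illustration of the AUC measure referred to here: the statistic equals the probability that a randomly chosen faulty module receives a higher predicted score than a randomly chosen fault-free one. The labels and scores below are invented for illustration, not taken from the chapter's datasets.

```python
# Hedged illustration of AUC with made-up labels and classifier scores.
from sklearn.metrics import roc_auc_score

y_true = [0, 0, 1, 1, 0, 1]                 # 1 = faulty module, 0 = fault-free
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.7]   # classifier's predicted scores

# 8 of the 9 (faulty, fault-free) pairs are ranked correctly, so AUC = 8/9.
auc = roc_auc_score(y_true, y_score)
print(round(auc, 3))  # 0.889
```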

Author information

Correspondence to Wasif Afzal.


Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Afzal, W., Torkar, R. (2016). Towards Benchmarking Feature Subset Selection Methods for Software Fault Prediction. In: Pedrycz, W., Succi, G., Sillitti, A. (eds) Computational Intelligence and Quantitative Software Engineering. Studies in Computational Intelligence, vol 617. Springer, Cham. https://doi.org/10.1007/978-3-319-25964-2_3

  • DOI: https://doi.org/10.1007/978-3-319-25964-2_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-25962-8

  • Online ISBN: 978-3-319-25964-2

  • eBook Packages: Engineering, Engineering (R0)
