Analysis of Feature Ranking Techniques for Defect Prediction in Software Systems

  • Chapter
Quality, IT and Business Operations

Abstract

Software quality plays a crucial role in software development, and fault proneness is one of its most important attributes because it reflects the quality of the final product. Fault proneness prediction models are built to enhance software quality. Many software metrics can support such modeling, but using all of them is cumbersome and time-consuming, so there is a need to select the set of metrics that actually helps determine fault proneness. Careful selection of software metrics is a major concern, and it becomes crucial when the search space is large. This study focuses on ranking software metrics for building defect prediction models. A hybrid approach is applied in which feature ranking techniques are used together with feature subset selection methods to reduce the search space. Classification algorithms are used to train the defect prediction models, and the area under the receiver operating characteristic curve (AUC) is used to evaluate the classifiers. The experimental results indicate that most of the feature ranking techniques produce similar results and that automatic hybrid search outperforms all other feature subset selection methods. Furthermore, the results allow us to focus on a reduced set of metrics that has almost the same impact on the end result as the original set of metrics.
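The hybrid idea in the abstract can be illustrated with a minimal sketch. The abstract does not name the ranking techniques, search methods, classifiers, or dataset used in the chapter, so everything below is an assumption made for illustration: the filter score (distance of a single metric's AUC from 0.5), the exhaustive search over the top-ranked metrics, the mean-difference scorer standing in for a classifier, and all function names (`auc`, `rank_features`, `centroid_scores`, `hybrid_select`) are hypothetical. For brevity the AUC is computed on the same rows used to fit the scorer, which a real study would replace with cross-validation.

```python
from itertools import combinations


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of faulty vs. clean scores."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def rank_features(X, y):
    """Rank metric indices by a stand-in filter score: |univariate AUC - 0.5|."""
    n_features = len(X[0])
    strength = {
        j: abs(auc([row[j] for row in X], y) - 0.5) for j in range(n_features)
    }
    return sorted(strength, key=strength.get, reverse=True)


def centroid_scores(X, y, subset):
    """Score rows by projecting onto (faulty mean - clean mean) over the chosen metrics."""
    def mean(label, j):
        vals = [row[j] for row, lab in zip(X, y) if lab == label]
        return sum(vals) / len(vals)

    weights = {j: mean(1, j) - mean(0, j) for j in subset}
    return [sum(weights[j] * row[j] for j in subset) for row in X]


def hybrid_select(X, y, top_k):
    """Keep the top_k ranked metrics, then search all their subsets for the best AUC."""
    ranked = rank_features(X, y)[:top_k]
    best_subset, best_auc = None, -1.0
    for size in range(1, len(ranked) + 1):
        for subset in combinations(ranked, size):
            score = auc(centroid_scores(X, y, subset), y)
            if score > best_auc:  # strict: smaller subsets win ties
                best_subset, best_auc = subset, score
    return ranked, best_subset, best_auc


# Toy data (invented): rows are modules, columns are three metrics, y marks faulty modules.
X = [[10, 5, 3], [12, 1, 4], [11, 4, 2], [3, 5, 1], [2, 2, 3], [4, 3, 0]]
y = [1, 1, 1, 0, 0, 0]
ranked, subset, score = hybrid_select(X, y, top_k=2)
print(ranked, subset, score)
```

Ranking first keeps the subset search tractable: with `top_k` metrics retained, the search evaluates at most 2^top_k - 1 subsets instead of every subset of the full metric set.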

Author information

Corresponding author

Correspondence to Sangeeta Sabharwal.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Sabharwal, S., Nagpal, S., Malhotra, N., Singh, P., Seth, K. (2018). Analysis of Feature Ranking Techniques for Defect Prediction in Software Systems. In: Kapur, P., Kumar, U., Verma, A. (eds) Quality, IT and Business Operations. Springer Proceedings in Business and Economics. Springer, Singapore. https://doi.org/10.1007/978-981-10-5577-5_4
