A hybrid approach to software fault prediction using genetic programming and ensemble learning methods

Sahu, Satya Prakash; Reddy, B. Ramachandra; Mukherjee, Dev; Shyamla, D. M.; Verma, Bhim Singh

doi:10.1007/s13198-021-01532-x

Original article
Published: 04 January 2022

Volume 13, pages 1746–1760 (2022)
International Journal of System Assurance Engineering and Management

Satya Prakash Sahu¹,
B. Ramachandra Reddy¹,² (ORCID: orcid.org/0000-0002-2595-8325),
Dev Mukherjee¹,
D. M. Shyamla¹ &
…
Bhim Singh Verma¹

Abstract

Software fault prediction techniques use software metrics and fault data from previous releases to predict fault-prone modules in the next release of the software. In this article, we review the literature that uses machine-learning techniques to find defects, faults, ambiguous code, inappropriate branching, and prospective runtime errors in order to establish a level of software quality. We also propose a hybrid technique for software fault prediction based on genetic programming and ensemble learning. Multiple machine-learning techniques are available for predicting the occurrence of faults. Our experiments compare the performance of simple ensemble methods, simple genetic-programming-based classification, and the hybrid approach. We find that machine-learning techniques have different learning abilities that software professionals and researchers can exploit for software fault prediction. We also find that the proposed approach outperforms the simple statistical and ensemble techniques used in the automated fault prediction system. However, more studies should be performed on lesser-used machine-learning techniques.
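As a rough illustration of the "simple ensemble" baseline mentioned in the abstract, the sketch below bags decision stumps over module metrics and predicts fault-proneness by majority vote. It is not the authors' method: the toy data, the two metrics (lines of code and cyclomatic complexity), the stump base learner, and the names `MODULES`, `fit_stump`, `fit_bagging`, and `predict` are all assumptions made for this example. In a hybrid along the lines the abstract describes, the stump could presumably be replaced by a classifier evolved with genetic programming.

```python
import random

# Hypothetical toy data: (lines_of_code, cyclomatic_complexity) per module,
# label 1 = faulty, 0 = clean. All values are invented for illustration.
MODULES = [
    ((120, 4), 0), ((80, 2), 0), ((200, 6), 0), ((150, 5), 0), ((60, 3), 0),
    ((900, 25), 1), ((750, 18), 1), ((1100, 30), 1), ((640, 15), 1), ((820, 22), 1),
]


def fit_stump(sample):
    """Return (feature_index, threshold) with the fewest errors on `sample`.

    Assumption: a module is predicted faulty when the chosen metric
    exceeds the threshold (higher metric values mean higher risk).
    """
    best = None
    n_features = len(sample[0][0])
    for j in range(n_features):
        # Candidate thresholds are the metric values seen in the sample.
        for threshold in sorted({x[j] for x, _ in sample}):
            errors = sum((x[j] > threshold) != bool(y) for x, y in sample)
            if best is None or errors < best[0]:
                best = (errors, j, threshold)
    return best[1], best[2]


def fit_bagging(data, n_estimators=25, seed=0):
    """Train one stump per bootstrap sample (sampling with replacement)."""
    rng = random.Random(seed)
    stumps = []
    for _ in range(n_estimators):
        sample = [rng.choice(data) for _ in data]
        stumps.append(fit_stump(sample))
    return stumps


def predict(stumps, x):
    """Majority vote over all stumps: 1 = fault-prone, 0 = clean."""
    votes = sum(int(x[j] > t) for j, t in stumps)
    return int(votes * 2 > len(stumps))


ensemble = fit_bagging(MODULES)
print(predict(ensemble, (1500, 40)))  # a large, complex module
print(predict(ensemble, (20, 1)))     # a small, simple module
```

Because every threshold is drawn from the training values, a module that exceeds all observed metric values is voted fault-prone by every stump, and one below all observed values is voted clean. Real fault data is rarely this cleanly separable, which is where stronger base learners and more careful evaluation matter.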

Funding

This work received no funding support from any agency.

Author information

Authors and Affiliations

Department of Information Technology, National Institute of Technology, Raipur, Raipur, Chhattisgarh, India
Satya Prakash Sahu, B. Ramachandra Reddy, Dev Mukherjee, D. M. Shyamla & Bhim Singh Verma
Department of Computer Science and Engineering, SRM University AP, Amaravati, Andhra Pradesh, India
B. Ramachandra Reddy

Corresponding authors

Correspondence to Satya Prakash Sahu or B. Ramachandra Reddy.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Sahu, S.P., Reddy, B.R., Mukherjee, D. et al. A hybrid approach to software fault prediction using genetic programming and ensemble learning methods. Int J Syst Assur Eng Manag 13, 1746–1760 (2022). https://doi.org/10.1007/s13198-021-01532-x

Received: 16 May 2019
Revised: 16 June 2021
Accepted: 18 November 2021
Published: 04 January 2022
Issue Date: August 2022
DOI: https://doi.org/10.1007/s13198-021-01532-x
