Abstract
Accurate classification of neuromuscular disorders is important for providing patients with proper treatment. Microarray technology has recently been used to monitor the activity, or expression level, of a large number of genes simultaneously. Gene expression data derived from microarray experiments usually involve a large number of genes but very few samples. The dimensionality of such data therefore needs to be reduced by finding a small set of discriminative genes that accurately classifies samples of the various disease types. Our goal is to find a small subset of genes that ensures accurate classification of neuromuscular disorders. In this paper, we propose a novel hybrid feature selection model for the classification of neuromuscular disorders. Feature selection is carried out in two phases by integrating the Bhattacharyya coefficient and a genetic algorithm (GA). In the first phase, the Bhattacharyya coefficient is used to choose a candidate gene subset by removing the most redundant genes. In the second phase, the target gene subset is created by applying the GA to select the most discriminative genes, with the fitness function calculated using a radial basis function support vector machine (RBF SVM). The proposed hybrid algorithm is applied to two publicly available microarray datasets of neuromuscular disorders. The results are compared with two individual feature selection techniques, namely the Bhattacharyya coefficient and GA, and with one integrated technique, Bhattacharyya-GA, in which the GA fitness function is calculated using four other classifiers. The comparison shows that the proposed integrated method gives better classification accuracy.
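The two-phase idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, because the abstract does not specify how the coefficient is computed, which GA operators are used, or how the SVM is tuned. The sketch rests on these assumptions: the Bhattacharyya coefficient is taken between per-class histograms of each gene in a two-class problem, the genes with the lowest class overlap are kept, the GA uses truncation selection, one-point crossover, bit-flip mutation and elitism, and a leave-one-out nearest-centroid accuracy stands in for the RBF SVM fitness so that the example needs only the standard library. All function names and the toy data are hypothetical.

```python
import math
import random

def bhattacharyya_coefficient(a, b, bins=5):
    """BC = sum_i sqrt(p_i * q_i) over histograms built on a shared range."""
    lo, hi = min(a + b), max(a + b)
    width = (hi - lo) / bins or 1.0  # guard against a constant gene

    def hist(values):
        counts = [0] * bins
        for v in values:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        return [c / len(values) for c in counts]

    return sum(math.sqrt(p * q) for p, q in zip(hist(a), hist(b)))

def filter_genes(X, y, keep):
    """Phase 1 (assumed form): keep the `keep` genes with the lowest class overlap."""
    scores = []
    for g in range(len(X[0])):
        a = [row[g] for row, label in zip(X, y) if label == 0]
        b = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append((bhattacharyya_coefficient(a, b), g))
    return [g for _, g in sorted(scores)[:keep]]

def loo_centroid_accuracy(X, y, genes):
    """Stand-in fitness (NOT an RBF SVM): leave-one-out nearest-centroid accuracy."""
    if not genes:
        return 0.0
    correct = 0
    for i in range(len(X)):
        dists = {}
        for label in set(y):
            rows = [X[j] for j in range(len(X)) if j != i and y[j] == label]
            centroid = [sum(r[g] for r in rows) / len(rows) for g in genes]
            dists[label] = sum((X[i][g] - c) ** 2 for g, c in zip(genes, centroid))
        correct += min(dists, key=dists.get) == y[i]
    return correct / len(X)

def genetic_search(candidates, fitness, pop_size=20, generations=30, p_mut=0.1, seed=0):
    """Phase 2 (assumed operators): GA over bit masks of the candidate genes."""
    rng = random.Random(seed)
    n = len(candidates)  # one-point crossover below needs n >= 2

    def decode(mask):
        return [g for g, bit in zip(candidates, mask) if bit]

    def score(mask):
        return fitness(decode(mask))

    pop = [[rng.randint(0, 1) for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]      # truncation selection
        children = [pop[0][:]]              # elitism: carry the best mask over
        while len(children) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            cut = rng.randrange(1, n)       # one-point crossover
            child = p1[:cut] + p2[cut:]
            children.append([bit ^ 1 if rng.random() < p_mut else bit for bit in child])
        pop = children
    return decode(max(pop, key=score))

# Toy data: 8 samples x 6 genes; only genes 0 and 1 separate the two classes.
X = [
    [1.0, 2.0, 3, 7, 2, 9], [1.2, 2.1, 1, 5, 2, 8],
    [0.9, 1.9, 4, 6, 3, 9], [1.1, 2.2, 2, 8, 3, 8],
    [5.0, 6.0, 1, 8, 3, 8], [5.2, 6.1, 3, 6, 3, 9],
    [4.9, 5.9, 2, 5, 2, 8], [5.1, 6.2, 4, 7, 2, 9],
]
y = [0, 0, 0, 0, 1, 1, 1, 1]

candidates = filter_genes(X, y, keep=4)  # genes 0 and 1 rank first (BC = 0)
selected = genetic_search(candidates, lambda genes: loo_centroid_accuracy(X, y, genes))
print("candidates:", candidates)
print("selected:", selected)
```

To move closer to the setup named in the abstract, the `fitness` callable could instead return the cross-validated accuracy of an RBF-kernel SVM, for example scikit-learn's `SVC(kernel="rbf")` scored with `cross_val_score`. The sketch handles only two classes; a multiclass dataset would need a different per-gene overlap score.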
Cite this article
Anand, D., Pandey, B. & Pandey, D.K. A Novel Hybrid Feature Selection Model for Classification of Neuromuscular Dystrophies Using Bhattacharyya Coefficient, Genetic Algorithm and Radial Basis Function Based Support Vector Machine. Interdiscip Sci Comput Life Sci 10, 244–250 (2018). https://doi.org/10.1007/s12539-016-0183-6
DOI: https://doi.org/10.1007/s12539-016-0183-6