Abstract
To filter out noisy and redundant genes, this paper presents a two-step gene feature selection algorithm based on the permutation test. Thanks to the permutation test technique, the proposed algorithm selects genes efficiently and processes large datasets quickly. Twelve datasets from the RSCTC 2010 Discovery Challenge and two well-known classifiers, SVM and PAM, are used to evaluate its performance. The experimental results show that the proposed algorithm efficiently selects a small gene subset with high discriminative power and low redundancy.
Part of this work is supported by the National Natural Science Foundation of China (No. 61073146) and the Cooperation Project between China and Poland in Science and Technology ([2010]179).
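The abstract names a two-step, permutation-test-based selection but does not spell out the steps, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes step one keeps genes whose class-mean difference is significant under a label-permutation test, and step two greedily drops genes that are highly correlated with an already selected gene. The function names, the thresholds `alpha` and `max_corr`, and the use of Pearson correlation as the redundancy measure are all hypothetical choices made for illustration.

```python
import random
from statistics import mean


def permutation_p_value(values, labels, n_perm=1000, seed=0):
    """Permutation p-value for the absolute difference in class means.

    Assumes binary labels (0/1) with at least one sample per class.
    """
    rng = random.Random(seed)

    def mean_diff(lbls):
        a = [v for v, l in zip(values, lbls) if l == 0]
        b = [v for v, l in zip(values, lbls) if l == 1]
        return abs(mean(a) - mean(b))

    observed = mean_diff(labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any gene/label association
        if mean_diff(shuffled) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / (vx * vy) ** 0.5


def two_step_select(genes, labels, alpha=0.05, max_corr=0.9, n_perm=1000, seed=0):
    """genes: dict mapping gene name -> list of expression values."""
    # Step 1 (assumed): keep genes that pass the permutation test.
    pvals = {name: permutation_p_value(vals, labels, n_perm, seed)
             for name, vals in genes.items()}
    relevant = sorted((n for n in genes if pvals[n] <= alpha),
                      key=lambda n: pvals[n])
    # Step 2 (assumed): greedily skip genes that are redundant
    # with respect to the genes already selected.
    selected = []
    for name in relevant:
        if all(abs(pearson(genes[name], genes[k])) < max_corr for k in selected):
            selected.append(name)
    return selected


if __name__ == "__main__":
    labels = [0] * 6 + [1] * 6
    base = [1.0, 1.1, 0.9, 1.2, 1.0, 0.8, 5.0, 5.1, 4.9, 5.2, 5.0, 4.8]
    genes = {
        "g1": base,                   # discriminative
        "g2": [2 * v for v in base],  # discriminative but redundant with g1
        "g3": [1.0, 5.0] * 6,         # no class signal
    }
    print(two_step_select(genes, labels))
```

Separating the relevance filter from the redundancy filter means the more expensive pairwise comparison only runs on genes that survive the first step, which is one plausible reason a two-step design would scale to large microarray datasets.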
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Luo, C., Wang, G., Hu, F. (2012). Two-Step Gene Feature Selection Algorithm Based on Permutation Test. In: Yao, J., et al. Rough Sets and Current Trends in Computing. RSCTC 2012. Lecture Notes in Computer Science, vol 7413. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-32115-3_30
DOI: https://doi.org/10.1007/978-3-642-32115-3_30
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-32114-6
Online ISBN: 978-3-642-32115-3
eBook Packages: Computer Science (R0)