Abstract
We extend the standard rough set-based approach to handle data with a very large number of numeric attributes and only a small number of available objects. We transform the training data using a novel non-parametric discretization method called roughfication (in contrast to fuzzification known from fuzzy logic). Given the roughfied data, we apply standard rough set attribute reduction and then classify the testing data by voting among the obtained decision rules. Roughfication makes it possible to search for reducts and rules in tables with the original number of attributes and a far larger number of objects. It requires no expert knowledge, parameter tuning, or learning. We illustrate it with an analysis of gene expression data, where the number of genes (attributes) is enormously large relative to the number of experiments (objects).
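The abstract does not spell out how roughfication is constructed, so the following is only a minimal sketch under a stated assumption: each ordered pair of training objects (x, y) becomes one new row, each numeric value of y is replaced by a symbol saying whether it is at least the value of x on the same attribute, and the new row inherits the decision of y. Under that assumption the attribute count stays the same while n objects grow to n(n-1) rows, which matches the property described above. The function name `roughfy` and the symbol format are hypothetical and chosen for illustration.

```python
from itertools import permutations


def roughfy(rows, decisions):
    """Pairwise roughfication sketch (assumed scheme, not the paper's definition).

    rows: list of equal-length lists of numbers, one list per object.
    decisions: list of decision labels, one per object.
    Returns (new_rows, new_decisions) with len(rows) * (len(rows) - 1) entries.
    """
    new_rows, new_decisions = [], []
    # Every ordered pair (i, j) with i != j yields one symbolic row.
    for i, j in permutations(range(len(rows)), 2):
        x, y = rows[i], rows[j]
        # Each symbol records the comparison of y against reference object i.
        symbols = tuple(
            (">=x%d" % i) if y_val >= x_val else ("<x%d" % i)
            for x_val, y_val in zip(x, y)
        )
        new_rows.append(symbols)
        # Assumption: the new row takes the decision of the compared object y.
        new_decisions.append(decisions[j])
    return new_rows, new_decisions


# Toy table: 3 objects, 2 numeric attributes (made-up values).
table = [[1.0, 5.0], [2.0, 3.0], [0.5, 4.0]]
labels = ["A", "B", "A"]
rough_rows, rough_labels = roughfy(table, labels)
```

The resulting symbolic table can then be passed to any attribute reduction and rule voting procedure that works on nominal data; no cut points or other parameters are chosen along the way.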
Copyright information
© 2007 Springer Berlin Heidelberg
About this paper
Cite this paper
Ślȩzak, D., Wróblewski, J. (2007). Roughfication of Numeric Decision Tables: The Case Study of Gene Expression Data. In: Yao, J., Lingras, P., Wu, WZ., Szczuka, M., Cercone, N.J., Ślȩzak, D. (eds) Rough Sets and Knowledge Technology. RSKT 2007. Lecture Notes in Computer Science, vol 4481. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72458-2_39
DOI: https://doi.org/10.1007/978-3-540-72458-2_39
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72457-5
Online ISBN: 978-3-540-72458-2
eBook Packages: Computer Science, Computer Science (R0)