Knowledge Discovery Using Rough Set Theory

Caballero, Yaile; Bello, Rafael; Arco, Leticia; García, Maria; Ramentol, Enislay

doi:10.1007/978-3-642-05177-7_18


Part of the book series: Studies in Computational Intelligence (SCI, volume 262)


Abstract

Rough Set Theory (RST) opened a new direction in the development of theories of incomplete information and is a powerful data analysis tool. This study demonstrates that the theory can be used to generate a priori knowledge about a dataset. It proposes a prior characterization of training sets based on RST estimation measures. This characterization assesses the quality of the data before they are used as a training set for machine learning techniques. The proposal was studied experimentally on international databases with well-known classifiers such as MLP, C4.5 and k-NN, and satisfactory results were obtained.
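
To make the idea of characterizing a training set with RST measures concrete, the following is a minimal Python sketch. It uses only the classical Pawlak definitions (indiscernibility classes, lower and upper approximations, and the quality of approximation of a classification); the specific measures proposed in the chapter are not shown in this preview, so this is not the authors' method. The function names and the toy decision table `TOY` are hypothetical, chosen only for illustration.

```python
from collections import defaultdict


def equivalence_classes(rows, attrs):
    """Group object indices that are indiscernible on the given attributes."""
    blocks = defaultdict(set)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].add(i)
    return list(blocks.values())


def approximations(blocks, target):
    """Return the (lower, upper) approximations of the object set `target`."""
    lower, upper = set(), set()
    for block in blocks:
        if block <= target:      # block lies entirely inside the target set
            lower |= block
        if block & target:       # block overlaps the target set
            upper |= block
    return lower, upper


def classification_quality(rows, attrs, decision):
    """Fraction of objects in the lower approximation of some decision class."""
    blocks = equivalence_classes(rows, attrs)
    classes = defaultdict(set)
    for i, row in enumerate(rows):
        classes[row[decision]].add(i)
    positive = set()
    for target in classes.values():
        lower, _ = approximations(blocks, target)
        positive |= lower
    return len(positive) / len(rows)


# Hypothetical toy decision table: objects 0 and 1 share the same
# condition values but carry different decisions, so they are inconsistent.
TOY = [
    {"temp": "high", "cough": "yes", "flu": "yes"},
    {"temp": "high", "cough": "yes", "flu": "no"},
    {"temp": "high", "cough": "no", "flu": "yes"},
    {"temp": "low", "cough": "no", "flu": "no"},
    {"temp": "low", "cough": "yes", "flu": "no"},
]

if __name__ == "__main__":
    print(classification_quality(TOY, ["temp", "cough"], "flu"))
```

In this toy table three of the five objects can be classified with certainty from the condition attributes, so the quality value falls below 1. A low value of this kind is one plausible signal that a dataset contains inconsistencies that may limit any classifier trained on it.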



Author information

Authors and Affiliations

Department of Computer Science, University of Camagüey, Camagüey, Cuba
Yaile Caballero, Leticia Arco & Enislay Ramentol
Department of Computer Science, Universidad Central de Las Villas, Santa Clara, Cuba
Rafael Bello & Maria García


Editor information

Editors and Affiliations

Institute of Computer Science, Polish Academy of Sciences, ul.Ordona 21, 01-237, Warsaw, Poland
Jacek Koronacki & Sławomir T. Wierzchoń
Woodward Hall 430C University of North Carolina, 9201 University City Blvd., N.C. 28223, Charlotte, USA
Zbigniew W. Raś
Systems Research Institute, Polish Academy of Sciences, ul.Newelska 6, 01-447, Warsaw, Poland
Janusz Kacprzyk


About this chapter

Cite this chapter

Caballero, Y., Bello, R., Arco, L., García, M., Ramentol, E. (2010). Knowledge Discovery Using Rough Set Theory. In: Koronacki, J., Raś, Z.W., Wierzchoń, S.T., Kacprzyk, J. (eds) Advances in Machine Learning I. Studies in Computational Intelligence, vol 262. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-05177-7_18


DOI: https://doi.org/10.1007/978-3-642-05177-7_18
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-05176-0
Online ISBN: 978-3-642-05177-7
eBook Packages: Engineering, Engineering (R0)
