Abstract
In this paper we present experimental results on the MLEM2 rule induction algorithm and the Multiple Scanning discretization algorithm. MLEM2 has its own mechanisms for handling missing attribute values and numerical data. We compare, in terms of error rate, two setups: MLEM2 inducing rules directly from incomplete and numerical data, and MLEM2 inducing rule sets from data sets first discretized by Multiple Scanning and then converted to incomplete data sets. In both setups certain and possible rule sets were induced. For certain rule sets, the former setup was more successful for two data sets and the latter for four data sets; for eight data sets the difference was not significant (Wilcoxon test, 5% significance level). Similarly, for possible rule sets the former setup was more successful for two data sets and the latter for three data sets. Thus we may conclude that there is no significant difference between the two setups and that MLEM2 may be used for rule induction directly from incomplete and numerical data.
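The per-data-set comparison in the abstract rests on a Wilcoxon test over paired error rates. As a minimal sketch of how such a comparison might be run, the Python code below computes an exact two-sided Wilcoxon signed-rank test using only the standard library. The function name, the assumption that error rates are paired by level of missing attribute values, and all numbers are hypothetical and not taken from the paper; the abstract does not state which variant of the Wilcoxon test was used.

```python
from itertools import product


def wilcoxon_signed_rank(x, y):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Zero differences are dropped and tied absolute differences receive
    average ranks. Returns (W, p) with W = min(W+, W-). The null
    distribution is enumerated exhaustively, so this is only practical
    for small samples (roughly n <= 20 pairs).
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 1.0

    # Rank the absolute differences, averaging ranks within ties.
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1

    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    w = min(w_plus, w_minus)

    # Under the null hypothesis each rank is positive or negative with
    # probability 1/2; count sign assignments at least as extreme as W.
    total = sum(ranks)
    extreme = 0
    for signs in product((0, 1), repeat=n):
        s = sum(r for r, keep in zip(ranks, signs) if keep)
        if min(s, total - s) <= w:
            extreme += 1
    return w, extreme / 2 ** n


# Hypothetical error rates (in percent) for one data set, one pair per
# level of missing attribute values. These values are illustrative only.
direct_errors = [20, 22, 25, 27, 30, 33, 35, 38]       # MLEM2 on raw data
discretized_errors = [18, 19, 21, 22, 24, 26, 27, 29]  # after Multiple Scanning

w, p = wilcoxon_signed_rank(direct_errors, discretized_errors)
significant = p < 0.05  # 5% significance level, as in the abstract
```

In this sketch a data set would count toward one setup only when `significant` is true; otherwise it would fall into the "difference not significant" group.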
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Clark, P.G., Gao, C., Grzymala-Busse, J.W. (2018). MLEM2 Rule Induction Algorithm with Multiple Scanning Discretization. In: Czarnowski, I., Howlett, R., Jain, L. (eds) Intelligent Decision Technologies 2017. IDT 2017. Smart Innovation, Systems and Technologies, vol 72. Springer, Cham. https://doi.org/10.1007/978-3-319-59421-7_20
DOI: https://doi.org/10.1007/978-3-319-59421-7_20
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-59420-0
Online ISBN: 978-3-319-59421-7
eBook Packages: Engineering (R0)