
ACME: An Associative Classifier Based on Maximum Entropy Principle

  • Conference paper
Algorithmic Learning Theory (ALT 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3734)


Abstract

Recent studies in classification have proposed ways of exploiting the association rule mining paradigm. These studies have performed extensive experiments to show their techniques to be both efficient and accurate. However, existing studies in this paradigm either do not provide any theoretical justification behind their approaches or assume independence between some parameters. In this work, we propose a new classifier based on association rule mining. Our classifier rests on the maximum entropy principle for its statistical basis and does not assume any independence not inferred from the given dataset. We use the classical generalized iterative scaling (GIS) algorithm to create our classification model. We show that GIS fails in some cases when itemsets are used as features and provide modifications to rectify this problem. We show that this modified GIS runs much faster than the original GIS. We also describe techniques to make GIS tractable for large feature spaces: we provide a new technique to divide a feature space into independent clusters, each of which can be handled separately. Our experimental results show that our classifier is generally more accurate than existing classification methods.
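The abstract refers to the classical generalized iterative scaling (GIS) algorithm of Darroch and Ratcliff (1972) without describing it. As background, here is a minimal sketch of textbook GIS for a conditional maximum-entropy classifier over binary itemset-indicator features. It illustrates the generic algorithm, not the authors' modified GIS: the function names (gis_train, gis_predict), the slack-feature bookkeeping, and the small-count clipping are all assumptions made for this sketch.

```python
import numpy as np

def gis_train(X, y, n_classes, n_iters=100):
    """Fit per-(feature, class) weights with classical GIS (sketch).

    X : (n, d) 0/1 matrix of itemset-indicator features
    y : (n,) integer class labels in {0, ..., n_classes - 1}
    """
    n, d = X.shape
    # GIS assumes every example activates the same number of features, C.
    # The standard workaround is a slack feature that pads each row up to C.
    C = int(X.sum(axis=1).max()) + 1           # +1 keeps the slack strictly positive
    slack = C - X.sum(axis=1)
    Xa = np.hstack([X, slack[:, None]])        # augmented (n, d + 1) matrix

    # Empirical expectations: E_emp[f_{j,c}] = (1/n) * sum_i Xa[i, j] * [y_i == c].
    emp = np.zeros((d + 1, n_classes))
    for c in range(n_classes):
        emp[:, c] = Xa[y == c].sum(axis=0) / n
    # Crude guard against zero counts for unseen (feature, class) pairs; this is
    # NOT the paper's modification, which is not reproduced here.
    emp = np.maximum(emp, 1e-8)

    w = np.zeros((d + 1, n_classes))
    for _ in range(n_iters):
        # Model expectations under the current conditional distribution P(c | x).
        scores = Xa @ w
        p = np.exp(scores - scores.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)
        model = np.maximum((Xa.T @ p) / n, 1e-8)
        # Multiplicative GIS update, damped by 1/C:
        # w[j, c] += (1/C) * log(E_emp[f_{j,c}] / E_model[f_{j,c}])
        w += np.log(emp / model) / C
    return w, C

def gis_predict(w, C, X):
    """Predict the class maximizing the log-linear score."""
    slack = np.maximum(C - X.sum(axis=1), 0)   # clip in case a test row exceeds C
    Xa = np.hstack([X, slack[:, None]])
    return (Xa @ w).argmax(axis=1)
```

In ACME the features are frequent itemsets mined from the training data, and the paper further modifies GIS and partitions the feature space into independent clusters so that each cluster can be scaled separately; the sketch above includes neither refinement and is only meant to make the baseline algorithm concrete.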





Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Thonangi, R., Pudi, V. (2005). ACME: An Associative Classifier Based on Maximum Entropy Principle. In: Jain, S., Simon, H.U., Tomita, E. (eds) Algorithmic Learning Theory. ALT 2005. Lecture Notes in Computer Science, vol. 3734. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11564089_11


  • DOI: https://doi.org/10.1007/11564089_11

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29242-5

  • Online ISBN: 978-3-540-31696-1

