Treatment of missing values for association rules

Ragel, Arnaud; Crémilleux, Bruno

doi:10.1007/3-540-64383-4_22


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 1394)

Included in the following conference series:

Pacific-Asia Conference on Knowledge Discovery and Data Mining


Abstract

Agrawal et al. [2] have proposed a fast algorithm to explore very large transaction databases with association rules [1]. In many real-world applications, data are managed in relational databases, where missing values are often inevitable. We show that, in this case, association rule algorithms give poor results because they were developed for transaction databases, where the problem of missing values practically does not exist. In this paper, we propose a new approach to mining association rules in relational databases containing missing values. The main idea is to cut a database into several valid databases (vdb) for a rule; a vdb must not have any missing values. We redefine the support and confidence of rules for vdb. These definitions are fully compatible with
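As an illustration of the idea described in the abstract, the Python sketch below computes the support and confidence of a rule only over the rows that have no missing value on the attributes the rule uses. The preview does not include the paper's formal definitions, so this is one plausible reading and not the authors' exact method. The function names, the use of None to mark a missing value, and the smoker/cough data are all hypothetical.

```python
from typing import Dict, List, Optional, Tuple

Row = Dict[str, Optional[str]]  # None marks a missing value (assumption of this sketch)
Items = Dict[str, str]          # attribute -> required value


def matches(row: Row, items: Items) -> bool:
    """True if the row has exactly the required value for every attribute in items."""
    return all(row.get(attr) == value for attr, value in items.items())


def naive_measures(rows: List[Row], antecedent: Items, consequent: Items) -> Tuple[float, float]:
    """Support and confidence over all rows; a missing value simply fails to match."""
    both = sum(1 for r in rows if matches(r, antecedent) and matches(r, consequent))
    ante = sum(1 for r in rows if matches(r, antecedent))
    support = both / len(rows) if rows else 0.0
    confidence = both / ante if ante else 0.0
    return support, confidence


def vdb_measures(rows: List[Row], antecedent: Items, consequent: Items) -> Tuple[float, float]:
    """Support and confidence restricted to a 'valid database' for the rule.

    Assumption: the valid database keeps only the rows with no missing value
    on any attribute that appears in the rule.
    """
    rule_attrs = set(antecedent) | set(consequent)
    vdb = [r for r in rows if all(r.get(a) is not None for a in rule_attrs)]
    return naive_measures(vdb, antecedent, consequent)


# Hypothetical relational table with two missing values.
rows: List[Row] = [
    {"smoker": "yes", "cough": "yes"},
    {"smoker": "yes", "cough": None},
    {"smoker": "yes", "cough": "yes"},
    {"smoker": "no", "cough": "no"},
    {"smoker": None, "cough": "yes"},
]

rule = ({"smoker": "yes"}, {"cough": "yes"})
naive_support, naive_confidence = naive_measures(rows, *rule)
vdb_support, vdb_confidence = vdb_measures(rows, *rule)
print(f"naive: support={naive_support:.2f} confidence={naive_confidence:.2f}")
print(f"vdb:   support={vdb_support:.2f} confidence={vdb_confidence:.2f}")
```

In this toy table, the naive computation counts the row with an unknown cough value against the rule, which lowers its confidence. The restricted computation ignores that row because it carries no evidence either way.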


References

1. R. Agrawal, T. Imielinski, and A. Swami. Mining association rules between sets of items in large databases. In Proc. of the ACM SIGMOD Conference on Management of Data, Washington, D.C., p. 207–216, May 1993.
2. R. Agrawal, H. Mannila, R. Srikant, H. Toivonen, and A. I. Verkamo. Fast Discovery of Association Rules. In Advances in Knowledge Discovery and Data Mining, Chapter 12, AAAI/MIT Press, 1996.
3. R. Agrawal and R. Srikant. Fast algorithms for mining association rules in large databases. In Proc. of the Twentieth Int'l Conference on Very Large Databases (VLDB'94), p. 487–499, September 1994.
4. K. Ali, S. Manganaris, and R. Srikant. Partial Classification using Association Rules. In Proc. of the 3rd Int'l Conference on Knowledge Discovery in Databases and Data Mining, Newport Beach, California, August 1997.
5. L. Breiman, J.H. Friedman, R.A. Olshen, and C.J. Stone. Classification and Regression Trees. Wadsworth Int'l Group, Belmont, CA, The Wadsworth Statistics/Probability Series, 1984.
6. G. Celeux. Le traitement des données manquantes dans le logiciel SICLA. Technical report number 102, INRIA, France, December 1988.
7. B. Crémilleux and C. Robert. A pruning method for decision trees in uncertain domains: applications in medicine. Int'l Workshop on Intelligent Data Analysis in Medicine and Pharmacology, ECAI 1996, p. 15–20, Budapest, 1996.
8. B. Crémilleux and C. Robert. A theoretical framework for decision trees in uncertain domains: application to medical data sets. 6th Conference on Artificial Intelligence in Medicine Europe (AIME 97), p. 145–156, Grenoble, 1997.
9. W.Z. Liu, A.P. White, S.G. Thompson, and M.A. Bramer. Techniques for Dealing with Missing Values in Classification. In Second Int'l Symposium on Intelligent Data Analysis, Birkbeck College, University of London, 4th–5th August 1997.
10. H. Mannila, H. Toivonen, and A. Inkeri Verkamo. Efficient algorithms for discovering association rules. In Knowledge Discovery in Databases, Papers from the 1994 AAAI Workshop (KDD'94), p. 181–192, Seattle, Washington, July 1994.
11. J.R. Quinlan. Induction of decision trees. Machine Learning, 1, p. 81–106, 1986.
12. J.R. Quinlan. Unknown Attribute Values in Induction. In A.M. Segre (ed.), Proc. of the Sixth Int'l Workshop on Machine Learning, Morgan Kaufmann, Los Altos, CA, p. 164–168, 1989.
13. J.R. Quinlan. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1993.
14. A. Ragel. Traitement des valeurs manquantes dans les arbres de décision. Technical report, Les cahiers du GREYC, University of Caen, France, 1997.
15. A. Savasere, E. Omiecinski, and S. Navathe. An efficient algorithm for mining association rules in large databases. In Proc. of the 21st Int'l Conference on Very Large Databases (VLDB'95), p. 432–444, Zurich, Switzerland, 1995.
16. H. Toivonen. Sampling large databases for association rules. In Proc. of the 22nd Int'l Conference on Very Large Databases (VLDB'96), p. 134–145, Bombay, India, 1996.


Author information

Authors and Affiliations

GREYC, CNRS UPRESA 6072, Université de Caen, F14032, Caen Cedex, France
Arnaud Ragel & Bruno Crémilleux


Editor information

Editors and Affiliations

School of Computer Science and Software Engineering, Monash University, 900 Dandenong Road, Caulfield East, Victoria, 3145, Australia
Xindong Wu
Department of Computer Science, The University of Melbourne, Parkville, Victoria, 3052, Australia
Ramamohanarao Kotagiri
School of Computer Science and Engineering, Monash University, Clayton, Victoria, 3168, Australia
Kevin B. Korb


About this paper

Cite this paper

Ragel, A., Crémilleux, B. (1998). Treatment of missing values for association rules. In: Wu, X., Kotagiri, R., Korb, K.B. (eds) Research and Development in Knowledge Discovery and Data Mining. PAKDD 1998. Lecture Notes in Computer Science, vol 1394. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-64383-4_22


DOI: https://doi.org/10.1007/3-540-64383-4_22
Published: 25 August 2005
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-64383-8
Online ISBN: 978-3-540-69768-8
eBook Packages: Springer Book Archive
