High Frequent Value Reduct in Very Large Databases

Lin, Tsau Young; Han, Jianchao

doi:10.1007/978-3-540-72530-5_41

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4482)

Included in the following conference series:

International Workshop on Rough Sets, Fuzzy Sets, Data Mining, and Granular-Soft Computing

Abstract

One of the main contributions of rough set theory to data mining is data reduction. There are three kinds of reduction: attribute (column) reduction, row reduction, and value reduction. Row reduction merges duplicate rows. Attribute reduction finds the important attributes. Value reduction shortens the decision rules to a logically equivalent form of minimal length. Most recent attention has focused on finding attribute reducts. Traditionally, value reducts have been searched for through the attribute reducts. This paper observes that this method may miss the best value reducts. It also revisits an old, rudimentary idea [11], namely a rough set theory on high frequency data: the notion of a high frequency value reduct is extracted in a bottom-up fashion, without finding attribute reducts. Our method can discover concise and important decision rules in large databases; it is described and illustrated by an example.
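
The bottom-up idea in the abstract can be illustrated with a small sketch. This is not the authors' algorithm, which is not reproduced on this page; it is one plausible reading under stated assumptions: a rule is kept when its condition values occur in at least `min_support` rows, all of those rows share a single decision value, and no shorter kept rule already covers it. The function name `high_frequency_rules`, the `min_support` parameter, and the toy weather table are all hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def high_frequency_rules(rows, condition_attrs, decision_attr, min_support):
    """Illustrative bottom-up search (an assumption, not the paper's method).

    Returns a list of (conditions, decision, support) tuples, shortest first.
    """
    rules = []
    # Grow the rule length one attribute at a time: shortest rules first.
    for size in range(1, len(condition_attrs) + 1):
        for attrs in combinations(condition_attrs, size):
            # Group the decision values by the rows' values on `attrs`.
            # Counting rows per group plays the role of merging duplicates.
            groups = defaultdict(list)
            for row in rows:
                key = tuple(row[a] for a in attrs)
                groups[key].append(row[decision_attr])
            for values, decisions in groups.items():
                if len(decisions) < min_support:
                    continue  # not a high-frequency value combination
                if len(set(decisions)) != 1:
                    continue  # inconsistent: more than one decision value
                candidate = dict(zip(attrs, values))
                # Skip rules that merely extend a shorter rule already kept.
                if any(cond.items() <= candidate.items() for cond, _, _ in rules):
                    continue
                rules.append((candidate, decisions[0], len(decisions)))
    return rules


# Hypothetical toy decision table; "play" is the decision attribute.
table = [
    {"weather": "sunny", "wind": "weak", "play": "yes"},
    {"weather": "sunny", "wind": "weak", "play": "yes"},
    {"weather": "sunny", "wind": "strong", "play": "yes"},
    {"weather": "rainy", "wind": "weak", "play": "yes"},
    {"weather": "rainy", "wind": "strong", "play": "no"},
    {"weather": "rainy", "wind": "strong", "play": "no"},
]

rules = high_frequency_rules(table, ["weather", "wind"], "play", min_support=2)
for conditions, decision, support in rules:
    print(conditions, "->", decision, "(support", str(support) + ")")
```

On this toy table the search keeps two one-attribute rules (`weather = sunny` and `wind = weak`, both deciding `yes`) and one two-attribute rule (`weather = rainy, wind = strong`, deciding `no`), and no attribute reduct is computed at any point. The sketch enumerates every attribute combination, so it is only meant for small tables.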

References

[1] Agrawal, R., Imielinski, T., Swami, A.: Mining Association Rules between Sets of Items in Large Databases. In: Proc. of the ACM SIGMOD Conference, pp. 207–216 (1993)
[2] Agrawal, R., Srikant, R.: Fast algorithms for mining association rules. In: Proc. of the 20th International Conference on Very Large Data Bases, pp. 487–499. Morgan Kaufmann, San Francisco (1994)
[3] Chen, R., Lin, T.Y.: Supporting Rough Set Theory in Very Large Database Using ORACLE RDBMS. In: Soft Computing in Intelligent Systems and Information Processing, Proceedings of 1996 Asian Fuzzy Systems Symposium, Kenting, Taiwan, December 11-14, 1996, pp. 332–337 (1996)
[4] Chen, R., Lin, T.Y.: Finding Reducts in Very Large Databases. In: Proceedings of Joint Conference of Information Science, Research Triangle Park, North Carolina, March 1-5, 1997, pp. 350–352 (1997)
[5] Fernandez-Baizan, M., Ruiz, E., Wasilewska, A.: A Model of RSDM Implementation. In: Polkowski, L., Skowron, A. (eds.) RSCTC 1998. LNCS (LNAI), vol. 1424, pp. 186–193. Springer, Heidelberg (1998)
[6] Garcia-Molina, H., Ullman, J., Widom, J.: Database Systems: The Complete Book. Prentice-Hall, Englewood Cliffs (2001)
[7] Han, J., Hu, X., Lin, T.: A new computation model for rough set theory based on database systems. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds.) DaWaK 2003. LNCS, vol. 2737, pp. 381–390. Springer, Heidelberg (2003)
[8] Houtsma, M., Swami, A.: Set-Oriented Mining for Association Rules in Relational Databases. In: Proc. of Int. Conf. on Data Engineering, pp. 25–33 (1995)
[9] Hu, X., Lin, T., Han, J.: A new rough sets model based on database systems. J. of Fundamenta Informaticae 59(2-3), 135–152 (2004)
[10] Lin, T.Y.: Neighborhood Systems and Approximation in Database and Knowledge Base Systems. In: Proceedings of the Fourth International Symposium on Methodologies of Intelligent Systems, Poster Session, October 12-15, 1989, pp. 75–86 (1989)
[11] Lin, T.Y.: Rough Set Theory in Very Large Database Mining. In: Symposium on Modeling, Analysis and Simulation, CESA’96 IMACS Multi Conference (Computational Engineering in Systems Applications), vol. 2, Lille, France, July 9-12, 1996, pp. 936–994 (1996)
[12] Pawlak, Z.: Rough Sets. International Journal of Information and Computer Science 11(5), 341–356 (1982)
[13] Pawlak, Z.: Rough sets. Theoretical Aspects of Reasoning about Data. Kluwer Academic Publishers, Dordrecht (1991)
[14] Skowron, A., Rauszer, C.: The discernibility matrices and functions in information systems. In: Slowinski, R. (ed.) Decision Support by Experience - Application of the Rough Sets Theory, pp. 331–362. Kluwer Academic Publishers, Dordrecht (1992)

Author information

Authors and Affiliations

Department of Computer Science, San Jose State University, San Jose, CA 95192, USA
Tsau Young Lin
Department of Computer Science, California State University Dominguez Hills, Carson, CA 90747, USA
Jianchao Han

Editor information

Editors and Affiliations

Dept. of Computer Science and Engineering, York University, M3J 1P3, Toronto, Ontario, Canada
Aijun An
Institute of Computing Sciences, Poznań University of Technology, ul. Piotrowo 2, 60–965, Poznań, Poland
Jerzy Stefanowski
Department of Applied Computer Science, University of Winnipeg, R3B 2E9, Winnipeg, Manitoba, Canada
Sheela Ramanna
Department of Computer Science, University of Regina, S4S 0A2, Regina, Saskatchewan, Canada
Cory J. Butz
Department of Electrical and Computer Engineering, University of Alberta, T6G 2V4, Edmonton, Alberta, Canada
Witold Pedrycz
Institute of Computer Science and Technology, Chongqing University of Posts and Telecommunications, 40065, Chongqing, P.R. China
Guoyin Wang

Cite this paper

Lin, T.Y., Han, J. (2007). High Frequent Value Reduct in Very Large Databases. In: An, A., Stefanowski, J., Ramanna, S., Butz, C.J., Pedrycz, W., Wang, G. (eds) Rough Sets, Fuzzy Sets, Data Mining and Granular Computing. RSFDGrC 2007. Lecture Notes in Computer Science, vol 4482. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72530-5_41

DOI: https://doi.org/10.1007/978-3-540-72530-5_41
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-72529-9
Online ISBN: 978-3-540-72530-5
eBook Packages: Computer Science, Computer Science (R0)