Knowledge Discovery in Textual Databases: A Concept-Association Mining Approach

Mete, Mutlu; Yuruk, Nurcan; Xu, Xiaowei; Berleant, Daniel

doi:10.1007/978-1-4419-0176-7_11

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 132)

Abstract

The number of scientific publications is exploding as online digital libraries and the World Wide Web grow. MEDLINE, the premier bibliographic database of the National Library of Medicine (NLM), contains about 18 million records from more than 7,300 different publications dating from 1965; it is growing by about 400,000 citations each year. The explosive growth of information in textual documents creates a great need for techniques for knowledge discovery from text collections.

Author information

Authors and Affiliations

Department of Computer Science, Texas A&M University-Commerce, Commerce, TX, USA
Mutlu Mete
Department of Applied Science, University of Arkansas at Little Rock, Little Rock, AR, USA
Nurcan Yuruk & Daniel Berleant
Department of Information Science, University of Arkansas at Little Rock, Little Rock, AR, USA
Xiaowei Xu

Editor information

Editors and Affiliations

Dept. Systems Engineering, Donaghey College of Info Sci., University of Arkansas, South University Avenue 2801, Little Rock, 72204-1099, Arkansas, USA
Yupo Chan
Dept. Information Science, University of Arkansas, Little Rock, South University Ave. 2801, Little Rock, 72204-1099, USA
John Talburt
Acxiom Corporation, Conway, USA
Terry M. Talley

About this chapter

Cite this chapter

Mete, M., Yuruk, N., Xu, X., Berleant, D. (2009). Knowledge Discovery in Textual Databases: A Concept-Association Mining Approach. In: Chan, Y., Talburt, J., Talley, T. (eds) Data Engineering. International Series in Operations Research & Management Science, vol 132. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-0176-7_11

DOI: https://doi.org/10.1007/978-1-4419-0176-7_11
Published: 05 September 2009
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4419-0175-0
Online ISBN: 978-1-4419-0176-7
eBook Packages: Computer Science, Computer Science (R0)
