Knowledge Discovery in Textual Databases: A Concept-Association Mining Approach

  • Chapter
Data Engineering

Part of the book series: International Series in Operations Research & Management Science (ISOR, volume 132)

Abstract

The number of scientific publications is exploding as online digital libraries and the World Wide Web grow. MEDLINE, the premier bibliographic database of the National Library of Medicine (NLM), contains about 18 million records from more than 7,300 different publications dating from 1965, and it is growing by about 400,000 citations each year. This explosive growth of information in textual documents creates a great need for techniques for knowledge discovery from text collections.

Copyright information

© 2009 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Mete, M., Yuruk, N., Xu, X., Berleant, D. (2009). Knowledge Discovery in Textual Databases: A Concept-Association Mining Approach. In: Chan, Y., Talburt, J., Talley, T. (eds) Data Engineering. International Series in Operations Research & Management Science, vol 132. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-0176-7_11

  • DOI: https://doi.org/10.1007/978-1-4419-0176-7_11

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-0175-0

  • Online ISBN: 978-1-4419-0176-7

  • eBook Packages: Computer Science, Computer Science (R0)
