Topic Selection of Web Documents Using Specific Domain Ontology

  • Conference paper
MICAI 2006: Advances in Artificial Intelligence (MICAI 2006)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4293)

Abstract

This paper proposes a topic selection method for web documents that uses an ontology hierarchy. The idea is to exploit the structure of the ontology to determine the topic of a web document. By selecting topics efficiently on the basis of a domain ontology, the approach aims to improve the performance of document clustering. In a preprocessing step, we extract keywords from the web documents using a term frequency formula, and we build a domain ontology by branching off a partial hierarchy from WordNet with an automatic domain ontology building tool. We then select a topic for each web document based on the structure of the domain ontology. Finally, we find that our approach contributes to efficient document clustering.
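The pipeline described in the abstract (term frequency keywords, then a topic chosen from an ontology hierarchy) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy `PARENT` hierarchy stands in for a partial hierarchy branched off from WordNet, the stopword list is arbitrary, and the selection rule (the deepest concept covering at least half of the keyword weight) is a hypothetical choice, not the authors' published method.

```python
import re
from collections import Counter, defaultdict

# Hypothetical toy hierarchy (child -> parent), standing in for a
# partial hierarchy taken from WordNet. Not from the paper.
PARENT = {
    "dog": "canine",
    "wolf": "canine",
    "canine": "mammal",
    "cat": "feline",
    "feline": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
}

# Arbitrary illustrative stopword list.
STOPWORDS = {"the", "a", "and", "of", "is", "in", "to", "are"}


def extract_keywords(text, top_k=5):
    """Return the top_k terms with tf = term count / total token count."""
    tokens = [t for t in re.findall(r"[a-z]+", text.lower())
              if t not in STOPWORDS]
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: n / total for term, n in counts.most_common(top_k)}


def ancestors(concept):
    """Return the concept followed by its ancestors up to the root."""
    chain = [concept]
    while chain[-1] in PARENT:
        chain.append(PARENT[chain[-1]])
    return chain


def select_topic(keywords, coverage=0.5):
    """Pick the deepest concept whose subtree holds >= coverage of the weight.

    Assumed rule for illustration only: each keyword's tf is added to the
    keyword itself and to all of its ancestors; keywords missing from the
    hierarchy are ignored.
    """
    known = set(PARENT) | set(PARENT.values())
    weight = defaultdict(float)
    total = 0.0
    for term, tf in keywords.items():
        if term not in known:
            continue
        total += tf
        for concept in ancestors(term):
            weight[concept] += tf
    if total == 0:
        return None
    candidates = [c for c, w in weight.items() if w >= coverage * total]
    if not candidates:
        return None
    # Deeper concepts have longer ancestor chains; break ties by weight, name.
    return max(candidates, key=lambda c: (len(ancestors(c)), weight[c], c))


keywords = extract_keywords("dog dog wolf cat sparrow")
print(select_topic(keywords))  # prints "canine"
```

In this sketch, "dog" and "wolf" together carry 60% of the keyword weight, so their shared parent "canine" is preferred over the broader concepts "mammal" and "animal". Lowering `coverage` favors more specific topics; raising it pushes the choice toward the root.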

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kong, H., Hwang, M., Hwang, G., Shim, J., Kim, P. (2006). Topic Selection of Web Documents Using Specific Domain Ontology. In: Gelbukh, A., Reyes-Garcia, C.A. (eds) MICAI 2006: Advances in Artificial Intelligence. MICAI 2006. Lecture Notes in Computer Science, vol 4293. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11925231_100

  • DOI: https://doi.org/10.1007/11925231_100

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-49026-5

  • Online ISBN: 978-3-540-49058-6

  • eBook Packages: Computer Science, Computer Science (R0)
