Retrieving News Stories from a News Integration Archive

  • Conference paper
Digital Libraries: People, Knowledge, and Technology (ICADL 2002)

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 2555))

Abstract

The Bveritas online news integration archive has four distinctive features: 1) automatic clustering of related news documents into themes; 2) organization of these news clusters in a theme map; 3) extraction of meaningful labels for each news cluster; and 4) generation of links to related news articles. This paper describes several ways of retrieving news stories from the Bveritas archive. The retrieval methods range from the usual query box and links to related stories, to an interactive world map that allows news retrieval by country, to an interactive theme map. Querying and browsing are mediated by a Scatter/Gather interface: the user selects interesting clusters, the documents in those clusters are gathered and re-clustered, and the user visually inspects the new clusters and again selects the interesting ones. This alternating selection/clustering process continues until the user decides to view the individual news story titles.
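
The alternating selection/clustering loop described in the abstract can be illustrated with a minimal sketch. This is not the Bveritas implementation: the abstract does not specify the clustering algorithm, so a simple k-means-style procedure over word counts is used here as a stand-in, and the function names (`scatter`, `gather`, `label`), the seeding rule, and the sample headlines are all hypothetical.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for one document (assumed representation)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def scatter(docs, k, iterations=5):
    """Scatter step: split docs into at most k clusters.

    Stand-in clustering: the first k documents seed the centroids, then
    documents are reassigned to the most similar centroid a few times.
    """
    vectors = [vectorize(d) for d in docs]
    k = min(k, len(docs))
    centroids = [Counter(v) for v in vectors[:k]]
    assignment = [0] * len(docs)
    for _ in range(iterations):
        assignment = [
            max(range(k), key=lambda c: cosine(v, centroids[c]))
            for v in vectors
        ]
        for c in range(k):
            members = [vectors[i] for i, a in enumerate(assignment) if a == c]
            if members:
                total = Counter()
                for m in members:
                    total.update(m)
                centroids[c] = total
    clusters = [
        [docs[i] for i, a in enumerate(assignment) if a == c]
        for c in range(k)
    ]
    return [c for c in clusters if c]  # drop empty clusters


def label(cluster, n=3):
    """Crude cluster label: the n most frequent terms in the cluster."""
    total = Counter()
    for d in cluster:
        total.update(vectorize(d))
    return [term for term, _ in total.most_common(n)]


def gather(clusters, chosen):
    """Gather step: pool the documents of the clusters the user selected."""
    return [d for i in chosen for d in clusters[i]]


if __name__ == "__main__":
    # Hypothetical headlines; a real session would take the selection
    # from the user instead of hard-coding cluster 0.
    headlines = [
        "election vote parliament",
        "football match goal",
        "election president vote",
        "football league goal",
    ]
    clusters = scatter(headlines, k=2)
    for i, c in enumerate(clusters):
        print(i, label(c), len(c))
    subset = gather(clusters, chosen=[0])
    print(scatter(subset, k=2))
```

Each round narrows the working set: `scatter` shows labeled clusters, the user's choice feeds `gather`, and the pooled documents are scattered again until the set is small enough to list the story titles directly.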

© 2002 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Azcarraga, A.P., Seng Chua, T., Tan, J. (2002). Retrieving News Stories from a News Integration Archive. In: Lim, E.P., et al. Digital Libraries: People, Knowledge, and Technology. ICADL 2002. Lecture Notes in Computer Science, vol 2555. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36227-4_22

  • DOI: https://doi.org/10.1007/3-540-36227-4_22

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-00261-1

  • Online ISBN: 978-3-540-36227-2

  • eBook Packages: Springer Book Archive
