Retrieving News Stories from a News Integration Archive

Azcarraga, Arnulfo P.; Chua, Tat Seng; Tan, Jonathan

doi:10.1007/3-540-36227-4_22

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 2555)

Included in the following conference series:

International Conference on Asian Digital Libraries

Abstract

The Bveritas online news integration archive has four distinctive features: 1) automatic clustering of related news documents into themes; 2) organization of these news clusters in a theme map; 3) extraction of meaningful labels for each news cluster; and 4) generation of links to related news articles. This paper describes several ways of retrieving news stories from the archive. The retrieval methods range from the usual query box and links to related stories, to an interactive world map that allows news retrieval by country, to an interactive theme map. Query and browsing are mediated by a Scatter/Gather interface: the user selects interesting clusters, the documents in those clusters are gathered and re-clustered, and the user visually inspects the result and selects new interesting clusters. This alternating selection/clustering process continues until the user decides to view the individual news story titles.
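
The alternating selection/clustering loop described in the abstract can be illustrated with a small sketch. This is not the Bveritas implementation: the function names (`vectorize`, `cosine`, `scatter`, `gather`), the bag-of-words representation, and the k-means-style clustering with farthest-first seeding are all assumptions chosen for brevity, and the sample titles are invented.

```python
import math
from collections import Counter


def vectorize(title):
    """Bag-of-words term counts for one title (lowercased, split on whitespace)."""
    return Counter(title.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def scatter(titles, k, rounds=5):
    """Scatter step: split titles into at most k clusters (hypothetical k-means-style stand-in)."""
    if not titles or k < 1:
        return []
    vecs = [vectorize(t) for t in titles]
    # Deterministic farthest-first seeding: start from the first title, then
    # repeatedly add the title least similar to the seeds chosen so far.
    seeds = [0]
    while len(seeds) < min(k, len(titles)):
        rest = [i for i in range(len(titles)) if i not in seeds]
        seeds.append(min(rest, key=lambda i: max(cosine(vecs[i], vecs[s]) for s in seeds)))
    centroids = [vecs[s] for s in seeds]
    groups = []
    for _ in range(max(rounds, 1)):
        groups = [[] for _ in centroids]
        for i, v in enumerate(vecs):
            nearest = max(range(len(centroids)), key=lambda c: cosine(v, centroids[c]))
            groups[nearest].append(i)
        # New centroid = summed term counts of the members; cosine ignores scale,
        # so summing behaves like averaging here. Empty groups keep their centroid.
        centroids = [sum((vecs[i] for i in g), Counter()) if g else centroids[j]
                     for j, g in enumerate(groups)]
    return [[titles[i] for i in g] for g in groups if g]


def gather(clusters, selected):
    """Gather step: pool the titles of the clusters the user selected."""
    return [title for idx in selected for title in clusters[idx]]


# Invented sample data, for illustration only.
TITLES = [
    "election vote results", "election vote campaign",
    "football match goal", "football match score",
    "market stocks fall", "market stocks rise",
]

clusters = scatter(TITLES, 3)       # scatter the whole collection
picked = gather(clusters, [0, 2])   # the user selects the first and third clusters
refined = scatter(picked, 2)        # re-cluster only the gathered titles
```

In an interactive setting the last three lines would repeat, with the user's cluster choices replacing the hard-coded indices, until the user chooses to list the individual titles in `refined`.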

Author information

Authors and Affiliations

Program for Research into Intelligent Systems (PRIS), School of Computing, National University of Singapore, Lower Kent Ridge Road, Singapore 117543
Arnulfo P. Azcarraga, Tat Seng Chua & Jonathan Tan

Editor information

Editors and Affiliations

Nanyang Technological University, Singapore
Ee-Peng Lim, Schubert Foo & Chris Khoo
University of Arizona, USA
Hsinchun Chen
Virginia Tech, USA
Edward Fox
University of Mysore, India
Shalini Urs
IEI-CNR, Italy
Costantino Thanos

About this paper

Cite this paper

Azcarraga, A.P., Chua, T.S., Tan, J. (2002). Retrieving News Stories from a News Integration Archive. In: Lim, E.P., et al. (eds) Digital Libraries: People, Knowledge, and Technology. ICADL 2002. Lecture Notes in Computer Science, vol 2555. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-36227-4_22

DOI: https://doi.org/10.1007/3-540-36227-4_22
Published: 16 December 2002
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-00261-1
Online ISBN: 978-3-540-36227-2
eBook Packages: Springer Book Archive