Term Committee Based Event Identification and Dependency Discovery

  • Conference paper
Trustworthy Computing and Services (ISCTCS 2013)

Part of the book series: Communications in Computer and Information Science (CCIS, volume 426)

Abstract

With the overwhelming volume of news stories created and stored electronically every day, there is an increasing need for techniques that analyze and present news stories to users in a more meaningful manner. Most previous research focuses on organizing a news set into flat collections (topics) of stories. However, a topic in news is more than a mere collection of stories: it is characterized by a definite structure of inter-related events. Unfortunately, it is very difficult to identify events and dependencies within a topic, because stories about the same topic are usually very similar to each other irrespective of the events they belong to. The reason is that stories within a topic usually share terms that are related to the topic as a whole rather than to a specific event. To deal with this problem, we propose two methods based on event key terms to identify events and discover event dependencies accurately. For event identification, we first capture tight term clusters as term committees of potential events, and then use them to find the core story sets of potential events. Finally, we assign every story to an event. For event dependency discovery, we emphasize the terms closely related to a certain event, so that the similarity contributed by topic-popular terms is decreased. The experimental results on two Linguistic Data Consortium (LDC) datasets show that both proposed methods, for event identification and for event dependency discovery, improve significantly over previous methods.

Categories and Subject Descriptors: H.3.3 [Information Systems]: Information Search and Retrieval; H.4.2 [Information Systems Applications]: Types of Systems – decision support.

General Terms: Algorithms, Experimentation
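
The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the committees are written by hand here (the paper derives them from tight term clusters), and the function names, the overlap score, and the concentration weight `c * (c / in_topic[t])` are all assumptions chosen only to show the effect of down-weighting topic-popular terms.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * b[t] for t, w in a.items() if t in b)
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def assign_stories(stories, committees):
    """Assign each story (a list of terms) to the committee it overlaps most.

    `committees` maps a hypothetical event name to a set of key terms.
    The score is the fraction of story tokens that are committee terms
    (an assumed score, not the paper's).
    """
    events = {name: [] for name in committees}
    for story in stories:
        counts = Counter(story)
        size = max(len(story), 1)  # guard against empty stories
        best = max(
            committees,
            key=lambda name: sum(counts[t] for t in committees[name]) / size,
        )
        events[best].append(story)
    return events


def plain_vector(event_stories):
    """Raw term counts over all stories of one event."""
    return dict(Counter(t for s in event_stories for t in s))


def event_vector(event_stories, topic_stories):
    """Term counts scaled by how concentrated each term is in this event.

    A term spread across the whole topic gets a small ratio, so it
    contributes less to the similarity between events (assumed weighting).
    """
    in_event = Counter(t for s in event_stories for t in s)
    in_topic = Counter(t for s in topic_stories for t in s)
    return {t: c * (c / in_topic[t]) for t, c in in_event.items()}


# Toy topic: every story shares the topic-popular terms "earthquake" and "city".
stories = [
    ["earthquake", "city", "magnitude", "tremor"],
    ["earthquake", "city", "magnitude", "aftershock"],
    ["earthquake", "city", "rescue", "survivors"],
    ["earthquake", "city", "rescue", "aid"],
]
committees = {
    "quake": {"magnitude", "tremor", "aftershock"},
    "rescue": {"rescue", "survivors", "aid"},
}

events = assign_stories(stories, committees)
plain = cosine(plain_vector(events["quake"]), plain_vector(events["rescue"]))
weighted = cosine(
    event_vector(events["quake"], stories),
    event_vector(events["rescue"], stories),
)
print(f"plain={plain:.3f} weighted={weighted:.3f}")
```

In this toy data the two events look fairly similar under raw counts only because of the shared topic terms; with the concentration weight, that shared contribution shrinks and the similarity drops.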

Acknowledgement

This work was supported by the National High Technology Research and Development (863) Program (2011AA01A205). Any opinions, findings and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect those of the sponsor. I also thank Dr. Nallapati for his help and for the valuable annotation results of the dataset.

Author information

Corresponding author

Correspondence to Kuo Zhang.

Copyright information

© 2014 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Zhang, K., Li, J., Wu, G. (2014). Term Committee Based Event Identification and Dependency Discovery. In: Yuan, Y., Wu, X., Lu, Y. (eds) Trustworthy Computing and Services. ISCTCS 2013. Communications in Computer and Information Science, vol 426. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-43908-1_40

  • DOI: https://doi.org/10.1007/978-3-662-43908-1_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-43907-4

  • Online ISBN: 978-3-662-43908-1

  • eBook Packages: Computer Science, Computer Science (R0)
