Query-Focused Summarization by Combining Topic Model and Affinity Propagation

  • Conference paper
Advances in Data and Web Management (APWeb 2009, WAIM 2009)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 5446))

Abstract

The goal of query-focused summarization is to extract a summary for a given query from a document collection. Although much work has been done on this problem, challenging issues remain: (1) the length of the summary must be predefined, for example by the number of word tokens or the number of sentences; and (2) a query usually asks for information from several perspectives (topics), yet existing methods cannot capture the topical aspects with respect to the query. In this paper, we propose a novel approach that combines a statistical topic model with affinity propagation. Specifically, the topic model, called qLDA, simultaneously models the documents and the query, and affinity propagation automatically discovers key sentences from the document collection without predefining the length of the summary. Experimental results on the DUC05 and DUC06 data sets show that our approach is effective and that its summarization performance is better than that of the baseline methods.
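To illustrate how affinity propagation can choose key sentences without a preset summary length, the following is a minimal sketch and not the implementation evaluated in the paper. It assumes a plain bag-of-words cosine similarity between sentences in place of the qLDA topic representation, and it sets each sentence's preference to the median pairwise similarity. The function names `cosine`, `affinity_propagation`, and `summarize`, the damping factor, and the iteration count are all illustrative assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def affinity_propagation(S, damping=0.5, iters=200):
    """Return exemplar indices for similarity matrix S (S[k][k] = preference)."""
    n = len(S)
    if n < 2:
        return list(range(n))
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iters):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            first = max(range(n), key=vals.__getitem__)
            max1 = vals[first]
            max2 = max(v for k, v in enumerate(vals) if k != first)
            for k in range(n):
                best_other = max2 if k == first else max1
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)
        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                new = total if i == k else min(0.0, R[k][k] + total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    # A point is an exemplar when its self-responsibility plus
    # self-availability is positive.
    return [k for k in range(n) if R[k][k] + A[k][k] > 0]


def summarize(sentences):
    """Pick exemplar sentences; the summary length is not fixed in advance."""
    n = len(sentences)
    if n < 2:
        return list(sentences)
    # Assumption: word-overlap similarity stands in for a topic-based one.
    bags = [Counter(s.lower().split()) for s in sentences]
    S = [[cosine(bags[i], bags[j]) for j in range(n)] for i in range(n)]
    off_diag = sorted(S[i][j] for i in range(n) for j in range(n) if i != j)
    preference = off_diag[len(off_diag) // 2]  # median pairwise similarity
    for k in range(n):
        S[k][k] = preference
    return [sentences[k] for k in affinity_propagation(S)]


if __name__ == "__main__":
    docs = [
        "the storm closed roads across the region",
        "roads were closed by the storm in the region",
        "the election results were announced on monday",
        "officials announced election results monday",
    ]
    for sentence in summarize(docs):
        print(sentence)
```

The number of sentences returned depends on the similarities and the preference value rather than on a length parameter; raising the preference tends to yield more exemplars, lowering it fewer. This sketch omits tie-breaking noise on the similarities, so inputs with many identical similarity values may not settle on a stable exemplar set.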

The work is supported by NSFC (60703059), Chinese National Key Foundation Research and Development Plan (2007CB310803), and Chinese Young Faculty Funding (20070003093).

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Chen, D., Tang, J., Yao, L., Li, J., Zhou, L. (2009). Query-Focused Summarization by Combining Topic Model and Affinity Propagation. In: Li, Q., Feng, L., Pei, J., Wang, S.X., Zhou, X., Zhu, QM. (eds) Advances in Data and Web Management. APWeb WAIM 2009. Lecture Notes in Computer Science, vol 5446. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-00672-2_17

  • DOI: https://doi.org/10.1007/978-3-642-00672-2_17

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-00671-5

  • Online ISBN: 978-3-642-00672-2

  • eBook Packages: Computer Science, Computer Science (R0)
