Automatic Multi-Document Summarization Based on Keyword Density and Sentence-Word Graphs

Ye, Feiyue; Xu, Xinchen

doi:10.1007/s12204-018-1957-2

Published: 07 June 2018

Volume 23, pages 584–592 (2018)

Journal of Shanghai Jiaotong University (Science)

Abstract

As a fundamental and effective tool for document understanding and organization, multi-document summarization enables better information services by creating concise and informative reports for large collections of documents. In this paper, we propose a two-layer sentence-word graph algorithm combined with keyword density to generate multi-document summaries, known as Graph & Keywordρ. Traditional graph methods for multi-document summarization consider the influence of sentences and words only across all documents rather than within individual documents. Therefore, we construct multiple word graphs and extract the right keywords from each document to modify the sentence graph and to improve the significance and richness of the summary. Meanwhile, because word importance differs across documents, we propose using keyword density so that summaries provide rich content while using a small number of words. The experimental results show that the Graph & Keywordρ method outperforms state-of-the-art systems when tested on the DUC2004 data set.
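
The abstract does not give the definitions behind Graph & Keywordρ, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' method. It assumes that keyword density is the fraction of a sentence's tokens that are keywords of its own document, that keywords are simply the most frequent non-stopword tokens per document (the paper extracts them from per-document word graphs), and that sentence centrality is a plain PageRank over a TextRank-style word-overlap similarity. All function names and the stopword list are hypothetical.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "are", "for"}

def tokenize(text):
    """Lowercase alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]

def document_keywords(sentences, top_k=3):
    """Assumed keyword extractor: the top_k most frequent tokens of one document."""
    counts = Counter(w for s in sentences for w in tokenize(s))
    return {w for w, _ in counts.most_common(top_k)}

def keyword_density(sentence, keywords):
    """Assumed definition: keyword tokens divided by all tokens in the sentence."""
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    return sum(1 for w in tokens if w in keywords) / len(tokens)

def similarity(a, b):
    """TextRank-style overlap similarity between two sentences."""
    ta, tb = set(tokenize(a)), set(tokenize(b))
    if len(ta) < 2 or len(tb) < 2:
        return 0.0  # avoids a zero denominator from log(1)
    return len(ta & tb) / (math.log(len(ta)) + math.log(len(tb)))

def pagerank(weights, damping=0.85, iterations=50):
    """Weighted PageRank by power iteration on a dense weight matrix."""
    n = len(weights)
    if n == 0:
        return []
    scores = [1.0 / n] * n
    out_sum = [sum(row) for row in weights]
    for _ in range(iterations):
        scores = [
            (1 - damping) / n
            + damping * sum(
                scores[j] * weights[j][i] / out_sum[j]
                for j in range(n)
                if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores

def rank_sentences(documents):
    """Rank all sentences by graph centrality boosted by per-document keyword density."""
    sentences, densities = [], []
    for doc in documents:
        keywords = document_keywords(doc)  # keywords are extracted per document
        for s in doc:
            sentences.append(s)
            densities.append(keyword_density(s, keywords))
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    centrality = pagerank(weights)
    # Assumed combination rule: scale centrality by (1 + density).
    combined = [c * (1 + d) for c, d in zip(centrality, densities)]
    return sorted(zip(sentences, combined), key=lambda p: p[1], reverse=True)
```

Calling `rank_sentences` on a list of documents, each given as a list of sentence strings, returns `(sentence, score)` pairs in descending score order; a summary could then be assembled by taking sentences from the top until a word budget is reached.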

Author information

Authors and Affiliations

School of Computer Engineering and Science, Shanghai University, Shanghai, 200444, China
Feiyue Ye (叶飞跃) & Xinchen Xu (徐欣辰)

Corresponding author

Correspondence to Xinchen Xu (徐欣辰).

About this article

Cite this article

Ye, F., Xu, X. Automatic Multi-Document Summarization Based on Keyword Density and Sentence-Word Graphs. J. Shanghai Jiaotong Univ. (Sci.) 23, 584–592 (2018). https://doi.org/10.1007/s12204-018-1957-2

Received: 23 October 2017
Published: 07 June 2018
Issue Date: August 2018
DOI: https://doi.org/10.1007/s12204-018-1957-2

Key words

CLC number

TP 391

Document code

A
