
Extractive Summarization Utilizing Keyphrases by Finetuning BERT-Based Model

  • Conference paper
  • First Online:
From Born-Physical to Born-Virtual: Augmenting Intelligence in Digital Libraries (ICADL 2022)

Abstract

Summarization is a natural language processing (NLP) task of producing a brief text that compresses the main content and key information of a source document. Extractive summarization and keyphrase extraction are both tasks that extract shorter texts preserving the salient information and main points of the source document. Compared with keyphrases, summaries composed of sentences are texts of larger granularity that are highly likely to be related to the keyphrases of the document. On one hand, previous work has not investigated whether keyphrases are beneficial for extracting important sentences. On the other hand, with the development of deep neural networks, pretrained language models, especially BERT-based models that can adapt to various NLP tasks through finetuning, have attracted extensive attention. For these reasons, we propose KeyBERTSUM, which leverages keyphrases in the extractive summarization task based on a BERT encoder, guiding the model to focus on the important content instead of the entire document. In addition, we introduce the confidence of guiding phrases in sentence updating. Experimental evaluations of our method on the CNN/Daily Mail, New York Times 50, and DUC2001 datasets show improvements in ROUGE scores over the baselines.
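The page gives no implementation details beyond the abstract, but the core intuition of keyphrase-guided sentence selection can be illustrated. The following is a minimal sketch, not the paper's KeyBERTSUM model: it assumes off-the-shelf bert-base-uncased embeddings and an assumed blending weight alpha, whereas the actual method finetunes the BERT encoder and incorporates guiding-phrase confidence into sentence updating.

```python
# Illustrative sketch of keyphrase-guided extractive scoring.
# NOTE: this is NOT the paper's KeyBERTSUM architecture; it only shows the
# general idea of biasing sentence selection toward keyphrase-related
# content using frozen BERT embeddings. The scoring blend and alpha
# weight below are assumptions made for illustration.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased").eval()

def embed(texts):
    """Mean-pooled BERT embeddings for a list of strings."""
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state      # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)       # (B, T, 1)
    return (hidden * mask).sum(1) / mask.sum(1)        # (B, H)

def score_sentences(sentences, keyphrases, alpha=0.5):
    """Blend document centrality with each sentence's maximum
    similarity to any guiding keyphrase (alpha is an assumed weight)."""
    s_emb = torch.nn.functional.normalize(embed(sentences), dim=-1)
    k_emb = torch.nn.functional.normalize(embed(keyphrases), dim=-1)
    centrality = (s_emb @ s_emb.T).mean(dim=1)         # document-level salience
    key_sim = (s_emb @ k_emb.T).max(dim=1).values      # keyphrase guidance
    return alpha * centrality + (1 - alpha) * key_sim

sentences = [
    "The new model improves extractive summarization with keyphrases.",
    "Weather in the region was mild this week.",
    "Keyphrase guidance focuses the encoder on salient content.",
]
keyphrases = ["extractive summarization", "keyphrase guidance"]
scores = score_sentences(sentences, keyphrases)
top = scores.topk(2).indices.tolist()
print([sentences[i] for i in sorted(top)])
```

Here the guidance term raises the scores of sentences semantically close to the keyphrases, which is the intuition the abstract describes; the paper's trained model would replace this heuristic blend with a finetuned encoder that weights guiding phrases by their confidence.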



Author information


Corresponding author

Correspondence to Mizuho Iwaihara.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Wang, X., Iwaihara, M. (2022). Extractive Summarization Utilizing Keyphrases by Finetuning BERT-Based Model. In: Tseng, Y.H., Katsurai, M., Nguyen, H.N. (eds) From Born-Physical to Born-Virtual: Augmenting Intelligence in Digital Libraries. ICADL 2022. Lecture Notes in Computer Science, vol 13636. Springer, Cham. https://doi.org/10.1007/978-3-031-21756-2_5


  • DOI: https://doi.org/10.1007/978-3-031-21756-2_5

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-21755-5

  • Online ISBN: 978-3-031-21756-2

  • eBook Packages: Computer Science, Computer Science (R0)
