
Visual Summarization of Scholarly Videos Using Word Embeddings and Keyphrase Extraction

  • Conference paper
  • First Online:
Digital Libraries for Open Knowledge (TPDL 2019)

Abstract

Effective learning with audiovisual content depends on many factors. Besides the quality of the learning resource's content, it is essential to discover the most relevant and suitable video in order to support the learning process most effectively. Video summarization techniques facilitate this goal by providing a quick overview of the content. This is especially useful for longer recordings such as conference presentations or lectures. In this paper, we present a domain-specific approach that generates a visual summary of video content using solely textual information. For this purpose, we exploit video annotations that are automatically generated by speech recognition and video OCR (optical character recognition). The textual information is represented by semantic word embeddings and extracted keyphrases. We demonstrate the feasibility of the proposed approach through its incorporation into the TIB AV-Portal (http://av.tib.eu/), a platform for scientific videos. The accuracy and usefulness of the generated video content visualizations are evaluated in a user study.
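The abstract does not name concrete tools, so the following is only a minimal sketch of the described pipeline, assuming the open-source pke toolkit for unsupervised keyphrase extraction and pretrained GloVe vectors loaded via gensim; the toy transcript string stands in for the speech-recognition and OCR annotations mentioned above.

    # Illustrative only: pke, gensim and the toy transcript are assumptions,
    # not the toolchain reported in the paper.
    import numpy as np
    import pke                       # unsupervised keyphrase extraction toolkit
    import gensim.downloader as api  # loader for pretrained word vectors

    # Toy stand-in for the transcript produced by speech recognition and video OCR.
    transcript = ("In this lecture we introduce convolutional neural networks, "
                  "feature maps and pooling layers for image classification.")

    # 1) Extract keyphrases with an unsupervised graph-based method (TopicRank).
    extractor = pke.unsupervised.TopicRank()
    extractor.load_document(input=transcript, language="en")
    extractor.candidate_selection()
    extractor.candidate_weighting()
    keyphrases = [phrase for phrase, _ in extractor.get_n_best(n=5)]

    # 2) Represent each keyphrase by the mean of its word embeddings.
    vectors = api.load("glove-wiki-gigaword-100")   # pretrained GloVe vectors

    def embed(phrase):
        words = [w for w in phrase.lower().split() if w in vectors]
        return np.mean([vectors[w] for w in words], axis=0) if words else None

    phrase_vectors = {p: embed(p) for p in keyphrases}
    print(keyphrases)

Under these assumptions, semantically close keyphrases could then be grouped or positioned via cosine similarity of their vectors before rendering the visual summary.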


Notes

  1. https://nlp.stanford.edu/software/tagger.shtml.

  2. https://plot.ly/api/.
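Note 2 points to the plot.ly API. As a hedged illustration only (the current plotly Python package is assumed, and the chart type and segment counts below are made up, not the paper's actual visualization), a keyphrase-frequency timeline could be rendered like this:

    # Illustrative only: chart type and occurrence counts are assumptions.
    import plotly.graph_objects as go

    # Hypothetical keyphrase occurrences per one-minute video segment.
    segments = list(range(10))
    series = {
        "neural network": [0, 1, 3, 4, 2, 1, 0, 0, 1, 0],
        "pooling layer":  [0, 0, 1, 2, 3, 3, 2, 1, 0, 0],
    }

    fig = go.Figure()
    for phrase, counts in series.items():
        # Stacked area traces give a ThemeRiver-like view of topic prominence.
        fig.add_trace(go.Scatter(x=segments, y=counts, name=phrase,
                                 mode="lines", stackgroup="one"))
    fig.update_layout(xaxis_title="video segment (min)",
                      yaxis_title="keyphrase frequency")
    fig.write_html("keyphrase_timeline.html")   # open in a browser to inspect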



Acknowledgments

Part of this work is financially supported by the Leibniz Association, Germany (Leibniz Competition 2018, funding line “Collaborative Excellence”, project SALIENT [K68/2017]).

Author information

Correspondence to Christian Otto.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhou, H., Otto, C., Ewerth, R. (2019). Visual Summarization of Scholarly Videos Using Word Embeddings and Keyphrase Extraction. In: Doucet, A., Isaac, A., Golub, K., Aalberg, T., Jatowt, A. (eds) Digital Libraries for Open Knowledge. TPDL 2019. Lecture Notes in Computer Science, vol 11799. Springer, Cham. https://doi.org/10.1007/978-3-030-30760-8_28


  • DOI: https://doi.org/10.1007/978-3-030-30760-8_28

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30759-2

  • Online ISBN: 978-3-030-30760-8

  • eBook Packages: Computer Science, Computer Science (R0)
