
Visual Summarization of Scholarly Videos Using Word Embeddings and Keyphrase Extraction

  • Conference paper
  • First Online:
Digital Libraries for Open Knowledge (TPDL 2019)

Abstract

Effective learning with audiovisual content depends on many factors. Besides the quality of the learning resource's content, it is essential to discover the most relevant and suitable video in order to support the learning process most effectively. Video summarization techniques facilitate this goal by providing a quick overview of the content. This is especially useful for longer recordings such as conference presentations or lectures. In this paper, we present a domain-specific approach that generates a visual summary of video content using solely textual information. For this purpose, we exploit video annotations that are automatically generated by speech recognition and video OCR (optical character recognition). The textual information is represented by semantic word embeddings and extracted keyphrases. We demonstrate the feasibility of the proposed approach through its incorporation into the TIB AV-Portal (http://av.tib.eu/), a platform for scientific videos. The accuracy and usefulness of the generated video content visualizations are evaluated in a user study.
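The abstract does not name concrete tools, so the following is only a minimal sketch of the described pipeline, assuming the open-source pke toolkit for unsupervised keyphrase extraction and pretrained GloVe vectors loaded via gensim; the toy transcript string stands in for the speech-recognition and OCR annotations mentioned above.

    # Illustrative only: pke, gensim and the toy transcript are assumptions,
    # not the toolchain reported in the paper.
    import numpy as np
    import pke                       # unsupervised keyphrase extraction toolkit
    import gensim.downloader as api  # loader for pretrained word vectors

    # Toy stand-in for the transcript produced by speech recognition and video OCR.
    transcript = ("In this lecture we introduce convolutional neural networks, "
                  "feature maps and pooling layers for image classification.")

    # 1) Extract keyphrases with an unsupervised graph-based method (TopicRank).
    extractor = pke.unsupervised.TopicRank()
    extractor.load_document(input=transcript, language="en")
    extractor.candidate_selection()
    extractor.candidate_weighting()
    keyphrases = [phrase for phrase, _ in extractor.get_n_best(n=5)]

    # 2) Represent each keyphrase by the mean of its word embeddings.
    vectors = api.load("glove-wiki-gigaword-100")   # pretrained GloVe vectors

    def embed(phrase):
        words = [w for w in phrase.lower().split() if w in vectors]
        return np.mean([vectors[w] for w in words], axis=0) if words else None

    phrase_vectors = {p: embed(p) for p in keyphrases}
    print(keyphrases)

Under these assumptions, semantically close keyphrases could then be grouped or positioned via cosine similarity of their vectors before rendering the visual summary.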


Notes

  1. https://nlp.stanford.edu/software/tagger.shtml.

  2. https://plot.ly/api/.
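Note 2 points to the plot.ly API. As a hedged illustration only (the current plotly Python package is assumed, and the chart type and segment counts below are made up, not the paper's actual visualization), a keyphrase-frequency timeline could be rendered like this:

    # Illustrative only: chart type and occurrence counts are assumptions.
    import plotly.graph_objects as go

    # Hypothetical keyphrase occurrences per one-minute video segment.
    segments = list(range(10))
    series = {
        "neural network": [0, 1, 3, 4, 2, 1, 0, 0, 1, 0],
        "pooling layer":  [0, 0, 1, 2, 3, 3, 2, 1, 0, 0],
    }

    fig = go.Figure()
    for phrase, counts in series.items():
        # Stacked area traces give a ThemeRiver-like view of topic prominence.
        fig.add_trace(go.Scatter(x=segments, y=counts, name=phrase,
                                 mode="lines", stackgroup="one"))
    fig.update_layout(xaxis_title="video segment (min)",
                      yaxis_title="keyphrase frequency")
    fig.write_html("keyphrase_timeline.html")   # open in a browser to inspect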



Acknowledgments

Part of this work is financially supported by the Leibniz Association, Germany (Leibniz Competition 2018, funding line “Collaborative Excellence”, project SALIENT [K68/2017]).

Author information

Correspondence to Christian Otto.


Copyright information

© 2019 Springer Nature Switzerland AG

About this paper


Cite this paper

Zhou, H., Otto, C., Ewerth, R. (2019). Visual Summarization of Scholarly Videos Using Word Embeddings and Keyphrase Extraction. In: Doucet, A., Isaac, A., Golub, K., Aalberg, T., Jatowt, A. (eds) Digital Libraries for Open Knowledge. TPDL 2019. Lecture Notes in Computer Science, vol 11799. Springer, Cham. https://doi.org/10.1007/978-3-030-30760-8_28


  • DOI: https://doi.org/10.1007/978-3-030-30760-8_28

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-30759-2

  • Online ISBN: 978-3-030-30760-8

  • eBook Packages: Computer Science, Computer Science (R0)
