The CAMETRON Lecture Recording System: High Quality Video Recording and Editing with Minimal Human Supervision

Hulens, Dries; Aerts, Bram; Chakravarty, Punarjay; Diba, Ali; Goedemé, Toon; Roussel, Tom; Zegers, Jeroen; Tuytelaars, Tinne; Van Eycken, Luc; Van Gool, Luc; Van Hamme, Hugo; Vennekens, Joost

doi:10.1007/978-3-319-73603-7_42

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10704)

Included in the following conference series:

International Conference on Multimedia Modeling

Abstract

In this paper, we demonstrate a system that automates the recording of video lectures in classrooms. Using dedicated hardware (lecturer-facing and audience-facing cameras and microphone arrays), we record the lecture from multiple points of view. Person detection and tracking, together with recognition of different human actions, are used to digitally zoom in on the lecturer and to alternate focus between the lecturer and the slides or the blackboard. Sound source localization, combined with face detection and tracking, is used to detect questions from the audience, to digitally zoom in on the audience member asking the question, and to improve the quality of the sound recording. Finally, an automatic video editing system switches naturally between the different video streams and composes a compelling end product. We demonstrate the working system in two classrooms, over two 2-hour lectures given by two lecturers.
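
As an illustration of the digital zoom step mentioned in the abstract, the following is a minimal sketch of how a crop window could be derived from a tracked person's bounding box. It is not the authors' implementation, which this page does not describe. The `Box` type, the `zoom_window` function, and the `margin` parameter are hypothetical names introduced here, and the sketch assumes the crop should keep the frame's aspect ratio and stay inside the frame.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned rectangle in pixel coordinates (top-left origin)."""
    x: float
    y: float
    w: float
    h: float


def zoom_window(person: Box, frame_w: int, frame_h: int,
                margin: float = 0.5) -> Box:
    """Return a crop with the frame's aspect ratio that contains `person`.

    `margin` is the total padding added to each dimension, as a fraction of
    the person's box (hypothetical parameter, not taken from the paper).
    """
    aspect = frame_w / frame_h
    # Pad the detection so the subject is not cut off at the crop border.
    w = person.w * (1.0 + margin)
    h = person.h * (1.0 + margin)
    # Grow one side so the crop matches the output aspect ratio.
    if w / h < aspect:
        w = h * aspect
    else:
        h = w / aspect
    # Never ask for more pixels than the frame has.
    scale = min(1.0, frame_w / w, frame_h / h)
    w, h = w * scale, h * scale
    # Centre on the person, then clamp so the crop stays inside the frame.
    cx = person.x + person.w / 2.0
    cy = person.y + person.h / 2.0
    x = max(0.0, min(cx - w / 2.0, frame_w - w))
    y = max(0.0, min(cy - h / 2.0, frame_h - h))
    return Box(x, y, w, h)


if __name__ == "__main__":
    # Hypothetical detection in a 1920x1080 frame.
    print(zoom_window(Box(900, 300, 120, 400), 1920, 1080))
```

In a real pipeline the crop would typically also be smoothed over time so that the virtual camera does not jitter with every detection; that step is omitted here.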

This work is supported by the Cametron Project grant.

Excluding the corresponding author, authors are listed in alphabetical order.

Notes

1. Seminar recordings: https://youtu.be/DalAafs38TU; Matthew recordings: https://youtu.be/p3ZeFfj238g
2. https://youtu.be/4Ruzv9jAZ6E

Author information

Authors and Affiliations

University of Leuven, Kasteelpark Arenberg 10, 3001, Leuven, Belgium
Dries Hulens, Bram Aerts, Punarjay Chakravarty, Ali Diba, Toon Goedemé, Tom Roussel, Jeroen Zegers, Tinne Tuytelaars, Luc Van Eycken, Luc Van Gool, Hugo Van Hamme & Joost Vennekens

Corresponding author

Correspondence to Dries Hulens.

Editor information

Editors and Affiliations

Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
Klaus Schoeffmann
Chulalongkorn University, Bangkok, Thailand
Thanarat H. Chalidabhongse
City University of Hong Kong, Hong Kong, China
Chong Wah Ngo
Chulalongkorn University, Bangkok, Thailand
Supavadee Aramvith
Dublin City University, Dublin, Ireland
Noel E. O’Connor
Gwangju Institute of Science and Technology, Gwangju, Korea (Republic of)
Yo-Sung Ho
Tampere University of Technology, Tampere, Finland
Moncef Gabbouj
Rutgers University, Piscataway, New Jersey, USA
Ahmed Elgammal

About this paper

Cite this paper

Hulens, D., et al. (2018). The CAMETRON Lecture Recording System: High Quality Video Recording and Editing with Minimal Human Supervision. In: Schoeffmann, K., et al. (eds.) MultiMedia Modeling. MMM 2018. Lecture Notes in Computer Science, vol 10704. Springer, Cham. https://doi.org/10.1007/978-3-319-73603-7_42

DOI: https://doi.org/10.1007/978-3-319-73603-7_42
Published: 13 January 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-73602-0
Online ISBN: 978-3-319-73603-7
eBook Packages: Computer Science, Computer Science (R0)
