Abstract
With the rapid development of society, video surveillance has progressively expanded into many areas of life, such as transportation, security inspection, and banking. Large numbers of cameras are being replaced or newly deployed in settings such as safe cities, smart campuses, and smart buildings, which produces a huge amount of video data, slows retrieval when examining videos, and makes it inefficient to grasp the complete picture of a video. In this paper, we propose SurVizor, a visual analysis system for understanding the key content of surveillance videos. We integrate multiple image features and employ time series analysis methods to explore key temporal patterns in those features. We combine multiple visualization views at three levels (video, feature, and frame) to support exploration, analysis, and understanding of video content. We evaluate the proposed system through a case study based on real-world multi-camera surveillance videos and through a user study. The results demonstrate the usability and effectiveness of our system in analyzing and understanding the key content of surveillance videos.
Acknowledgements
This work is partly supported by the National Natural Science Foundation of China (62036009, 61972356) and the Fundamental Research Funds for the Provincial Universities of Zhejiang (RF-A2020001).
About this article
Cite this article
Sun, G., Li, T. & Liang, R. SurVizor: visualizing and understanding the key content of surveillance videos. J Vis 25, 635–651 (2022). https://doi.org/10.1007/s12650-021-00803-w
DOI: https://doi.org/10.1007/s12650-021-00803-w