MPAT: multi-path attention temporal method for video anomaly detection

Multimedia Tools and Applications

Abstract

Video anomaly detection has recently become a focus of computer vision research because anomalous events are rare and uncertain. However, most existing works are limited to learning the appearance and motion information of specific objects and ignore the effect of temporal information. In this paper, a multi-path attention temporal (MPAT) method is proposed to detect whether videos contain anomalies. Specifically, the activity of adjacent units is regulated by a novel intra-layer Recurrent Residual Convolution Unit (RRCU) with a temporal function, and different time steps are set to enhance the model’s ability to integrate contextual information. Furthermore, to compensate for the information loss caused by image compression in the encoding stage, Skip Attention Gates (SAG) are used to focus on specific objects of different shapes and sizes and to aggregate information from multiple feature scales. As an end-to-end learning framework, the proposed model can extract more discriminative spatio-temporal features, and experimental results on three datasets demonstrate the effectiveness and generalization ability of the proposed approach.
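The abstract names the two building blocks but does not give their equations, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes a generic recurrent residual convolution (one kernel reapplied over a chosen number of time steps, plus a residual skip) and a generic additive attention gate on a skip connection. It uses 1D signals and scalar weights for brevity; all function names and parameters (`recurrent_residual_unit`, `attention_gate`, `steps`, `w_skip`, `w_gate`, `psi`, `bias`) are hypothetical.

```python
import math


def conv1d_same(signal, kernel):
    """Zero-padded 1D cross-correlation that keeps the input length (odd kernel)."""
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k - half
            if 0 <= j < len(signal):
                acc += w * signal[j]
        out.append(acc)
    return out


def relu(values):
    return [max(0.0, v) for v in values]


def recurrent_residual_unit(x, kernel, steps):
    """Assumed RRCU-style block (illustrative only).

    The same kernel is reapplied `steps` times, each time to the input plus
    the previous state, so more steps widen the context each position sees.
    A residual skip then adds the input back to the final state.
    """
    state = relu(conv1d_same(x, kernel))
    for _ in range(steps):
        mixed = [a + b for a, b in zip(x, state)]
        state = relu(conv1d_same(mixed, kernel))
    return [a + b for a, b in zip(x, state)]


def attention_gate(skip, gate, w_skip, w_gate, psi, bias):
    """Assumed additive attention gate on a skip connection (illustrative only).

    A coefficient in (0, 1) is computed per position from the encoder (skip)
    feature and a decoder (gating) feature, then used to rescale the skip
    feature before it is passed on.
    """
    alphas = []
    for s, g in zip(skip, gate):
        score = psi * max(0.0, w_skip * s + w_gate * g) + bias
        alphas.append(1.0 / (1.0 + math.exp(-score)))
    gated = [a * s for a, s in zip(alphas, skip)]
    return gated, alphas


if __name__ == "__main__":
    # With an identity kernel the effect of the time steps is easy to follow.
    print(recurrent_residual_unit([0.0, 1.0, 0.0], [0.0, 1.0, 0.0], steps=2))
    gated, alphas = attention_gate([1.0, 1.0], [-1.0, 3.0], 1.0, 1.0, 1.0, 0.0)
    print(gated, alphas)
```

In this sketch, raising `steps` lets each output position depend on a wider neighborhood of the input, which is one way to read the abstract's remark that different time steps help integrate contextual information. The gate passes more of the skip feature where the gating signal agrees with it.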

Data availability

We provide original and editable data appearing in the submitted article, including figures, tables and experimental results.

Code availability

We are pleased to share code that is used in work submitted for publication.

Funding

This work is supported in part by the National Natural Science Foundation of China under Grants 61871241, 61971245 and 61976120, in part by the Nanjing University State Key Lab. for Novel Software Technology under Grant KFKT2019B15, in part by the Nantong Science and Technology Program under Grant JC2021131, and in part by the Postgraduate Research and Practice Innovation Program of Jiangsu Province under Grants KYCX21_3084 and KYCX22_3340.

Author information

Contributions

All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by Hongjun Li, Xiaohu Sun, Chaobo Li, Xulin Shen, Jinyi Chen, Junjie Chen and Zhengguang Xie. The first draft of the manuscript was written by Hongjun Li and Xiaohu Sun, and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.

Corresponding author

Correspondence to Hongjun Li.

Ethics declarations

Conflicts of interest/competing interests

None.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Li, H., Sun, X., Li, C. et al. MPAT: multi-path attention temporal method for video anomaly detection. Multimed Tools Appl 82, 12557–12575 (2023). https://doi.org/10.1007/s11042-022-13834-8

  • DOI: https://doi.org/10.1007/s11042-022-13834-8
