
Dual Attention Mechanisms Based Auto-Encoder for Video Anomaly Detection

  • Conference paper in Artificial Intelligence and Security (ICAIS 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13338)


Abstract

Video anomaly detection refers to the identification of abnormal behaviors that do not conform to normal patterns. Auto-encoder-based reconstruction of video frames is the current mainstream approach to video anomaly detection: frames whose reconstruction error exceeds a threshold are treated as anomalous. However, auto-encoders pay little attention to global information and channel dependence. Attention mechanisms enable a neural network to focus accurately on the input elements that matter and have become an important component of modern architectures. To attend to features along both the channel and spatial dimensions, we propose a dual attention mechanisms based auto-encoder (DAMAE) for video anomaly detection. After each down-sampling step, the feature map is processed by two kinds of attention. The feature map is divided into groups, and each group autonomously enhances its learnt representation while suppressing possible noise. By fusing channel attention and spatial attention, DAMAE captures pixel-level pairwise relationships as well as channel dependence. Compared with a traditional auto-encoder, the channel- and spatial-attended features used during each up-sampling step reconstruct the normal patterns of the video more faithfully. Experimental results show that our method outperforms other state-of-the-art methods, demonstrating its effectiveness.
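
The mechanism described in the abstract, group-wise channel and spatial attention applied to the feature map after each down-sampling step, can be sketched in a few lines. The following is a minimal illustrative sketch, not the authors' implementation: the module name, layer sizes, group count, and fusion by concatenation are all assumptions made for clarity.

```python
import torch
import torch.nn as nn


class DualAttention(nn.Module):
    """Group-wise channel + spatial attention, fused by concatenation (illustrative)."""

    def __init__(self, channels: int, groups: int = 4):
        super().__init__()
        assert channels % (2 * groups) == 0, "channels must split into two branches per group"
        self.groups = groups
        c = channels // (2 * groups)  # channels per branch within each group
        # Channel branch: squeeze spatial dims, learn a per-channel gate (SE-style).
        self.channel_gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(c, c, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial branch: normalise, then learn a per-pixel gate.
        self.spatial_norm = nn.GroupNorm(1, c)
        self.spatial_gate = nn.Sequential(
            nn.Conv2d(c, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        g = self.groups
        x = x.reshape(b * g, c // g, h, w)        # split the feature map into groups
        x_ch, x_sp = x.chunk(2, dim=1)            # two branches per group
        x_ch = x_ch * self.channel_gate(x_ch)     # channel-wise reweighting
        x_sp = x_sp * self.spatial_gate(self.spatial_norm(x_sp))  # pixel-wise reweighting
        out = torch.cat([x_ch, x_sp], dim=1)      # fuse the two attention branches
        return out.reshape(b, c, h, w)


if __name__ == "__main__":
    # Toy usage after one down-sampling convolution in an auto-encoder encoder block.
    feat = torch.randn(2, 64, 32, 32)             # (batch, channels, height, width)
    down = nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1)
    attn = DualAttention(channels=128, groups=4)
    print(attn(down(feat)).shape)                 # torch.Size([2, 128, 16, 16])
```

In an encoder, such a block would follow each down-sampling convolution; in the decoder, the attended features would feed each up-sampling step, matching the flow described in the abstract.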



Funding

This work was supported by the National Science Foundation of China under Grant No. 41971343.

Author information

Corresponding author

Correspondence to Genlin Ji.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Gu, J., Zeng, J., Ji, G. (2022). Dual Attention Mechanisms Based Auto-Encoder for Video Anomaly Detection. In: Sun, X., Zhang, X., Xia, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2022. Lecture Notes in Computer Science, vol 13338. Springer, Cham. https://doi.org/10.1007/978-3-031-06794-5_13

  • DOI: https://doi.org/10.1007/978-3-031-06794-5_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-06793-8

  • Online ISBN: 978-3-031-06794-5

  • eBook Packages: Computer Science, Computer Science (R0)
