Abstract
There has been emerging interest recently in three-dimensional (3D) convolutional neural networks (CNNs) as a powerful tool for encoding spatio-temporal representations in videos, obtained by adding a third, temporal dimension to pre-existing 2D CNNs. In this chapter, we discuss the effectiveness of 3D convolutions for capturing the important motion features in the context of video saliency prediction. The method filters spatio-temporal features across multiple adjacent frames. This cubic convolution can be applied efficiently to a dense sequence of frames, propagating information from previous frames into the current one, mirroring processing mechanisms of the human visual system for better saliency prediction. We extensively evaluate the model against state-of-the-art video saliency models on both 2D and 360° videos. The architecture can efficiently learn expressive spatio-temporal representations and produce high-quality video saliency maps on three large-scale 2D datasets: DHF1K, UCF-SPORTS, and DAVIS. Investigations on Salient360! and other 360° datasets show how the approach can generalise.
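To make the idea of "cubic" convolution concrete, the following is a minimal sketch (not the authors' implementation, and in plain Python rather than a deep learning framework) of a single 3D convolution that mixes information across adjacent frames of a grayscale video stack. The shapes, the "valid" padding choice, and the uniform averaging kernel are illustrative assumptions.

```python
def conv3d_valid(frames, kernel):
    """'Valid' 3D convolution over a (T, H, W) stack of grayscale frames.

    frames: nested lists indexed [t][i][j]; kernel: nested lists [dt][di][dj].
    Each output cell is a weighted sum over a kt x kh x kw spatio-temporal
    window, so temporal context from neighbouring frames enters every output.
    """
    T, H, W = len(frames), len(frames[0]), len(frames[0][0])
    kt, kh, kw = len(kernel), len(kernel[0]), len(kernel[0][0])
    out = []
    for t in range(T - kt + 1):          # slide over time
        plane = []
        for i in range(H - kh + 1):      # slide over rows
            row = []
            for j in range(W - kw + 1):  # slide over columns
                s = 0.0
                for dt in range(kt):
                    for di in range(kh):
                        for dj in range(kw):
                            s += frames[t + dt][i + di][j + dj] * kernel[dt][di][dj]
                row.append(s)
            plane.append(row)
        out.append(plane)
    return out


# Toy example: three 3x3 frames with constant values 0, 1, 2, averaged by a
# uniform 3x3x3 kernel. The single output value blends all three frames.
frames = [[[float(t)] * 3 for _ in range(3)] for t in range(3)]
kernel = [[[1 / 27.0] * 3 for _ in range(3)] for _ in range(3)]
result = conv3d_valid(frames, kernel)
print(result[0][0][0])  # → 1.0, the mean over the whole 3x3x3 window
```

In a real model this operation runs per channel with learned kernels (e.g. via a framework's 3D convolution layer), but the nested loops above show the core mechanism the abstract describes: each output location pools evidence from a small spatio-temporal neighbourhood rather than from a single frame.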
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Dahou Djilali, Y.A., Sayah, M., McGuinness, K., O’Connor, N.E. (2022). On the Use of 3D CNNs for Video Saliency Modeling. In: Bouatouch, K., et al. Computer Vision, Imaging and Computer Graphics Theory and Applications. VISIGRAPP 2020. Communications in Computer and Information Science, vol 1474. Springer, Cham. https://doi.org/10.1007/978-3-030-94893-1_21
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-94892-4
Online ISBN: 978-3-030-94893-1