Abstract
Deep learning methods have achieved great success in the saliency prediction task. The attention mechanism has been shown in many studies to be effective in enhancing the performance of convolutional neural networks (CNNs). In this paper, we propose a new architecture that combines an encoder-decoder structure, multi-level integration, and a relation-aware global attention module. The encoder-decoder structure is the main backbone for extracting deeper features. The multi-level integration constructs an asymmetric path that avoids information loss. The relation-aware global attention module enhances the network both channel-wise and spatial-wise. The architecture is trained and tested on the SALICON 2017 benchmark and obtains competitive results compared with related work.
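To make the spatial branch of relation-aware global attention concrete, the following is a minimal NumPy sketch of the idea from Zhang et al. (CVPR 2020): each spatial position attends based on its pairwise relations with all other positions. The projection matrices and the final scoring vector are fixed random stand-ins for the learned 1x1 convolutions of the real module, and the function name `rga_spatial` is our own; this is an illustrative simplification, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def rga_spatial(x, d_embed=8):
    """Sketch of spatial relation-aware global attention.

    x: feature map of shape (C, H, W); returns the attention-reweighted map.
    Random projections stand in for the module's learned 1x1 convolutions.
    """
    c, h, w = x.shape
    n = h * w
    feats = x.reshape(c, n).T                     # (N, C), one feature per position

    # Pairwise relations r_ij = <theta(x_i), phi(x_j)> over all position pairs.
    theta = feats @ rng.standard_normal((c, d_embed))
    phi = feats @ rng.standard_normal((c, d_embed))
    rel = theta @ phi.T                           # (N, N) relation matrix

    # Each position is described by its outgoing and incoming relation
    # vectors; concatenating them gives a global relation descriptor.
    desc = np.concatenate([rel, rel.T], axis=1)   # (N, 2N)

    # Map the descriptor to one attention score per position (stand-in
    # for the module's bottleneck convolutions), squashed to (0, 1).
    scores = desc @ rng.standard_normal((2 * n,))
    attn = 1.0 / (1.0 + np.exp(-scores))          # sigmoid gate

    return x * attn.reshape(1, h, w)              # broadcast gate over channels

x = rng.standard_normal((4, 6, 6))
y = rga_spatial(x)
print(y.shape)  # (4, 6, 6)
```

Because the gate lies strictly in (0, 1), the module can only suppress or pass features, never amplify them; the channel-wise branch of the paper applies the same relation-descriptor idea across channels instead of positions.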
Acknowledgement
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korea government (MSIT) (No. 2020R1A2C2008972).
Copyright information
© 2021 Springer Nature Switzerland AG
Cite this paper
Cao, G., Jo, KH. (2021). Saliency Prediction with Relation-Aware Global Attention Module. In: Jeong, H., Sumi, K. (eds) Frontiers of Computer Vision. IW-FCV 2021. Communications in Computer and Information Science, vol 1405. Springer, Cham. https://doi.org/10.1007/978-3-030-81638-4_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-81637-7
Online ISBN: 978-3-030-81638-4