Abstract
Object transfiguration, a subtask of image translation, aims to translate objects in an image from one kind to another. Researchers have recently proposed many effective approaches for object transfiguration. However, most of them ignore the difference between target objects and background, which leads to background deformation, discoloration, and other problems. We propose a novel attention-based model for unsupervised object transfiguration called Deep Attention Units Generative Adversarial Network (DAU-GAN). We exploit the spatial consistency of objects and background to enable the model to preserve the background of the image. This attention-based design enables DAU-GAN to enhance the expression of meaningful features and to distinguish specific objects from the background in images. Experimental results demonstrate that our approach improves the performance of object transfiguration and effectively preserves the background.
Z. Ye and F. Lyu contributed equally to this work.
Acknowledgments
This work was supported by the Natural Science Foundation of China (Nos. 61472267, 61728205, 61502329, 61672371), the Primary Research & Development Plan of Jiangsu Province (No. BE2017663), and the Aeronautical Science Foundation (No. 20151996016).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Ye, Z., Lyu, F., Ren, J., Sun, Y., Fu, Q., Hu, F. (2018). DAU-GAN: Unsupervised Object Transfiguration via Deep Attention Unit. In: Ren, J., et al. Advances in Brain Inspired Cognitive Systems. BICS 2018. Lecture Notes in Computer Science, vol 10989. Springer, Cham. https://doi.org/10.1007/978-3-030-00563-4_12
DOI: https://doi.org/10.1007/978-3-030-00563-4_12
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00562-7
Online ISBN: 978-3-030-00563-4
eBook Packages: Computer Science (R0)