Abstract
In this paper, we propose a novel framework for automatically replacing advertisement content in soccer videos using deep learning. We begin by applying U-Net (a convolutional neural network for image segmentation) to segment and detect the content. After reconstructing the segmented content in the video frames (to compensate for the apparent loss in detection), we replace the unwanted content with new content using a homography mapping procedure. The replacement key points are then tracked into subsequent frames, accounting for zoom-in and zoom-out, by multiplying the key point coordinates by the homography matrix between each pair of consecutive frames. Since moving objects in the video can disrupt the alignment between frames and thereby make the homography matrix calculation erroneous, we use the Mask R-CNN algorithm to mask and remove moving objects from the scene. Accordingly, the replacement remains consistent with the motion of the scene. We call this framework REP-Model, which stands for replacing model. We have evaluated REP-Model on a large database of soccer match videos for removing and replacing playground billboard content, and the results reveal the discriminative nature of our proposed framework. Furthermore, in order to key out objects covered by the new content, we use an unsupervised approach in an adversarial learning set-up that learns object masks by playing a game of cut-and-paste, using a discriminator model to determine whether the covered object has been revealed correctly.
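The key point tracking step described above can be illustrated with a minimal sketch. It assumes the frame-to-frame homography matrices are already available (for example, estimated from frames in which moving objects have been masked out); the function names, corner coordinates, and matrices below are hypothetical and are not taken from the paper.

```python
def apply_homography(H, point):
    """Map one (x, y) pixel coordinate through a 3x3 homography H."""
    x, y = point
    # Multiply H by the homogeneous coordinate (x, y, 1).
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    # Divide by the third component to return to pixel coordinates.
    return (u / w, v / w)


def track_keypoints(corners, homographies):
    """Propagate key points through a sequence of frame-to-frame homographies.

    Returns one list of points per frame, starting with the input corners.
    """
    track = [list(corners)]
    for H in homographies:
        track.append([apply_homography(H, p) for p in track[-1]])
    return track


# Hypothetical billboard corners in frame 0 (pixel coordinates).
corners = [(100.0, 50.0), (300.0, 50.0), (300.0, 120.0), (100.0, 120.0)]

# Hypothetical homographies between consecutive frames:
# a 10% zoom-in about the image origin, then a 5-pixel pan to the right.
H_zoom = [[1.1, 0.0, 0.0], [0.0, 1.1, 0.0], [0.0, 0.0, 1.0]]
H_pan = [[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

track = track_keypoints(corners, [H_zoom, H_pan])
for frame_index, points in enumerate(track):
    print(frame_index, points)
```

Because each frame's key points are derived from the previous frame's, any error in a single homography estimate carries forward to all later frames, which is consistent with the abstract's motivation for masking moving objects before estimating the matrices.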
Acknowledgements
The completion of this research was made possible thanks to the Natural Sciences and Engineering Research Council of Canada (NSERC). In addition, the authors would like to thank Edouard Geze, Adam Alcolado and Robert Graham for their assistance during the project.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Ghassab, V.K., Maanicshah, K., Green, P. et al. Content modification of soccer videos using a supervised deep learning framework. Multimed Tools Appl 81, 481–503 (2022). https://doi.org/10.1007/s11042-021-11383-0
DOI: https://doi.org/10.1007/s11042-021-11383-0