Abstract
Most deep-learning-based trackers achieve accurate target localization at the cost of a long training phase. In this paper, we present a new, powerful tracker based on a Siamese network that can be implemented with low computational resources. The proposed tracker locates targets accurately using a fine-tuned model that is easy to train. During tracking, we apply a new sampling method, called action-selection, that is independent of training: it performs selective and flexible sampling step by step with a variable stride, producing bounding boxes with varied aspect ratios. Evaluation on online tracking benchmarks shows that our tracker achieves higher accuracy than most traditional trackers while operating at frame rates beyond real-time.
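The abstract describes action-selection only at a high level, so the following is a minimal illustrative sketch of how such step-wise sampling could work: a candidate box is refined by repeatedly choosing the discrete action (shift, widen, tallen, shrink) that maximizes a similarity score, with the stride decaying so later moves are finer. The action set, stride schedule, and `score_fn` here are assumptions for illustration, not the paper's exact definitions; in the real tracker the score would come from the Siamese network's similarity output.

```python
# Hypothetical sketch of step-wise action-selection sampling.
# The action names, stride decay, and greedy policy are illustrative
# assumptions; only the general idea (variable stride, actions that
# change position and aspect ratio) comes from the abstract.

def apply_action(box, action, stride):
    """Apply one discrete action to a box given as (x, y, w, h)."""
    x, y, w, h = box
    if action == "left":
        x -= stride
    elif action == "right":
        x += stride
    elif action == "up":
        y -= stride
    elif action == "down":
        y += stride
    elif action == "wider":      # changes aspect ratio
        w += stride
    elif action == "taller":     # changes aspect ratio
        h += stride
    elif action == "shrink":
        w, h = max(1, w - stride), max(1, h - stride)
    return (x, y, w, h)

def track_step(box, score_fn, n_iters=10, stride=8, decay=0.5):
    """Greedily refine the box: each iteration picks the action whose
    resulting box scores highest under score_fn (here standing in for
    the Siamese similarity), then shrinks the stride (variable stride)."""
    actions = ["left", "right", "up", "down", "wider", "taller", "shrink", "stop"]
    for _ in range(n_iters):
        best = max(
            actions,
            key=lambda a: score_fn(box) if a == "stop"
            else score_fn(apply_action(box, a, stride)),
        )
        if best == "stop":
            break
        box = apply_action(box, best, stride)
        stride = max(1, int(stride * decay))
    return box
```

With a toy score that prefers boxes near a target, the candidate box drifts toward the target and its aspect ratio adjusts along the way, which is the behavior the abstract attributes to action-selection.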
Funding
This work was supported by the Natural Science Foundation of Jiangsu Province (Grant no. BK20151102), the Natural Science Foundation of China (Grant nos. 61673108 and 61802058), the Ministry of Education Key Laboratory of Machine Perception, Peking University (Grant no. K-2016-03), and the Open Project Program of the Ministry of Education Key Laboratory of Underwater Acoustic Signal Processing, Southeast University (Grant no. UASP1502).
Cite this article
Zhang, Z., Zhang, Y., Cheng, X. et al. Siamese network for real-time tracking with action-selection. J Real-Time Image Proc 17, 1647–1657 (2020). https://doi.org/10.1007/s11554-019-00922-6