FPGA-Based Dynamic Deep Learning Acceleration for Real-Time Video Analytics

Lu, Yufan; Gao, Cong; Saha, Rappy; Saha, Sangeet; McDonald-Maier, Klaus D.; Zhai, Xiaojun

doi:10.1007/978-3-031-21867-5_5

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13642)

Included in the following conference series:

International Conference on Architecture of Computing Systems

Abstract

Deep neural networks (DNNs) are a key technique in modern artificial intelligence, providing state-of-the-art accuracy in many applications, and they have attracted significant interest. The ubiquity of smart devices and autonomous robot systems, along with the rapid development of AI techniques, places heavy demands on DNN inference hardware, which must deliver high energy and computing efficiency. The high energy efficiency, computing capability, and reconfigurability of FPGAs make them a promising platform for hardware acceleration of such computing tasks. This paper addresses this challenge and proposes a new flexible hardware accelerator framework that enables adaptive support for various deep learning (DL) algorithms on an FPGA-based edge computing platform. The framework allows run-time reconfiguration to increase the power and computing efficiency of both the DNN model/software and the hardware, so as to meet the requirements of dedicated application specifications and operating environments. The results show that the proposed framework can reduce energy consumption and processing time by up to 53.8% and 36.5%, respectively, by switching to a smaller model. In addition, time and energy consumption are examined further with a benchmark test set, which shows how the input data in each frame and the size of the model affect the performance of the system.
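
The idea of switching to a smaller model at run time can be illustrated with a minimal Python sketch. It is not the authors' framework: the `ModelProfile` type, the `select_model` policy (pick the most accurate model that fits a per-frame latency and energy budget), and all per-frame figures are hypothetical placeholders. The placeholder latency and energy values were chosen only so that the relative savings equal the upper bounds quoted in the abstract (53.8% energy, 36.5% time).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ModelProfile:
    """Hypothetical per-frame profile of one deployable DNN variant."""
    name: str
    latency_ms: float  # placeholder processing time per frame
    energy_mj: float   # placeholder energy per frame
    accuracy: float    # placeholder accuracy score in [0, 1]


def select_model(profiles: Sequence[ModelProfile],
                 latency_budget_ms: float,
                 energy_budget_mj: float) -> ModelProfile:
    """Return the most accurate profile that meets both budgets.

    If no profile fits, fall back to the lowest-energy profile.
    This policy is an assumption made for illustration only.
    """
    feasible = [p for p in profiles
                if p.latency_ms <= latency_budget_ms
                and p.energy_mj <= energy_budget_mj]
    if feasible:
        return max(feasible, key=lambda p: p.accuracy)
    return min(profiles, key=lambda p: p.energy_mj)


def relative_saving(baseline: float, reduced: float) -> float:
    """Percentage reduction of `reduced` relative to `baseline`."""
    return 100.0 * (baseline - reduced) / baseline


if __name__ == "__main__":
    # Placeholder numbers, not measurements from the paper.
    large = ModelProfile("large", latency_ms=100.0, energy_mj=1000.0, accuracy=0.90)
    small = ModelProfile("small", latency_ms=63.5, energy_mj=462.0, accuracy=0.80)

    # A relaxed budget keeps the large model; a tight one triggers the switch.
    print(select_model([large, small], 120.0, 1200.0).name)
    print(select_model([large, small], 70.0, 500.0).name)
    print(relative_saving(large.energy_mj, small.energy_mj))
    print(relative_saving(large.latency_ms, small.latency_ms))
```

In this sketch the budgets stand in for the "application specifications and operating environments" mentioned in the abstract; how the actual framework decides when to reconfigure is not described on this page.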

Acknowledgment

This work is supported by the UK Engineering and Physical Sciences Research Council through grants EP/R02572X/1, EP/P017487/1, EP/V034111/1, EP/X015955/1 and EP/V000462/1.

Author information

Authors and Affiliations

University of Essex, Colchester, UK
Yufan Lu, Cong Gao, Rappy Saha, Sangeet Saha, Klaus D. McDonald-Maier & Xiaojun Zhai

Corresponding author

Correspondence to Cong Gao.

Editor information

Editors and Affiliations

Technical University of Munich, Garching, Germany
Martin Schulz
Technical University of Munich, Heilbronn, Germany
Carsten Trinitis
Chalmers University of Technology, Gothenburg, Sweden
Nikela Papadopoulou
Otto-von-Guericke-Universität Magdeburg, Magdeburg, Germany
Thilo Pionteck

About this paper

Cite this paper

Lu, Y., Gao, C., Saha, R., Saha, S., McDonald-Maier, K.D., Zhai, X. (2022). FPGA-Based Dynamic Deep Learning Acceleration for Real-Time Video Analytics. In: Schulz, M., Trinitis, C., Papadopoulou, N., Pionteck, T. (eds) Architecture of Computing Systems. ARCS 2022. Lecture Notes in Computer Science, vol 13642. Springer, Cham. https://doi.org/10.1007/978-3-031-21867-5_5

DOI: https://doi.org/10.1007/978-3-031-21867-5_5
Published: 14 December 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-21866-8
Online ISBN: 978-3-031-21867-5
eBook Packages: Computer Science (R0)