Comparison of Different Deployment Approaches of FPGA-Based Hardware Accelerator for 3D Object Detection Models

Pereira, Pedro; Linhares Silva, António; Machado, Rui; Silva, João; Durães, Dalila; Machado, José; Novais, Paulo; Monteiro, João; Melo-Pinto, Pedro; Fernandes, Duarte

doi:10.1007/978-3-031-16474-3_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 13566)

Included in the following conference series:

EPIA Conference on Artificial Intelligence

Abstract

GPU servers have been responsible for the recent improvements in the accuracy and inference speed of object detection models targeted at autonomous driving. However, their power consumption and physical size make their integration in autonomous vehicles impractical. Hybrid FPGA-CPU boards have emerged as an alternative to server GPUs in the role of edge devices in autonomous vehicles. Despite their energy efficiency, such devices do not offer the same computational power as GPU servers and have fewer resources available. This paper investigates how to deploy deep learning models for object detection in point clouds on edge devices for onboard real-time inference. Different approaches, requiring different levels of expertise in logic programming applied to FPGAs, are explored, resulting in three main solutions: (1) use of software tools for model adaptation and compilation for a proprietary hardware IP; (2) design and implementation of a hardware IP optimized for computing traditional convolution operations; and (3) design and implementation of a hardware IP optimized for sparse convolution operations. The performance of these solutions is compared on the KITTI dataset against the performance obtained on a computer. All the solutions resort to parallelism, quantization and optimized memory access control to reduce the usage of FPGA logic resources and improve processing time without significantly sacrificing accuracy. The solutions proved to be effective for real-time inference in power-limited and space-constrained settings.
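The abstract contrasts traditional and sparse convolutions and names quantization as a shared optimization. The following is a minimal software sketch of those two ideas, not the hardware IPs described in the paper: it assumes a small 2D occupancy grid standing in for a voxelized point cloud, and all function names (dense_conv2d, sparse_conv2d, quantize_int8) are hypothetical. It illustrates why skipping empty cells reduces the number of multiplications while producing the same output.

```python
def dense_conv2d(grid, kernel):
    """Zero-padded 'same' convolution over every cell; returns (output, multiply count)."""
    h, w, k = len(grid), len(grid[0]), len(kernel) // 2
    out = [[0] * w for _ in range(h)]
    macs = 0
    for y in range(h):
        for x in range(w):
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w:
                        out[y][x] += grid[yy][xx] * kernel[dy + k][dx + k]
                        macs += 1
    return out, macs


def sparse_conv2d(grid, kernel):
    """Same result, but only non-empty input cells scatter contributions to their neighbours."""
    h, w, k = len(grid), len(grid[0]), len(kernel) // 2
    out = [[0] * w for _ in range(h)]
    macs = 0
    for yy in range(h):
        for xx in range(w):
            value = grid[yy][xx]
            if value == 0:
                continue  # empty cell: no work at all
            for dy in range(-k, k + 1):
                for dx in range(-k, k + 1):
                    y, x = yy - dy, xx - dx
                    if 0 <= y < h and 0 <= x < w:
                        out[y][x] += value * kernel[dy + k][dx + k]
                        macs += 1
    return out, macs


def quantize_int8(values):
    """Symmetric per-tensor quantization to the int8 range [-127, 127] (an assumed scheme)."""
    scale = max(abs(v) for v in values) / 127 or 1.0
    return [max(-127, min(127, round(v / scale))) for v in values], scale


if __name__ == "__main__":
    # Mostly empty 8x8 grid with three occupied cells (illustrative data only).
    grid = [[0] * 8 for _ in range(8)]
    grid[1][2], grid[4][4], grid[6][1] = 3, 5, 2
    kernel = [[1, 0, -1], [2, 0, -2], [1, 0, -1]]
    dense_out, dense_macs = dense_conv2d(grid, kernel)
    sparse_out, sparse_macs = sparse_conv2d(grid, kernel)
    print("outputs match:", dense_out == sparse_out)
    print("multiplications (dense, sparse):", dense_macs, sparse_macs)
    print("quantized weights:", quantize_int8([0.5, -1.0, 0.25]))
```

On a grid where most cells are empty, as is typical for voxelized LiDAR data, the scatter-style loop performs work proportional to the number of occupied cells rather than to the grid size.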

Acknowledgements

This work has been supported by FCT—Fundação para a Ciência e Tecnologia within the R&D Units Project Scope: UIDB/00319/2020 and the project “Integrated and Innovative Solutions for the well-being of people in complex urban centers” within the Project Scope NORTE-01-0145-FEDER-000086.

Author information

Authors and Affiliations

Algoritmi Centre, University of Minho, Guimarães, Portugal
Pedro Pereira, António Linhares Silva, Rui Machado, João Silva, Dalila Durães, José Machado, Paulo Novais, João Monteiro, Pedro Melo-Pinto & Duarte Fernandes
Associação Laboratório Colaborativo em Transformação Digital—DTx Colab, Guimarães, Portugal
Rui Machado & Duarte Fernandes
Department of Engineering, University of Trás-os-Montes and Alto Douro, Vila Real, Portugal
Pedro Melo-Pinto

Corresponding author

Correspondence to António Linhares Silva.

Editor information

Editors and Affiliations

ISEP/GECAD, Polytechnic Institute of Porto, Porto, Portugal
Goreti Marreiros
IST/INESC-ID, University of Lisbon, Lisbon, Portugal
Bruno Martins
IST/INESC-ID, University of Lisbon, Porto Salvo, Portugal
Ana Paiva
CISUC, University of Coimbra, Coimbra, Portugal
Bernardete Ribeiro
IST/INESC-ID, University of Lisbon, Porto Salvo, Portugal
Alberto Sardinha

About this paper

Cite this paper

Pereira, P. et al. (2022). Comparison of Different Deployment Approaches of FPGA-Based Hardware Accelerator for 3D Object Detection Models. In: Marreiros, G., Martins, B., Paiva, A., Ribeiro, B., Sardinha, A. (eds) Progress in Artificial Intelligence. EPIA 2022. Lecture Notes in Computer Science, vol 13566. Springer, Cham. https://doi.org/10.1007/978-3-031-16474-3_24

DOI: https://doi.org/10.1007/978-3-031-16474-3_24
Published: 13 September 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-16473-6
Online ISBN: 978-3-031-16474-3
eBook Packages: Computer Science, Computer Science (R0)