Abstract
On-edge learning enables edge devices to continually adapt to the new data encountered by AI applications. However, training based on batch backward propagation demands far more computing capacity and memory than inference, while the power budget of an edge system is tightly constrained. Because zero values are inevitably propagated along with useful operands during training, memory accesses and computations are wasted on them. This paper presents a thorough analysis of the origin of sparsity in all three phases of training from the perspective of sparse propagation, and distills three insights about absolute and nonabsolute sparsity for efficient deployment of the training process. Building on these insights, we propose SPACE, an efficient training accelerator that exploits both nonabsolute and absolute sparsity to reduce the memory footprint and eliminate a large fraction of the computation. Compared with a dense training architecture, SPACE improves performance and energy efficiency by factors of 3.2x and 2.8x, respectively.
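The abstract refers to skipping memory accesses and computations for zero-valued operands that are propagated during training. The sketch below is a minimal illustration of that general idea for a single multiply-accumulate reduction; it is not the SPACE dataflow. Exact zeros are always skipped ("absolute" sparsity), and the small threshold eps that additionally skips near-zero operands is only one assumed reading of "nonabsolute" sparsity, introduced here for illustration.

```python
import numpy as np

def sparse_mac(activations: np.ndarray, weights: np.ndarray, eps: float = 0.0) -> float:
    """Multiply-accumulate that skips operands treated as zero.

    eps = 0.0 skips only exact zeros (absolute sparsity); a small positive
    eps additionally skips near-zero operands (an assumed, illustrative
    interpretation of nonabsolute sparsity).
    """
    acc = 0.0
    for a, w in zip(activations.ravel(), weights.ravel()):
        if abs(a) <= eps or abs(w) <= eps:
            continue  # skipped operand pair: no multiply, no accumulate
        acc += a * w
    return acc

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = np.maximum(rng.standard_normal(64), 0.0)  # ReLU-like output: roughly half zeros
    wts = rng.standard_normal(64)
    print(sparse_mac(acts, wts))             # skip exact zeros only
    print(sparse_mac(acts, wts, eps=1e-2))   # also skip near-zero operands
```

In hardware, the same skip decision would be made per operand pair by the control logic rather than by a branch, but the arithmetic saved is the same: every skipped pair avoids one multiplication, one addition, and the associated operand fetches.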
Supported by the Laboratory Open Fund of Beijing Smart-chip Microelectronics Technology Co., Ltd., under grant number SGTYHT/20-JS-221.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, M. et al. (2022). SPACE: Sparsity Propagation Based DCNN Training Accelerator on Edge. In: Lai, Y., Wang, T., Jiang, M., Xu, G., Liang, W., Castiglione, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2021. Lecture Notes in Computer Science(), vol 13156. Springer, Cham. https://doi.org/10.1007/978-3-030-95388-1_26
DOI: https://doi.org/10.1007/978-3-030-95388-1_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95387-4
Online ISBN: 978-3-030-95388-1