Abstract
On-edge learning enables edge devices to continually adapt to the new data encountered by AI applications. However, training based on batch backward propagation demands far more computing capacity and memory than inference, while the power budget of an edge system is tightly constrained. Because zero values are inevitably propagated along with useful operands during training, memory accesses and computations are wasted on them. This paper presents a thorough analysis of the origin of sparsity in all three phases of training from the perspective of sparse propagation, and distills three insights about absolute and nonabsolute sparsity for efficient deployment of the training process. Building on these insights, we propose SPACE, an efficient training accelerator that exploits both nonabsolute and absolute sparsity to reduce the memory footprint and eliminate a large fraction of the computation. Compared with a dense training architecture, SPACE improves performance and energy efficiency by factors of 3.2x and 2.8x, respectively.
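The abstract refers to skipping memory accesses and computations for zero-valued operands that are propagated during training. The sketch below is a minimal illustration of that general idea for a single multiply-accumulate reduction; it is not the SPACE dataflow. Exact zeros are always skipped ("absolute" sparsity), and the small threshold eps that additionally skips near-zero operands is only one assumed reading of "nonabsolute" sparsity, introduced here for illustration.

```python
import numpy as np

def sparse_mac(activations: np.ndarray, weights: np.ndarray, eps: float = 0.0) -> float:
    """Multiply-accumulate that skips operands treated as zero.

    eps = 0.0 skips only exact zeros (absolute sparsity); a small positive
    eps additionally skips near-zero operands (an assumed, illustrative
    interpretation of nonabsolute sparsity).
    """
    acc = 0.0
    for a, w in zip(activations.ravel(), weights.ravel()):
        if abs(a) <= eps or abs(w) <= eps:
            continue  # skipped operand pair: no multiply, no accumulate
        acc += a * w
    return acc

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    acts = np.maximum(rng.standard_normal(64), 0.0)  # ReLU-like output: roughly half zeros
    wts = rng.standard_normal(64)
    print(sparse_mac(acts, wts))             # skip exact zeros only
    print(sparse_mac(acts, wts, eps=1e-2))   # also skip near-zero operands
```

In hardware, the same skip decision would be made per operand pair by the control logic rather than by a branch, but the arithmetic saved is the same: every skipped pair avoids one multiplication, one addition, and the associated operand fetches.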
Supported by the Laboratory Open Fund of Beijing Smart-chip Microelectronics Technology Co., Ltd., under grant number SGTYHT/20-JS-221.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Wang, M. et al. (2022). SPACE: Sparsity Propagation Based DCNN Training Accelerator on Edge. In: Lai, Y., Wang, T., Jiang, M., Xu, G., Liang, W., Castiglione, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2021. Lecture Notes in Computer Science(), vol 13156. Springer, Cham. https://doi.org/10.1007/978-3-030-95388-1_26
DOI: https://doi.org/10.1007/978-3-030-95388-1_26
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95387-4
Online ISBN: 978-3-030-95388-1