Qtorch+: Next Generation Arithmetic for Pytorch Machine Learning

Ho, Nhut-Minh; De Silva, Himeshi; Gustafson, John L.; Wong, Weng-Fai

doi:10.1007/978-3-031-09779-9_3

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 13253))

Included in the following conference series:

Conference on Next Generation Arithmetic

Abstract

This paper presents Qtorch+, a tool which enables next generation number formats on Pytorch, a widely popular high-level Deep Learning framework. With hand-crafted GPU accelerated kernels for processing novel number formats, Qtorch+ allows developers and researchers to freely experiment with their choice of cutting-edge number formats for Deep Neural Network (DNN) training and inference. Qtorch+ works seamlessly with Pytorch, one of the most versatile DNN frameworks, with little added effort. At the current stage of development, we not only support the novel posit number format, but also any other arbitrary set of points in the real number domain. Training and inference results show that a vanilla 8-bit format would suffice for training, while a format with 6 bits or less would suffice to run accurate inference for various networks ranging from image classification to natural language processing and generative adversarial networks. Furthermore, the support for arbitrary number sets can contribute towards designing more efficient number formats for inference in the near future. Qtorch+ and tutorials are available on GitHub (https://github.com/minhhn2910/QPyTorch).
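As a rough illustration of what "any arbitrary set of points in the real number domain" can mean in practice, the sketch below enumerates every value of a small posit format and rounds an input to the nearest entry in that table. It is plain Python, not the Qtorch+ API: the function names `decode_posit`, `posit_values`, and `quantize` are hypothetical, and the rounding rule (nearest value, ties toward the smaller value, saturating at the ends of the table) is a simplifying assumption rather than the rounding defined by the posit standard or used by the GPU kernels.

```python
from bisect import bisect_left


def decode_posit(bits: int, n: int, es: int) -> float:
    """Decode an n-bit posit with es exponent bits (hypothetical helper)."""
    mask = (1 << n) - 1
    bits &= mask
    if bits == 0:
        return 0.0
    if bits == 1 << (n - 1):
        return float("nan")  # NaR (Not a Real)
    sign = -1.0 if bits >> (n - 1) else 1.0
    if sign < 0:
        bits = (-bits) & mask  # negative posits are stored in two's complement
    body = bits & ((1 << (n - 1)) - 1)  # the n-1 bits after the sign bit

    # Regime: a run of identical bits starting at the top of the body.
    pos = n - 2
    first = (body >> pos) & 1
    run = 0
    while pos >= 0 and ((body >> pos) & 1) == first:
        run += 1
        pos -= 1
    k = run - 1 if first else -run
    pos -= 1  # skip the bit that terminates the regime, if there is one

    # Exponent: up to es bits; bits cut off by the word length count as 0.
    exp = 0
    for _ in range(es):
        exp <<= 1
        if pos >= 0:
            exp |= (body >> pos) & 1
            pos -= 1

    # Fraction: whatever bits remain, with an implicit leading 1.
    frac_bits = max(pos + 1, 0)
    frac = body & ((1 << frac_bits) - 1)
    mantissa = 1.0 + (frac / (1 << frac_bits) if frac_bits else 0.0)
    return sign * mantissa * 2.0 ** (k * (1 << es) + exp)


def posit_values(n: int, es: int) -> list[float]:
    """Sorted table of every real value representable in P(n, es)."""
    nar = 1 << (n - 1)
    return sorted(decode_posit(b, n, es) for b in range(1 << n) if b != nar)


def quantize(x: float, table: list[float]) -> float:
    """Round x to the nearest entry of a sorted table (assumed rounding rule)."""
    i = bisect_left(table, x)
    if i == 0:
        return table[0]
    if i == len(table):
        return table[-1]
    lo, hi = table[i - 1], table[i]
    return lo if x - lo <= hi - x else hi


p8 = posit_values(8, 1)                               # all 255 real values of P(8,1)
print(quantize(1.6, p8))                              # nearest P(8,1) value to 1.6
print(quantize(0.3, [-1.0, 0.0, 0.25, 1.0]))          # any sorted set of points works
```

Because `quantize` only needs a sorted list, the same function serves a posit table and a hand-picked set of points alike, which is the property the abstract points to when it mentions designing more efficient inference formats.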

Notes

1. https://pytorch.org/vision/stable/models.html
2. https://huggingface.co
3. https://github.com/minhhn2910/conga2022
4. Same link as footnote 5.
5. For Transformer, we had to use P(16,2) for the backward error propagation instead of P(8,2) to achieve convergence.

Author information

Authors and Affiliations

National University of Singapore, Singapore, Singapore
Nhut-Minh Ho, John L. Gustafson & Weng-Fai Wong
Agency for Science, Technology and Research (A*STAR), Singapore, Singapore
Himeshi De Silva

Corresponding author

Correspondence to Nhut-Minh Ho.

Editor information

Editors and Affiliations

School of Computing, National University of Singapore, Singapore, Singapore
John Gustafson
University of Calgary, Calgary, AB, Canada
Vassil Dimitrov

About this paper

Cite this paper

Ho, NM., De Silva, H., Gustafson, J.L., Wong, WF. (2022). Qtorch+: Next Generation Arithmetic for Pytorch Machine Learning. In: Gustafson, J., Dimitrov, V. (eds) Next Generation Arithmetic. CoNGA 2022. Lecture Notes in Computer Science, vol 13253. Springer, Cham. https://doi.org/10.1007/978-3-031-09779-9_3

DOI: https://doi.org/10.1007/978-3-031-09779-9_3
Published: 14 July 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-09778-2
Online ISBN: 978-3-031-09779-9
eBook Packages: Computer Science, Computer Science (R0)
