Abstract
Neural network quantization is an effective way to compress deep models and reduce their execution latency and energy consumption, enabling deployment on mobile or embedded devices. Existing quantization methods require the original data for calibration or fine-tuning to achieve good performance. However, in many real-world scenarios, the data may be unavailable due to confidentiality or privacy concerns, rendering these methods inapplicable. Moreover, without access to the original data, generative adversarial networks (GANs) cannot be trained to synthesize it. Although the full-precision model may encode rich information about the data, this information alone is difficult to exploit for recovering the original data or generating new, meaningful samples. In this paper, we investigate a simple yet effective method called Generative Low-bitwidth Data Free Quantization (GDFQ) to eliminate the dependence on original data. Specifically, we propose a knowledge matching generator that produces meaningful fake data by exploiting the classification-boundary knowledge and distribution information contained in the pre-trained model. With the generated data, we can quantize a model by distilling knowledge from the pre-trained model. Extensive experiments on three datasets demonstrate the effectiveness of our method. More critically, our method achieves much higher accuracy under 4-bit quantization than the existing data-free quantization method. Code is available at https://github.com/xushoukai/GDFQ.
S. Xu, H. Li, and B. Zhuang contributed equally.
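For concreteness, the PyTorch-style sketch below illustrates the two ingredients the abstract describes: a generator loss that matches classification-boundary knowledge (cross-entropy on sampled labels) and distribution knowledge (batch-normalization statistics stored in the pre-trained model), followed by a distillation loss for fine-tuning the low-bit model on the generated data. This is a minimal illustration under stated assumptions, not the authors' released implementation: the conditional `generator(z, y)`, the weights `beta` and `alpha`, and the temperature `T` are hypothetical placeholders, and both losses are written in their common textbook forms.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def collect_bn_stats(model):
    """Attach forward hooks that record, for every BatchNorm2d layer, the
    batch statistics of the data it sees alongside the running statistics
    the layer accumulated on the real training data."""
    stats, hooks = [], []
    def hook(module, inputs, output):
        x = inputs[0]  # (N, C, H, W)
        stats.append((x.mean(dim=(0, 2, 3)),
                      x.var(dim=(0, 2, 3), unbiased=False),
                      module.running_mean, module.running_var))
    for m in model.modules():
        if isinstance(m, nn.BatchNorm2d):
            hooks.append(m.register_forward_hook(hook))
    return stats, hooks

def generator_loss(model, generator, batch_size, z_dim, num_classes, beta=0.1):
    """Knowledge-matching generator objective: cross-entropy on sampled
    labels (boundary knowledge) plus BN-statistics matching (distribution
    knowledge), both taken from the frozen pre-trained model."""
    z = torch.randn(batch_size, z_dim)
    y = torch.randint(0, num_classes, (batch_size,))
    stats, hooks = collect_bn_stats(model)
    logits = model(generator(z, y))  # fake images conditioned on labels
    for h in hooks:
        h.remove()
    ce = F.cross_entropy(logits, y)
    bns = sum(F.mse_loss(mu, rm.detach()) + F.mse_loss(var, rv.detach())
              for mu, var, rm, rv in stats)
    return ce + beta * bns

def quantized_student_loss(q_logits, fp_logits, y, T=4.0, alpha=0.5):
    """Fine-tune the low-bit model on generated data: cross-entropy on the
    sampled labels plus KL distillation from the full-precision teacher."""
    kd = F.kl_div(F.log_softmax(q_logits / T, dim=1),
                  F.softmax(fp_logits / T, dim=1),
                  reduction='batchmean') * (T * T)
    return alpha * F.cross_entropy(q_logits, y) + (1 - alpha) * kd
```

In the paper's pipeline the generator and the low-bit network are trained together while the full-precision model stays frozen throughout; see the released code at the repository above for the exact quantizer and training schedule.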
Acknowledgements
This work was partially supported by the Key-Area Research and Development Program of Guangdong Province (2018B010107001), the Program for Guangdong Introducing Innovative and Entrepreneurial Teams (2017ZT07X183), and the Fundamental Research Funds for the Central Universities (D2191240).
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Xu, S., et al. (2020). Generative Low-Bitwidth Data Free Quantization. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.M. (eds.) Computer Vision – ECCV 2020. Lecture Notes in Computer Science, vol. 12357. Springer, Cham. https://doi.org/10.1007/978-3-030-58610-2_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-58609-6
Online ISBN: 978-3-030-58610-2