
Bandwidth-Aware Adaptive Codec for DNN Inference Offloading in IoT

  • Conference paper
Computer Vision – ECCV 2022 (ECCV 2022)

Abstract

The lightweight nature of IoT devices makes it challenging to run deep neural networks (DNNs) locally for applications like augmented reality. Recent advances in IoT communication such as LTE-M have significantly boosted link bandwidth, enabling IoT devices to stream visual data to edge servers that run DNNs for inference. However, uncompressed visual data can still easily overload the IoT link, and the wireless spectrum is shared by numerous IoT devices, causing unstable link bandwidth. Mainstream codecs can reduce the traffic, but at the cost of severe inference accuracy drops. Recent works on differentiable JPEG train the codec to mitigate this accuracy loss, but they rely on heuristic loss-function configurations to balance the rate-accuracy tradeoff and provide no guarantee of meeting the IoT bandwidth constraint. This paper presents AutoJPEG, a bandwidth-aware adaptive compression solution that learns the JPEG encoding parameters to optimize DNN inference accuracy under bandwidth constraints. By analyzing the JPEG codec workflow, we model the compressed image size as a closed-form function of the encoding parameters. We then formulate a constrained optimization framework that minimizes the original DNN loss while ensuring the image size strictly meets the bandwidth constraint. Our evaluation validates AutoJPEG on various DNN models and datasets: it outperforms mainstream codecs (such as JPEG and WebP) as well as state-of-the-art solutions that optimize the image codec for DNN inference.
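The abstract describes a constrained formulation: minimize the task loss subject to a closed-form size bound. The paper's actual objective and solver are not reproduced on this page, so the following is only a toy sketch of that optimization shape, using projected gradient descent on a hypothetical quadratic loss with a linear "size" budget; every name here (`loss`, `size`, `project`, `solve`, `B`) is illustrative, not AutoJPEG's API.

```python
# Toy sketch (NOT AutoJPEG's actual method): minimize a quadratic surrogate
# loss over encoding parameters q subject to a linear "size" budget
# size(q) <= B, using projected gradient descent.

def size(q):
    # hypothetical closed-form size model: linear in the parameters
    return sum(q)

def loss(q, t):
    # hypothetical task loss: squared distance to an unconstrained optimum t
    return sum((qi - ti) ** 2 for qi, ti in zip(q, t))

def grad(q, t):
    # gradient of the quadratic loss above
    return [2.0 * (qi - ti) for qi, ti in zip(q, t)]

def project(q, B):
    # keep parameters nonnegative, then rescale uniformly if over budget
    q = [max(qi, 0.0) for qi in q]
    s = size(q)
    return [qi * B / s for qi in q] if s > B else q

def solve(t, B, lr=0.1, steps=200):
    q = [0.0] * len(t)
    for _ in range(steps):
        q = [qi - lr * gi for qi, gi in zip(q, grad(q, t))]
        q = project(q, B)  # enforce the budget at every step
    return q

# unconstrained optimum [3, 1] has size 4; the budget B = 2 forces a rescale
q = solve([3.0, 1.0], B=2.0)
```

In this toy setup the iterate settles near [1.5, 0.5]: the size budget is met exactly while staying close (in this projection's geometry) to the unconstrained optimum, which is the flavor of guarantee the abstract claims, in contrast to penalty-weighted losses that may overshoot the budget.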


Notes

  1. Although the max function is also non-differentiable, gradients can still pass through it during backward propagation (just like ReLU).
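For intuition, a minimal numeric check in plain Python (illustrative only): max(x, 0) has a well-defined gradient everywhere except the single kink at 0, and that gradient matches a finite-difference estimate away from the kink, which is why backpropagation can route gradients through max exactly as it does through ReLU.

```python
def relu(x):
    # max(x, 0): non-differentiable only at the single point x == 0
    return max(x, 0.0)

def relu_grad(x):
    # subgradient used in backprop: 1 where max selects x, else 0
    return 1.0 if x > 0.0 else 0.0

# finite-difference check away from the kink
eps = 1e-6
for x in (2.0, -1.5):
    fd = (relu(x + eps) - relu(x - eps)) / (2 * eps)
    assert abs(fd - relu_grad(x)) < 1e-6
```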


Author information

Correspondence to Xiufeng Xie.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Xie, X., Zhou, N., Zhu, W., Liu, J. (2022). Bandwidth-Aware Adaptive Codec for DNN Inference Offloading in IoT. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds) Computer Vision – ECCV 2022. ECCV 2022. Lecture Notes in Computer Science, vol 13698. Springer, Cham. https://doi.org/10.1007/978-3-031-19839-7_6


  • DOI: https://doi.org/10.1007/978-3-031-19839-7_6


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-19838-0

  • Online ISBN: 978-3-031-19839-7

  • eBook Packages: Computer Science, Computer Science (R0)
