
Efficient Search of Multiple Neural Architectures with Different Complexities via Importance Sampling

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2022 (ICANN 2022)

Abstract

Neural architecture search (NAS) aims to automate architecture design and improve the performance of deep neural networks. Platform-aware NAS methods consider both performance and complexity and can find well-performing architectures under limited computational resources. Whereas ordinary NAS methods incur tremendous computational costs owing to repeated model training, one-shot NAS, which trains the weights of a supernetwork containing all candidate architectures only once during the search, has been reported to achieve a much lower search cost. This study focuses on architecture complexity-aware one-shot NAS, which optimizes an objective function composed of a weighted sum of two metrics, such as predictive performance and the number of parameters. In existing methods, the architecture search must be run multiple times with different coefficients of the weighted sum to obtain multiple architectures with different complexities. This study aims to reduce the search cost of finding multiple such architectures. The proposed method uses multiple distributions to generate architectures with different complexities and updates each distribution using samples drawn from all of the distributions, reweighted by importance sampling. This allows multiple architectures with different complexities to be obtained in a single architecture search, thereby reducing the search cost. The proposed method is applied to the architecture search of convolutional neural networks on the CIFAR-10 and ImageNet datasets. Compared with baseline methods, it finds multiple architectures with varying complexities while requiring less computational effort.
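As a rough sketch of the idea in the abstract, the toy code below maintains K independent categorical distributions over per-layer operations (one per complexity coefficient), pools samples from all of them, and updates each distribution with importance weights p_k(a)/q(a), where q is the equal-weight mixture of the K distributions. Every concrete detail here (the categorical parameterization, the synthetic objective, the quartile utility, all names) is an assumption for illustration, not the authors' implementation:

```python
import numpy as np

# Toy sketch: K categorical architecture distributions, one per complexity
# coefficient, updated with samples pooled from all K distributions and
# reweighted by importance sampling.
N_LAYERS, N_OPS, K = 4, 3, 3      # layers, candidate ops per layer, distributions
LAMBDAS = [0.0, 0.1, 0.2]         # complexity coefficients, one per distribution
rng = np.random.default_rng(0)

# theta[k][l, o]: probability that distribution k picks operation o in layer l
theta = [np.full((N_LAYERS, N_OPS), 1.0 / N_OPS) for _ in range(K)]

def prob(th, arch):
    """Probability of an architecture under independent per-layer categoricals."""
    return float(np.prod(th[np.arange(N_LAYERS), arch]))

def objective(arch, lam):
    """Synthetic stand-in for validation loss + lam * parameter count."""
    loss = float(arch.sum()) / (N_LAYERS * (N_OPS - 1))   # fake "loss" in [0, 1]
    params = float((arch + 1).sum())                      # fake "model size"
    return loss + lam * params

def search_step(eta=0.1, n_per_dist=4):
    # 1) Sample architectures from every distribution and pool them.
    samples = [
        np.array([rng.choice(N_OPS, p=theta[k][l]) for l in range(N_LAYERS)])
        for k in range(K)
        for _ in range(n_per_dist)
    ]
    # 2) Pooled samples follow the mixture q(a) = (1/K) * sum_k p_k(a).
    q = [np.mean([prob(theta[k], a) for k in range(K)]) for a in samples]
    # 3) Update every distribution using ALL samples via importance weights.
    for k in range(K):
        scores = [objective(a, LAMBDAS[k]) for a in samples]
        order = np.argsort(scores)
        util = np.zeros(len(samples))
        util[order[: len(samples) // 4]] = 1.0     # best quarter gets utility 1
        grad = np.zeros_like(theta[k])
        for a, u, qa in zip(samples, util, q):
            w = prob(theta[k], a) / qa             # importance weight p_k(a)/q(a)
            onehot = np.zeros((N_LAYERS, N_OPS))
            onehot[np.arange(N_LAYERS), a] = 1.0
            grad += u * w * (onehot - theta[k])    # gradient of log p_k(a)
        theta[k] += eta * grad / len(samples)
        theta[k] = np.clip(theta[k], 1e-6, None)   # keep probabilities valid
        theta[k] /= theta[k].sum(axis=1, keepdims=True)

for _ in range(20):
    search_step()
print([t.argmax(axis=1) for t in theta])           # most likely op per layer
```

With larger complexity coefficients, the corresponding distributions concentrate on cheaper architectures, while all K updates share the same pooled samples; that shared pool is what lets a single search produce multiple architectures.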


Notes

  1. This utility definition does not account for the possibility of sampling architectures with the same loss value. Although ties can occur in our case, we use this definition for simplicity. A rigorous definition can be found in [14, 19].
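For concreteness, a common ranking-based utility in this line of work (in the spirit of [1, 14]) is the quartile scheme below; the exact form used in the paper may differ, and ties in the loss would make the ranks, and hence the utility, ambiguous:

$$
u(x_i) =
\begin{cases}
1 & \text{if } f(x_i) \text{ is among the best } \lceil \lambda / 4 \rceil \text{ of the } \lambda \text{ samples},\\
-1 & \text{if } f(x_i) \text{ is among the worst } \lceil \lambda / 4 \rceil,\\
0 & \text{otherwise.}
\end{cases}
$$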

References

  1. Akimoto, Y., Shirakawa, S., Yoshinari, N., Uchida, K., Saito, S., Nishida, K.: Adaptive stochastic natural gradient method for one-shot neural architecture search. In: International Conference on Machine Learning (ICML) (2019)

  2. Amari, S.: Natural gradient works efficiently in learning. Neural Comput. 10(2), 251–276 (1998)

  3. Cai, H., Zhu, L., Han, S.: ProxylessNAS: direct neural architecture search on target task and hardware. In: International Conference on Learning Representations (ICLR) (2019)

  4. Chu, X., Zhang, B., Li, Q., Xu, R., Li, X.: SCARLET-NAS: bridging the gap between scalability and fairness in neural architecture search. In: ICCV Workshops (2021). https://arxiv.org/abs/1908.06022

  5. Chu, X., Zhou, T., Zhang, B., Li, J.: Fair DARTS: eliminating unfair advantages in differentiable architecture search. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 465–480. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58555-6_28

  6. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

  7. Elsken, T., Metzen, J.H., Hutter, F.: Neural architecture search: a survey. J. Mach. Learn. Res. 20(55), 1–21 (2019)

  8. Guo, Z., et al.: Single path one-shot neural architecture search with uniform sampling. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12361, pp. 544–560. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58517-4_32

  9. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)

  10. Huang, S., Chu, W.: Searching by generating: flexible and efficient one-shot NAS with architecture generator. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)

  11. Krizhevsky, A.: Learning multiple layers of features from tiny images. Technical report, Department of Computer Science, University of Toronto (2009)

  12. Liu, H., Simonyan, K., Yang, Y.: DARTS: differentiable architecture search. In: International Conference on Learning Representations (ICLR) (2019)

  13. Loshchilov, I., Hutter, F.: SGDR: stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (ICLR) (2017)

  14. Ollivier, Y., Arnold, L., Auger, A., Hansen, N.: Information-geometric optimization algorithms: a unifying picture via invariance principles. J. Mach. Learn. Res. 18(18), 1–65 (2017)

  15. Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. In: International Conference on Machine Learning (ICML) (2018)

  16. Real, E., et al.: Large-scale evolution of image classifiers. In: International Conference on Machine Learning (ICML) (2017)

  17. Saito, S., Shirakawa, S.: Controlling model complexity in probabilistic model-based dynamic optimization of neural network structures. In: Tetko, I.V., Kůrková, V., Karpov, P., Theis, F. (eds.) ICANN 2019. LNCS, vol. 11728, pp. 393–405. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30484-3_33

  18. Shirakawa, S., Akimoto, Y., Ouchi, K., Ohara, K.: Sample reuse in the covariance matrix adaptation evolution strategy based on importance sampling. In: Genetic and Evolutionary Computation Conference (GECCO) (2015)

  19. Shirakawa, S., Akimoto, Y., Ouchi, K., Ohara, K.: Sample reuse via importance sampling in information geometric optimization. arXiv:1805.12388 (2018). https://arxiv.org/abs/1805.12388

  20. Shirakawa, S., Iwata, Y., Akimoto, Y.: Dynamic optimization of neural network structures using probabilistic modeling. In: 32nd AAAI Conference on Artificial Intelligence (AAAI) (2018)

  21. Suganuma, M., Shirakawa, S., Nagao, T.: A genetic programming approach to designing convolutional neural network architectures. In: Genetic and Evolutionary Computation Conference (GECCO) (2017)

  22. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

  23. Tan, M., et al.: MnasNet: platform-aware neural architecture search for mobile. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)

  24. Wu, B., et al.: FBNet: hardware-aware efficient ConvNet design via differentiable neural architecture search. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)

  25. You, S., Huang, T., Yang, M., Wang, F., Qian, C., Zhang, C.: GreedyNAS: towards fast one-shot NAS with greedy supernet. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)

  26. Zhou, P., Xiong, C., Socher, R., Hoi, S.C.H.: Theory-inspired path-regularized differential network architecture search. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 33, pp. 8296–8307 (2020)

  27. Zoph, B., Le, Q.V.: Neural architecture search with reinforcement learning. In: International Conference on Learning Representations (ICLR) (2017)

  28. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)


Acknowledgments

This work was partially supported by NEDO (JPNP18002), JSPS KAKENHI Grant Number JP20H04240, and JST PRESTO Grant Number JPMJPR2133.

Author information

Correspondence to Shinichi Shirakawa.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Noda, Y., Saito, S., Shirakawa, S. (2022). Efficient Search of Multiple Neural Architectures with Different Complexities via Importance Sampling. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13532. Springer, Cham. https://doi.org/10.1007/978-3-031-15937-4_51


  • DOI: https://doi.org/10.1007/978-3-031-15937-4_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15936-7

  • Online ISBN: 978-3-031-15937-4

  • eBook Packages: Computer Science, Computer Science (R0)
