
Efficient Search of Multiple Neural Architectures with Different Complexities via Importance Sampling

  • Conference paper
Artificial Neural Networks and Machine Learning – ICANN 2022 (ICANN 2022)

Abstract

Neural architecture search (NAS) aims to automate architecture design and improve the performance of deep neural networks. Platform-aware NAS methods consider both performance and complexity and can find well-performing architectures under limited computational resources. Whereas ordinary NAS methods incur tremendous computational costs owing to repeated model training, one-shot NAS, which trains the weights of a supernetwork containing all candidate architectures only once during the search, has been reported to achieve a much lower search cost. This study focuses on architecture complexity-aware one-shot NAS, which optimizes an objective function composed of a weighted sum of two metrics, such as predictive performance and the number of parameters. In existing methods, the architecture search must be run multiple times with different coefficients of the weighted sum to obtain multiple architectures with different complexities. This study aims to reduce the search cost of finding multiple such architectures. The proposed method uses multiple distributions to generate architectures with different complexities and updates each distribution using samples drawn from all of the distributions, reweighted by importance sampling. This allows multiple architectures with different complexities to be obtained in a single architecture search, thereby reducing the search cost. The proposed method is applied to the architecture search of convolutional neural networks on the CIFAR-10 and ImageNet datasets. Compared with baseline methods, it finds multiple architectures with varying complexities while requiring less computational effort.
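As a rough sketch of the idea in the abstract, the toy code below maintains K independent categorical distributions over per-layer operations (one per complexity coefficient), pools samples from all of them, and updates each distribution with importance weights p_k(a)/q(a), where q is the equal-weight mixture of the K distributions. Every concrete detail here (the categorical parameterization, the synthetic objective, the quartile utility, all names) is an assumption for illustration, not the authors' implementation:

```python
import numpy as np

# Toy sketch: K categorical architecture distributions, one per complexity
# coefficient, updated with samples pooled from all K distributions and
# reweighted by importance sampling.
N_LAYERS, N_OPS, K = 4, 3, 3      # layers, candidate ops per layer, distributions
LAMBDAS = [0.0, 0.1, 0.2]         # complexity coefficients, one per distribution
rng = np.random.default_rng(0)

# theta[k][l, o]: probability that distribution k picks operation o in layer l
theta = [np.full((N_LAYERS, N_OPS), 1.0 / N_OPS) for _ in range(K)]

def prob(th, arch):
    """Probability of an architecture under independent per-layer categoricals."""
    return float(np.prod(th[np.arange(N_LAYERS), arch]))

def objective(arch, lam):
    """Synthetic stand-in for validation loss + lam * parameter count."""
    loss = float(arch.sum()) / (N_LAYERS * (N_OPS - 1))   # fake "loss" in [0, 1]
    params = float((arch + 1).sum())                      # fake "model size"
    return loss + lam * params

def search_step(eta=0.1, n_per_dist=4):
    # 1) Sample architectures from every distribution and pool them.
    samples = [
        np.array([rng.choice(N_OPS, p=theta[k][l]) for l in range(N_LAYERS)])
        for k in range(K)
        for _ in range(n_per_dist)
    ]
    # 2) Pooled samples follow the mixture q(a) = (1/K) * sum_k p_k(a).
    q = [np.mean([prob(theta[k], a) for k in range(K)]) for a in samples]
    # 3) Update every distribution using ALL samples via importance weights.
    for k in range(K):
        scores = [objective(a, LAMBDAS[k]) for a in samples]
        order = np.argsort(scores)
        util = np.zeros(len(samples))
        util[order[: len(samples) // 4]] = 1.0     # best quarter gets utility 1
        grad = np.zeros_like(theta[k])
        for a, u, qa in zip(samples, util, q):
            w = prob(theta[k], a) / qa             # importance weight p_k(a)/q(a)
            onehot = np.zeros((N_LAYERS, N_OPS))
            onehot[np.arange(N_LAYERS), a] = 1.0
            grad += u * w * (onehot - theta[k])    # gradient of log p_k(a)
        theta[k] += eta * grad / len(samples)
        theta[k] = np.clip(theta[k], 1e-6, None)   # keep probabilities valid
        theta[k] /= theta[k].sum(axis=1, keepdims=True)

for _ in range(20):
    search_step()
print([t.argmax(axis=1) for t in theta])           # most likely op per layer
```

With larger complexity coefficients, the corresponding distributions concentrate on cheaper architectures, while all K updates share the same pooled samples; that shared pool is what lets a single search produce multiple architectures.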


Notes

  1. This utility definition does not account for the possibility of sampling architectures with the same loss value. Although ties can occur in our case, we use this definition for simplicity. A rigorous definition can be found in [14, 19].
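For concreteness, a common ranking-based utility in this line of work (in the spirit of [1, 14]) is the quartile scheme below; the exact form used in the paper may differ, and ties in the loss would make the ranks, and hence the utility, ambiguous:

$$
u(x_i) =
\begin{cases}
1 & \text{if } f(x_i) \text{ is among the best } \lceil \lambda / 4 \rceil \text{ of the } \lambda \text{ samples},\\
-1 & \text{if } f(x_i) \text{ is among the worst } \lceil \lambda / 4 \rceil,\\
0 & \text{otherwise.}
\end{cases}
$$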

References

  1. Akimoto, Y., Shirakawa, S., Yoshinari, N., Uchida, K., Saito, S., Nishida, K.: Adaptive stochastic natural gradient method for one-shot neural architecture search. In: International Conference on Machine Learning (ICML) (2019)

  2. Amari, S.: Natural gradient works efficiently in learning. Neural Comput. 10(2), 251–276 (1998)

  3. Cai, H., Zhu, L., Han, S.: ProxylessNAS: direct neural architecture search on target task and hardware. In: International Conference on Learning Representations (ICLR) (2019)

  4. Chu, X., Zhang, B., Li, Q., Xu, R., Li, X.: SCARLET-NAS: bridging the gap between scalability and fairness in neural architecture search. In: ICCV Workshops (2021). https://arxiv.org/abs/1908.06022

  5. Chu, X., Zhou, T., Zhang, B., Li, J.: Fair DARTS: eliminating unfair advantages in differentiable architecture search. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12360, pp. 465–480. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58555-6_28

  6. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2009)

  7. Elsken, T., Metzen, J.H., Hutter, F.: Neural architecture search: a survey. J. Mach. Learn. Res. 20(55), 1–21 (2019)

  8. Guo, Z., et al.: Single path one-shot neural architecture search with uniform sampling. In: Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M. (eds.) ECCV 2020. LNCS, vol. 12361, pp. 544–560. Springer, Cham (2020). https://doi.org/10.1007/978-3-030-58517-4_32

  9. Hu, J., Shen, L., Sun, G.: Squeeze-and-excitation networks. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)

  10. Huang, S., Chu, W.: Searching by generating: flexible and efficient one-shot NAS with architecture generator. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2021)

  11. Krizhevsky, A.: Learning multiple layers of features from tiny images. Technical report, Department of Computer Science, University of Toronto (2009)

  12. Liu, H., Simonyan, K., Yang, Y.: DARTS: differentiable architecture search. In: International Conference on Learning Representations (ICLR) (2019)

  13. Loshchilov, I., Hutter, F.: SGDR: stochastic gradient descent with warm restarts. In: International Conference on Learning Representations (ICLR) (2017)

  14. Ollivier, Y., Arnold, L., Auger, A., Hansen, N.: Information-geometric optimization algorithms: a unifying picture via invariance principles. J. Mach. Learn. Res. 18(18), 1–65 (2017)

  15. Pham, H., Guan, M.Y., Zoph, B., Le, Q.V., Dean, J.: Efficient neural architecture search via parameter sharing. In: International Conference on Machine Learning (ICML) (2018)

  16. Real, E., et al.: Large-scale evolution of image classifiers. In: International Conference on Machine Learning (ICML) (2017)

  17. Saito, S., Shirakawa, S.: Controlling model complexity in probabilistic model-based dynamic optimization of neural network structures. In: Tetko, I.V., Kůrková, V., Karpov, P., Theis, F. (eds.) ICANN 2019. LNCS, vol. 11728, pp. 393–405. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-30484-3_33

  18. Shirakawa, S., Akimoto, Y., Ouchi, K., Ohara, K.: Sample reuse in the covariance matrix adaptation evolution strategy based on importance sampling. In: Genetic and Evolutionary Computation Conference (GECCO) (2015)

  19. Shirakawa, S., Akimoto, Y., Ouchi, K., Ohara, K.: Sample reuse via importance sampling in information geometric optimization. arXiv:1805.12388 (2018). https://arxiv.org/abs/1805.12388

  20. Shirakawa, S., Iwata, Y., Akimoto, Y.: Dynamic optimization of neural network structures using probabilistic modeling. In: 32nd AAAI Conference on Artificial Intelligence (AAAI) (2018)

  21. Suganuma, M., Shirakawa, S., Nagao, T.: A genetic programming approach to designing convolutional neural network architectures. In: Genetic and Evolutionary Computation Conference (GECCO) (2017)

  22. Szegedy, C., Vanhoucke, V., Ioffe, S., Shlens, J., Wojna, Z.: Rethinking the inception architecture for computer vision. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2016)

  23. Tan, M., et al.: MnasNet: platform-aware neural architecture search for mobile. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)

  24. Wu, B., et al.: FBNet: hardware-aware efficient ConvNet design via differentiable neural architecture search. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)

  25. You, S., Huang, T., Yang, M., Wang, F., Qian, C., Zhang, C.: GreedyNAS: towards fast one-shot NAS with greedy supernet. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2020)

  26. Zhou, P., Xiong, C., Socher, R., Hoi, S.C.H.: Theory-inspired path-regularized differential network architecture search. In: Advances in Neural Information Processing Systems (NeurIPS), vol. 33, pp. 8296–8307 (2020)

  27. Zoph, B., Le, Q.V.: Neural architecture search with reinforcement learning. In: International Conference on Learning Representations (ICLR) (2017)

  28. Zoph, B., Vasudevan, V., Shlens, J., Le, Q.V.: Learning transferable architectures for scalable image recognition. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2018)


Acknowledgments

This work was partially supported by NEDO (JPNP18002), JSPS KAKENHI Grant Number JP20H04240, and JST PRESTO Grant Number JPMJPR2133.

Author information

Correspondence to Shinichi Shirakawa.


Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Noda, Y., Saito, S., Shirakawa, S. (2022). Efficient Search of Multiple Neural Architectures with Different Complexities via Importance Sampling. In: Pimenidis, E., Angelov, P., Jayne, C., Papaleonidas, A., Aydin, M. (eds) Artificial Neural Networks and Machine Learning – ICANN 2022. ICANN 2022. Lecture Notes in Computer Science, vol 13532. Springer, Cham. https://doi.org/10.1007/978-3-031-15937-4_51


  • DOI: https://doi.org/10.1007/978-3-031-15937-4_51

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-15936-7

  • Online ISBN: 978-3-031-15937-4

  • eBook Packages: Computer Science, Computer Science (R0)
