Abstract
In design engineering problems, the use of surrogate models (also called metamodels) instead of expensive simulations has become very popular. Surrogate models include individual models (regression, kriging, neural networks, etc.) and combinations of individual models, often called aggregations or ensembles. Since different surrogate types with various tunings are available, users often struggle to choose the most suitable one for a given problem. Thus, there is great interest in automatic selection algorithms. In this paper, we introduce a universal criterion that can be applied to any type of surrogate model. It is composed of three complementary components measuring the quality of general surrogate models: internal accuracy (on design points), predictive performance (cross-validation) and a roughness penalty. Based on this criterion, we propose two automatic selection algorithms. The first selection scheme finds the optimal ensemble of a set of given surrogate models. The second selection scheme further explores the space of surrogate models by using an evolutionary algorithm where each individual is a surrogate model. Finally, the performance of the algorithms is illustrated on 15 classical test functions and compared to that of different individual surrogate models. The results show the efficiency of our approach. In particular, we observe that the three components of the proposed criterion act together to improve accuracy and limit over-fitting.
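As an illustration of how three such components can be combined into a single score, the following is a minimal sketch and not the criterion defined in the paper. It assumes one-dimensional polynomial surrogates, root-mean-square error on the design points for internal accuracy, leave-one-out cross-validation for predictive performance, and a finite-difference estimate of the second derivative for roughness. The function names and the weights `w_int`, `w_cv` and `w_rough` are hypothetical.

```python
import numpy as np


def fit_poly(x, y, degree):
    """Least-squares polynomial fit; returns a callable surrogate."""
    coeffs = np.polyfit(x, y, degree)
    return lambda t: np.polyval(coeffs, t)


def internal_rmse(x, y, degree):
    """Error of the surrogate on its own design points."""
    model = fit_poly(x, y, degree)
    return float(np.sqrt(np.mean((model(x) - y) ** 2)))


def loo_cv_rmse(x, y, degree):
    """Leave-one-out cross-validation error (refit without each point)."""
    errors = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i
        model = fit_poly(x[keep], y[keep], degree)
        errors.append(model(x[i]) - y[i])
    return float(np.sqrt(np.mean(np.square(errors))))


def roughness(x, y, degree, n_grid=201):
    """RMS of a finite-difference second derivative on a uniform grid."""
    model = fit_poly(x, y, degree)
    grid = np.linspace(x.min(), x.max(), n_grid)
    h = grid[1] - grid[0]
    second = np.diff(model(grid), n=2) / h**2
    return float(np.sqrt(np.mean(second**2)))


def composite_criterion(x, y, degree, w_int=1.0, w_cv=1.0, w_rough=0.01):
    """Weighted sum of the three components (weights are assumptions)."""
    return (
        w_int * internal_rmse(x, y, degree)
        + w_cv * loo_cv_rmse(x, y, degree)
        + w_rough * roughness(x, y, degree)
    )


# Usage: score a few candidate surrogates on the same design points.
x_design = np.linspace(0.0, 1.0, 12)
y_design = np.sin(2.0 * np.pi * x_design)
scores = {d: composite_criterion(x_design, y_design, d) for d in (1, 3, 5)}
best_degree = min(scores, key=scores.get)
```

In this sketch the candidate with the smallest score is selected; the relative weights decide how strongly a smooth but less accurate surrogate is preferred over a wiggly interpolant.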
References
Acar E, Rais-Rohani M (2009) Ensemble of metamodels with optimized weight factors. Struct Multidiscip Optim 37(3):279–294
Akaike H (1974) A new look at the statistical model identification. IEEE Trans Autom Control 19(6):716–723
Arlot S, Celisse A (2010) A survey of cross-validation procedures for model selection. Statist Surv 4:40–79
Chen PW, Wang JY, Lee HM (2004) Model selection of SVMs using GA approach. In: 2004 IEEE international joint conference on neural networks, 2004. Proceedings, vol 3. IEEE, pp 2035–2040
Duchon J (1977) Splines minimizing rotation-invariant semi-norms in Sobolev spaces. In: Constructive theory of functions of several variables. Springer, pp 85–100
Efron B, Tibshirani RJ (1993) An introduction to the bootstrap. Chapman and Hall, London
Forrester AI, Keane AJ (2009) Recent advances in surrogate-based optimization. Prog Aerosp Sci 45(1):50–79
Friedman JH (1991) Multivariate adaptive regression splines. Ann Stat 19(1):1–67
Gramacy RB, Lee HK (2008) Gaussian processes and limiting linear models. Comput Statist Data Anal 53(1):123–136
Gramacy RB, Lee HK (2009) Adaptive design and analysis of supercomputer experiments. Technometrics 51(2)
Gramacy RB, Lee HK (2012) Cases for the nugget in modeling computer experiments. Statist Comput 22(3):713–722
Goel T, Haftka RT, Shyy W, Queipo NV (2007) Ensemble of surrogates. Struct Multidiscip Optim 33(3):199–216
Gorissen D, Dhaene T, Turck FD (2009) Evolutionary model type selection for global surrogate modeling. J Mach Learn Res 10:2039–2078
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning, vol 2. Springer, Berlin
James G, Witten D, Hastie T, Tibshirani R (2013) An introduction to statistical learning, vol 6. Springer, Berlin
Kohavi R (1995) A study of cross-validation and bootstrap for accuracy estimation and model selection. In: Proceedings of the 14th international joint conference on artificial intelligence. IJCAI’95, vol 2. Morgan Kaufmann Publishers Inc, pp 1137–1143
Lancaster P, Salkauskas K (1981) Surfaces generated by moving least squares methods. Math Comput 37(155):141–158
Lessmann S, Stahlbock R, Crone SF (2006) Genetic algorithms for support vector machine model selection. In: The 2006 IEEE international joint conference on neural network proceedings. IEEE, pp 3063–3069
Matheron G (1963) Principles of geostatistics. Econ Geol 58(8):1246–1266
McKay MD, Beckman RJ, Conover WJ (1979) Comparison of three methods for selecting values of input variables in the analysis of output from a computer code. Technometrics 21(2):239–245
Müller J, Piché R (2011) Mixture surrogate models based on Dempster-Shafer theory for global optimization problems. J Glob Optim 51(1):79–104
Nguyen H, Couckuyt I, Knockaert L, Dhaene T, Gorissen D, Saeys Y (2011) An alternative approach to avoid overfitting for surrogate models. In: Proceedings of the 2011 winter simulation conference (WSC), pp 2760–2771
Queipo NV, Haftka RT, Shyy W, Goel T, Vaidyanathan R, Tucker PK (2005) Surrogate-based analysis and optimization. Prog Aerosp Sci 41(1):1–28
Schwarz G et al. (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464
Shi L, Yang R, Zhu P (2012) A method for selecting surrogate models in crashworthiness optimization. Struct Multidiscip Optim 46(2):159–170
Smola A, Schölkopf B (2004) A tutorial on support vector regression. Stat Comput 14(3):199–222
Stone M (1974) Cross-validatory choice and assessment of statistical predictions. J R Stat Soc Ser B Methodol 36(2):111–147
Tomioka S, Nisiyama S, Enoto T (2007) Nonlinear least square regression by adaptive domain method with multiple genetic algorithms. IEEE Trans Evol Comput 11(1):1–16
Viana FA, Haftka RT, Steffen V (2009) Multiple surrogates: how cross-validation errors can help us to obtain the best predictor. Struct Multidiscip Optim 39(4):439–457
Viana FA, Venter G, Balabanov V (2010) An algorithm for fast optimal Latin hypercube design of experiments. Int J Numer Methods Eng 82(2):135–156
Watson DF (1981) Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes. Comput J 24(2):167–172
Zerpa LE, Queipo NV, Pintos S, Salager JL (2005) An optimization methodology of alkaline–surfactant–polymer flooding processes using field scale numerical simulation and multiple surrogates. J Pet Sci Eng 47(3):197–208
Zhang C, Shao H, Li Y (2000) Particle swarm optimisation for evolving artificial neural network. In: 2000 IEEE international conference on systems, man, and cybernetics, vol 4. IEEE, pp 2487–2490
Zhou X, Jiang T (2016) Metamodel selection based on stepwise regression. Struct Multidiscip Optim 54(3):641–657
Zhou XJ, Ma YZ, Li XF (2011) Ensemble of surrogates with recursive arithmetic average. Struct Multidiscip Optim 44(5):651–671
Acknowledgements
We gratefully thank Olivier Roustant and Fabrice Gamboa for their help in writing this paper and for their valuable remarks. We also warmly thank two anonymous reviewers for their valuable comments.
Additional information
Malek BEN SALEM is funded by a CIFRE grant from the ANSYS company, subsidized by the French National Association for Research and Technology (ANRT, CIFRE grant number 2014/1349).
Appendices
Appendix A: Comparison between the proposed PPS parameters and the optimal according to the sum of RMSE
Appendix B: Test functions
The equations and the input parameter space of the functions of Table 1 are defined below:
1/ Wing weight function:
Parameters: S_w ∈ [150, 200], W_fw ∈ [220, 300], A ∈ [6, 10], γ ∈ [−10, 10], q ∈ [16, 45], λ ∈ [0.5, 1], t_c ∈ [0.08, 0.18], N_z ∈ [2.5, 6], W_dg ∈ [1700, 2500], W_p ∈ [0.025, 0.08]
$$\begin{array}{@{}rcl@{}} \text{For } \mathbf{x} &=& (S_{w}, W_{fw}, A, \gamma, q, \lambda, t_{c}, N_{z}, W_{dg}, W_{p})\\ f_{1}(\mathbf{x}) &=& 0.036 S_{w}^{0.758} W_{fw}^{0.758} \left( \frac{A}{\cos^{2}(\gamma)} \right)^{0.6} q^{0.006} \lambda^{0.04}\\ && \left( \frac{100 t_{c}}{\cos(\gamma)}\right)^{-0.3} (N_{z} W_{dg})^{0.49}+ S_{w} W_{p} \end{array} \tag{17}$$
2/ Borehole function:
Parameters: r_w ∈ [0.05, 0.15], r ∈ [100, 50000], T_u ∈ [63070, 115600], H_u ∈ [990, 1110], T_l ∈ [63.1, 116], H_l ∈ [700, 820], L ∈ [1120, 1680], K_w ∈ [9855, 12045]
$$\begin{array}{@{}rcl@{}} \text{For } \mathbf{x} &=& (r_{w},r,T_{u},H_{u},T_{l},H_{l},L,K_{w})\\ f_{2}(\mathbf{x}) &=& \frac{2\pi T_{u}(H_{u} - H_{l})}{\ln\left( \frac{r}{r_{w}}\right) \left( 1 + \frac{2L T_{u}}{\ln\left( \frac{r}{r_{w}}\right) r_{w}^{2} K_{w}} + \frac{T_{u}}{T_{l}}\right)} \end{array} \tag{18}$$
3/ Dette and Pepelyshev 8-dimensional function:
Parameters: x_i ∈ [0, 1] for all i = 1, …, 8
$$\begin{array}{@{}rcl@{}} f_{3}(\mathbf{x}) &=& 4(x_{1} - 2 + 8x_{2} - 8x_{2}^{2})^{2} + (3-4x_{2})^{2}\\ && + 16 \sqrt{x_{3} + 1} (2x_{3} -1)^{2} + \sum\limits_{i = 4}^{8} i \ln\left( 1 + \sum\limits_{j = 3}^{i} x_{j}\right) \end{array} \tag{19}$$
4/ Piston simulation function:
Parameters: M ∈ [30, 60], S ∈ [0.005, 0.020], V_0 ∈ [0.002, 0.010], k ∈ [1, 5] × 10^3, P_0 ∈ [9, 11] × 10^4, T_a ∈ [290, 296], T_0 ∈ [340, 360]
$$\begin{array}{@{}rcl@{}} f_{4}(\mathbf{x}) &=& 2 \pi \sqrt{\frac{M}{k+S^{2} \frac{P_{0} V_{0}}{T_{0}} \frac{T_{a}}{V^{2}}}}\\ \text{where } V &=& \frac{S}{2k} \left( \sqrt{A^{2} + 4 k \frac{P_{0} V_{0}}{T_{0}} T_{a}} - A \right)\\ \text{and } A &=& P_{0} S + 19.62 M - \frac{k V_{0}}{S} \end{array} \tag{20}$$
5/ OTL circuit function:
Parameters: R_b1 ∈ [50, 150], R_b2 ∈ [25, 70], R_f ∈ [0.5, 3], R_c1 ∈ [1.2, 2.5], R_c2 ∈ [0.25, 1.2], β ∈ [50, 300]
$$\begin{array}{@{}rcl@{}} f_{5}(\mathbf{R},\beta) &=& \frac{\left( \frac{12 R_{b2}}{R_{b1} + R_{b2}} + 0.74\right) \beta (R_{c2} + 9)}{\beta(R_{c2} + 9) + R_{f}}\\ && + \frac{11.35 R_{f}}{\beta(R_{c2} + 9) + R_{f}}\\ && + \frac{0.75 R_{f} \beta (R_{c2} + 9)}{ (\beta(R_{c2} + 9) + R_{f}) R_{c1}} \end{array} \tag{21}$$
6/ Gramacy and Lee (2009) function:
Parameters: x_i ∈ [0, 1] for all i = 1, …, 6
$$ f_{6}(\mathbf{x}) = \exp[\sin((0.9(x_{1} + 0.48))^{10})] + x_{2}x_{3} +x_{4} \tag{22}$$
7/ Friedman function:
Parameters: x_i ∈ [0, 1] for all i = 1, …, 5
$$ f_{7}(\mathbf{x}) = 10 \sin(\pi x_{1} x_{2}) + 20(x_{3} - 0.5)^{2} + 10 x_{4} + 5x_{5} \tag{23}$$
8/ Dette & Pepelyshev exponential function:
Parameters: x_i ∈ [0, 1] for all i = 1, …, 3
$$ f_{8}(\mathbf{x}) = 100\left(e^{-2/x_{1}^{1.75}} + e^{-2/x_{2}^{1.5}} + e^{-2/x_{3}^{1.25}}\right) \tag{24}$$
9/ Dette & Pepelyshev curved function:
Parameters: x_i ∈ [0, 1] for all i = 1, …, 3
$$\begin{array}{@{}rcl@{}} f_{9}(\mathbf{x}) &=& 4(x_{1} -2 + 8x_{2} - 8x_{2}^{2})^{2} + (3-4x_{2})^{2}\\ && + 16 \sqrt{x_{3} + 1} (2 x_{3} -1)^{2} \end{array} \tag{25}$$
10/ Lim non-polynomial function:
Parameters: x_1, x_2 ∈ [0, 1]
$$ f_{10}(\mathbf{x}) = \frac{1}{6}[(30 + 5x_{1} \sin(5x_{1})) (4 + \exp(-5x_{2}))-100] \tag{26}$$
11/ Currin exponential function:
Parameters: x_1, x_2 ∈ [0, 1]
$$\begin{array}{@{}rcl@{}} f_{11}(\mathbf{x}) &=& \left[1 - \exp\left( - \frac{1}{2x_{2}}\right)\right]\\ && \times \frac{2300x_{1}^{3} + 1900x_{1}^{2} + 2092x_{1}+ 60}{100x_{1}^{3} + 500x_{1}^{2} + 4x_{1} + 20} \end{array} \tag{27}$$
12/ Franke function:
Parameters: x_1, x_2 ∈ [0, 1]
$$\begin{array}{@{}rcl@{}} f_{12}(\mathbf{x}) &=& 0.75 \exp\left( - \frac{(9x_{1} - 2)^{2} +(9x_{2} - 2)^{2}}{4}\right)\\ && + 0.75 \exp\left( - \frac{(9x_{1} + 2)^{2}}{49} - \frac{9x_{2} + 1}{10}\right)\\ && + 0.5 \exp\left( - \frac{(9x_{1} - 7)^{2}}{4} - \frac{(9x_{2} - 3)^{2}}{4}\right)\\ && + 0.2 \exp(- (9x_{1} - 4)^{2} - (9x_{2} - 7)^{2} ) \end{array} \tag{28}$$
13/ Gramacy and Lee (2008) function:
Parameters: x_1, x_2 ∈ [−2, 6]
$$ f_{13}(\mathbf{x}) = x_{1} \exp(- x_{1}^{2} - x_{2}^{2}) \tag{29}$$
14/ Sasena function:
Parameters: x_1, x_2 ∈ [0, 5]
$$\begin{array}{@{}rcl@{}} f_{14}(\mathbf{x}) &=& 2 + 0.01(x_{2}-x_{1}^{2})^{2} + (1-x_{1})^{2}\\ && + 2(2-x_{2})^{2} + 7 \sin(0.5x_{1})\sin(0.7x_{1}x_{2}) \end{array} \tag{30}$$
15/ Gramacy and Lee (2012) function:
Parameters: x ∈ [0.5, 2.5]
$$ f_{15}(x) = \frac{\sin(10 \pi x)}{2 x} + (x-1)^{4} \tag{31}$$
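The definitions above translate directly into code. The sketch below implements five of them (the borehole, Friedman, Currin, Sasena and Gramacy and Lee (2012) functions) exactly as written in this appendix, together with their input domains. The names `BOUNDS` and `sample_design` are hypothetical helpers, and the plain uniform sampling is only a stand-in for whatever design of experiments is actually used.

```python
import numpy as np


def borehole(x):
    """f_2; columns of x are (r_w, r, T_u, H_u, T_l, H_l, L, K_w)."""
    rw, r, Tu, Hu, Tl, Hl, L, Kw = np.asarray(x, dtype=float).T
    log_ratio = np.log(r / rw)
    return 2 * np.pi * Tu * (Hu - Hl) / (
        log_ratio * (1 + 2 * L * Tu / (log_ratio * rw**2 * Kw) + Tu / Tl)
    )


def friedman(x):
    """f_7; five inputs in [0, 1]."""
    x1, x2, x3, x4, x5 = np.asarray(x, dtype=float).T
    return 10 * np.sin(np.pi * x1 * x2) + 20 * (x3 - 0.5) ** 2 + 10 * x4 + 5 * x5


def currin(x):
    """f_11; two inputs in [0, 1] (x_2 = 0 itself divides by zero)."""
    x1, x2 = np.asarray(x, dtype=float).T
    ratio = (2300 * x1**3 + 1900 * x1**2 + 2092 * x1 + 60) / (
        100 * x1**3 + 500 * x1**2 + 4 * x1 + 20
    )
    return (1 - np.exp(-1 / (2 * x2))) * ratio


def sasena(x):
    """f_14; two inputs in [0, 5]."""
    x1, x2 = np.asarray(x, dtype=float).T
    return (
        2 + 0.01 * (x2 - x1**2) ** 2 + (1 - x1) ** 2
        + 2 * (2 - x2) ** 2 + 7 * np.sin(0.5 * x1) * np.sin(0.7 * x1 * x2)
    )


def gramacy_lee_2012(x):
    """f_15; one input in [0.5, 2.5]."""
    x = np.asarray(x, dtype=float)
    return np.sin(10 * np.pi * x) / (2 * x) + (x - 1) ** 4


# Lower and upper bounds of each input, in the order used by the functions.
BOUNDS = {
    "borehole": ([0.05, 100, 63070, 990, 63.1, 700, 1120, 9855],
                 [0.15, 50000, 115600, 1110, 116, 820, 1680, 12045]),
    "friedman": ([0] * 5, [1] * 5),
    "currin": ([0, 0], [1, 1]),
    "sasena": ([0, 0], [5, 5]),
}


def sample_design(name, n_points, seed=0):
    """Uniform random design in the domain of `name` (an assumption, not LHS)."""
    low, high = (np.asarray(b, dtype=float) for b in BOUNDS[name])
    rng = np.random.default_rng(seed)
    return rng.uniform(low, high, size=(n_points, low.size))


# Usage: evaluate the borehole function on a small random design.
X = sample_design("borehole", 20)
y = borehole(X)
```

Each function accepts either a single point or an array with one point per row, so the same code can be used to build training sets and to evaluate validation points.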
About this article
Cite this article
Ben Salem, M., Tomaso, L. Automatic selection for general surrogate models. Struct Multidisc Optim 58, 719–734 (2018). https://doi.org/10.1007/s00158-018-1925-3
DOI: https://doi.org/10.1007/s00158-018-1925-3