Abstract
Abstract: Simulation models must be validated against experimental data before they can be used with confidence to predict the outputs of engineered systems. In doing so, a pointwise comparison between the output predicted by the simulation model and the experimental data is not appropriate for model verification and validation (V&V), because real-world phenomena are not deterministic: irreducible uncertainty always exists. Thus, the output prediction of a simulation model needs to be represented by a probability density function (PDF), and statistical model validation methods are necessary to compare the model prediction with physical test data. Validating a simulation model ordinarily requires extraordinarily detailed test data, which are expensive to generate, and practicing engineers can afford only a very limited number of tests. This paper proposes an effective method for validating a simulation model using a target output distribution that closely approximates the true output distribution. The proposed target output distribution accounts for a biased simulation model with stochastic outputs, that is, a simulation output distribution, using limited numbers of input and output test data. Since limited test data may contain outliers or be sparse, a data quality checking process is proposed to determine whether a given set of output test data needs to be balanced. If necessary, stratified sampling using cluster analysis is employed to obtain balanced test data. Next, Bayesian analysis is used to obtain many candidate target output distributions, from which the one at the posterior median is selected. Then, the distribution of the model bias is identified using Monte Carlo convolution.
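The Bayesian bandwidth-selection step described above can be illustrated with a minimal sketch. This is not the paper's implementation: it uses synthetic test data, a fixed-bandwidth KDE instead of the paper's adaptive KDE, a lognormal-shaped prior for the global bandwidth h0 (an assumption here), and a grid posterior in place of MCMC. A leave-one-out likelihood scores each candidate h0, and the bandwidth at the posterior median is selected to build the output PDF.

```python
import numpy as np

def gaussian_kernel(u):
    """Standard Gaussian kernel K(u)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)

def kde(y_grid, data, h):
    """Fixed-bandwidth kernel density estimate evaluated on y_grid."""
    u = (y_grid[:, None] - data[None, :]) / h
    return gaussian_kernel(u).mean(axis=1) / h

def loo_log_likelihood(h0, data):
    """Leave-one-out log-likelihood of the data under a fixed-h0 KDE;
    serves as the likelihood term of the posterior over h0."""
    ll = 0.0
    for i in range(len(data)):
        rest = np.delete(data, i)
        ll += np.log(kde(np.array([data[i]]), rest, h0)[0] + 1e-300)
    return ll

rng = np.random.default_rng(0)
y_e = rng.normal(0.0, 1.0, 30)   # synthetic stand-in for limited output test data

# grid posterior P(h0 | y_e) ~ likelihood * prior (lognormal-shaped prior assumed)
h_grid = np.linspace(0.05, 1.5, 60)
log_prior = -0.5 * ((np.log(h_grid) - np.log(0.5)) / 0.5) ** 2
log_post = np.array([loo_log_likelihood(h, y_e) for h in h_grid]) + log_prior
post = np.exp(log_post - log_post.max())
post /= post.sum()

# select the bandwidth at the posterior median of P(h0 | y_e)
h0_med = h_grid[np.searchsorted(np.cumsum(post), 0.5)]

# candidate output PDF built with the selected bandwidth
grid = np.linspace(-4.0, 4.0, 200)
f_hat = kde(grid, y_e, h0_med)
```

Selecting the posterior median rather than the MAP estimate makes the chosen bandwidth robust to a skewed or multimodal posterior, which is common when the output test data are few.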
Three engineering examples are used to demonstrate that (1) the developed target output distribution closely approximates the true output distribution and is robust under different sets of test data; (2) the test dataset reallocated by the quality checking process and balanced sampling matches the true output distribution better; and (3) the distribution of bias is effective for understanding the model's accuracy and for comparing model confidence across models.
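The Monte Carlo convolution step can likewise be sketched in a few lines: once the biased simulation output and the model bias are each represented by samples, samples of the target output are obtained by summing independent draws from the two distributions. The two normal distributions below are hypothetical stand-ins for illustration, not the paper's examples.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 100_000                              # number of MCS samples

# hypothetical stand-ins: biased simulation output G(x) propagated by
# Monte Carlo simulation, and an assumed normal model bias B(x)
g_samples = rng.normal(10.0, 1.0, M)
b_samples = rng.normal(-0.5, 0.3, M)

# Monte Carlo convolution: the distribution of G + B is sampled by
# summing independent draws from the two distributions
target_samples = g_samples + b_samples

print(f"target mean ~ {target_samples.mean():.3f}, std ~ {target_samples.std():.3f}")
```

In this toy setup the convolved mean is the sum of the component means and the convolved variance is the sum of the component variances, which gives a quick sanity check on the sampled target distribution.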
Abbreviations
- AKDE: adaptive KDE
- \( B_i\left(\boldsymbol{x}\right) \): unknown model bias for the ith output response
- CAE: computer-aided engineering
- CDF: cumulative distribution function
- DKG: dynamic kriging
- \( \hat{f}(y) \): output PDF using AKDE
- \( {G}_i\left(\boldsymbol{x}\right),{G}_i^{true}\left(\boldsymbol{x}\right) \): biased simulation output and true output of the ith constraint
- \( h(y; h_0) \): adaptive bandwidth in AKDE
- \( h_0 \): global fixed bandwidth for modeling the output distribution
- ISFC: indicated specific fuel consumption
- \( K \): kernel
- KDE: kernel density estimation
- \( M \): number of MCS samples
- MAE: mean absolute error
- MAP: maximum a posteriori probability
- MCMC: Markov chain Monte Carlo
- MCS: Monte Carlo simulation
- MSE: mean squared error
- \( {\hat{\mu}}_{h_0} \), \( {\hat{\sigma}}_{h_0}^2 \): mean and variance of the prior distribution for \( h_0 \)
- PDF: probability density function
- \( P(h_0) \): prior distribution of the bandwidth
- \( P(h_0 \mid \boldsymbol{y}^e) \): posterior distribution of the bandwidth given output data
- RBDO: reliability-based design optimization
- RPM: revolutions per minute
- STD: standard deviation
- UQ: uncertainty quantification
- V&V: verification and validation
- \( {\boldsymbol{y}}_i^e,{\boldsymbol{y}}^e \): ith output data and output data vector
- \( {\boldsymbol{x}}_{ik}^e \): kth element of the collected input data vector \( {\boldsymbol{x}}_i^e \)
- \( X_i \): ith input random variable
Funding
Technical and financial support was provided through a U.S. Army SBIR Sequential Phase II subcontract from RAMDO Solutions, LLC.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Additional information
Responsible Editor: Byeng D Youn
Cite this article
Moon, MY., Choi, K.K. & Lamb, D. Target output distribution and distribution of bias for statistical model validation given a limited number of test data. Struct Multidisc Optim 60, 1327–1353 (2019). https://doi.org/10.1007/s00158-019-02338-z