Abstract
I introduce a new method for validating models—including stochastic models—that gets at the reliability of a model’s predictions under intervention or manipulation of its inputs and not merely at its predictive reliability under passive observation. The method is derived from philosophical work on natural kinds, and turns on comparing the dynamical symmetries of a model with those of its target, where dynamical symmetries are interventions on model variables that commute with time evolution. I demonstrate that this method succeeds in testing aspects of model validity for which few other tools exist.
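To make the definition concrete, here is a minimal sketch of what it means for an intervention to commute with time evolution. It is an illustration only, not the chapter's validation procedure: the two growth laws, the parameter values (`r`, `K`), the scaling intervention, and all function names are assumptions chosen for the example. Scaling the state is a dynamical symmetry of exponential growth, since intervening and then evolving gives the same result as evolving and then intervening, but it is not a symmetry of logistic growth.

```python
import math

# Hypothetical closed-form time evolution for two illustrative growth laws.
def evolve_exponential(x0, t, r=0.5):
    """Solution of dx/dt = r*x started at x0, evaluated at time t."""
    return x0 * math.exp(r * t)

def evolve_logistic(x0, t, r=0.5, K=100.0):
    """Solution of dx/dt = r*x*(1 - x/K) started at x0, evaluated at time t."""
    return K * x0 / (x0 + (K - x0) * math.exp(-r * t))

def scale(x, k=2.0):
    """Candidate intervention: multiply the state by a constant factor k."""
    return k * x

def commutation_gap(evolve, intervene, x0, t):
    """|intervene(evolve(x0)) - evolve(intervene(x0))|; zero for a symmetry."""
    return abs(intervene(evolve(x0, t)) - evolve(intervene(x0), t))

gap_exp = commutation_gap(evolve_exponential, scale, 5.0, 3.0)
gap_log = commutation_gap(evolve_logistic, scale, 5.0, 3.0)

print(f"exponential growth gap: {gap_exp:.3e}")
print(f"logistic growth gap:    {gap_log:.3e}")
```

Under these assumptions, the gap vanishes (up to rounding) for the exponential law and is clearly nonzero for the logistic law, so the two laws are distinguished by which interventions commute with their dynamics.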
Notes
- 1.
For an influential engineering perspective, see [1]. For a recent and comprehensive overview of both software and systems modeling aspects from the National Research Council, see [5]. For a pithy and very current overview of verification in the world of software design, see [26]. Finally, for an accessible and illuminating discussion of the state of the art from the perspective of applied mathematics, see [7].
- 2.
Note that this terminology is at odds with machine learning, where each specific set of parameter values constitutes a model. What I’m calling a model is, in the context of machine learning or statistical learning theory, a space of hypotheses or a class of models.
- 3.
Most textbooks on machine learning include descriptions of cross-validation. An especially lucid presentation can be found in Flach [8, ch. 12].
- 4.
The cross-validation estimate of a model’s generalization error is biased, but in the direction of overestimating the error (see [11, ch. 7.10]).
- 5.
Attention to structural validation is curiously discipline-dependent. Concepts such as those pertaining to testing “white-box” models in systems engineering seem to have relatively little penetration in other fields such as ecology. This is probably due in part to the quantity and precision of data available in these different fields. Structural tests tend to be data-hungry or to require manipulations of the target system that are not available to, e.g., field ecologists.
- 6.
See [3] for a widely-cited review.
- 7.
Balci [1] calls this “stress testing.”
- 8.
As indicated in [13], I am using the term “intervention” in its technical sense as it appears in the literature on causation. In this context, “…an intervention on X (with respect to Y) is a causal process that directly changes the value of X in such a way that, if a change in the value of Y should occur, it will occur only through the change in the value of X and not in some other way” [27].
- 9.
In principle, one could take a single long time series for each system and cut it in half to obtain two such curves, but for ease of exposition, I assume the time series are obtained separately.
- 10.
For an English translation of the French, see [25].
- 11.
Another equally old and venerable model is that of Gompertz [10]. This model also continues to be deployed for growth modeling.
- 12.
Note that the initial value of the population, $x_0$, is fit independently in each case. That’s because, while the other parameters are presumed to be intrinsic features of the growing population, the initial population size is variable and assumed to have different (unknown) values in each case.
- 13.
This is the line of reasoning presented in [29], where the Gompertz model is favored.
- 14.
It’s generally possible to determine and fit symmetries numerically, without an analytic, closed form solution. But since one is available in this case, I use it to simplify the analysis.
- 15.
This data was obtained from Connelly [6] and is used here with permission (and gratitude). The dataset can be found at https://zenodo.org/record/1171129. I am specifically considering the sixteenth row of the table.
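The remark in note 14 that symmetries can be determined numerically, without a closed-form solution, can be illustrated with a small sketch. Everything here is an assumption made for illustration rather than the chapter's own procedure: the fourth-order Runge-Kutta integrator, the two growth laws, the parameter values, the initial values 5 and 10, and the choice to test only a linear candidate symmetry. The idea is that two trajectories of the same autonomous system, started from different initial values, are related at equal times by a map that commutes with time evolution, and that map can be estimated by regressing one trajectory on the other.

```python
import math

def rk4_trajectory(f, x0, dt, steps):
    """Integrate dx/dt = f(x) with classical RK4; return the list of states."""
    xs, x = [x0], x0
    for _ in range(steps):
        k1 = f(x)
        k2 = f(x + 0.5 * dt * k1)
        k3 = f(x + 0.5 * dt * k2)
        k4 = f(x + dt * k3)
        x = x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        xs.append(x)
    return xs

def linear_fit(xs, ys):
    """Ordinary least squares for ys ~ slope*xs + intercept, plus the RMSE."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    sq_err = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    return slope, intercept, math.sqrt(sq_err / n)

# Hypothetical growth laws (right-hand sides only; no closed form is used).
def exponential(x):
    return 0.5 * x

def logistic(x):
    return 0.5 * x * (1.0 - x / 100.0)

results = {}
for name, law in [("exponential", exponential), ("logistic", logistic)]:
    curve_a = rk4_trajectory(law, 5.0, dt=0.05, steps=200)   # starts at 5
    curve_b = rk4_trajectory(law, 10.0, dt=0.05, steps=200)  # starts at 10
    results[name] = linear_fit(curve_a, curve_b)
    slope, intercept, rmse = results[name]
    print(f"{name}: slope={slope:.4f} intercept={intercept:.4f} rmse={rmse:.3e}")
```

In this sketch the linear map fits the exponential trajectories essentially exactly (slope close to 2), while it leaves a sizeable residual for the logistic trajectories, whose symmetry is nonlinear. A more flexible family of candidate maps would be needed to capture the latter.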
References
Balci O (1994) Validation, verification, and testing techniques throughout the life cycle of a simulation study. Ann Oper Res 53(1):121–173
Barlas Y (1989) Multiple tests for validation of system dynamics type of simulation models. Eur J Oper Res 42(1):59–87
Barlas Y (1996) Formal aspects of model validity and validation in system dynamics. Syst Dyn Rev 12(3):183–210
Buchanan RL, Whiting RC, Damert WC (1997) When is simple good enough: a comparison of the Gompertz, Baranyi, and three-phase linear models for fitting bacterial growth curves. Food Microbiol 14(4):313–326
Committee on Mathematical Foundations of Verification, Validation, and Uncertainty Quantification (2012) Assessing the reliability of complex models: mathematical and statistical foundations of verification, validation, and uncertainty quantification. National Academy Press, Washington
Connelly B (2014) Data set for ‘analyzing microbial growth with R’. https://doi.org/10.5281/zenodo.1171129
Fillion N (2017) The vindication of computer simulations. In: Lenhard J, Carrier M (eds) Mathematics as a tool: tracing new roles of mathematics in the sciences. Springer, Cham, pp 137–155
Flach P (2012) Machine learning: the art and science of algorithms that make sense of data. Cambridge University Press, Cambridge
Fujikawa H, Kai A, Morozumi S (2004) A new logistic model for Escherichia coli growth at constant and dynamic temperatures. Food Microbiol 21(5):501–509
Gompertz B (1825) XXIV. On the nature of the function expressive of the law of human mortality, and on a new mode of determining the value of life contingencies. In a letter to Francis Baily, Esq. F. R. S. &c. Philos Trans R Soc Lond 115:513–583
Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning. Springer series in statistics, 2nd edn. Springer, New York
Higham DJ (2001) An algorithmic introduction to numerical simulation of stochastic differential equations. SIAM Rev 43(3):525–546
Jantzen BC (2014) Projection, symmetry, and natural kinds. Synthese 192(11):3617–3646
Jantzen BC (2017) Dynamical kinds and their discovery. In: Proceedings of the UAI 2016 workshop on causation: foundation to application. ArXiv: 1612.04933
Ling Y, Mahadevan S (2013) Quantitative model validation techniques: new insights. Reliab Eng Syst Saf 111:217–231
Liu M, Fan M (2017) Permanence of stochastic Lotka–Volterra systems. J Nonlinear Sci 27(2):425–452
McCarthy MA, Broome LS (2000) A method for validating stochastic models of population viability: a case study of the mountain pygmy-possum (Burramys parvus). J Anim Ecol 69(4):599–607
Miller JH (1998) Active nonlinear tests (ANTs) of complex simulation models. Manag Sci 44(6):820–830
Rhinehart RR (2016) Nonlinear regression modeling for engineering applications: modeling, model validation, and enabling design of experiments. Wiley, Hoboken
Skiadas CH (2010) Exact solutions of stochastic differential equations: Gompertz, generalized logistic and revised exponential. Methodol Comput Appl Probab 12(2):261–270
Sokal RR, Rohlf FJ (1994) Biometry: the principles and practices of statistics in biological research, 3rd edn. W. H. Freeman, New York
Spirtes P, Glymour CN, Scheines R (2000) Causation, prediction, and search. Adaptive computation and machine learning, 2nd edn. MIT Press, Cambridge
Tsoularis A, Wallace J (2002) Analysis of logistic growth models. Math Biosci 179(1):21–55
Verhulst PF (1838) Notice sur la loi que la population suit dans son accroissement. Correspondance Mathématique et Physique X:113–121
Vogels M et al (1975) P. F. Verhulst’s ‘Notice sur la loi que la populations suit dans son accroissement’ from correspondence mathematique et physique. Ghent, vol. X, 1838. J Biol Phys 3(4):183–192
Wilcox JR (2018) Research for practice: highlights in systems verification. Commun ACM 61(2):48–49
Woodward J (2001) Law and explanation in biology: invariance is the kind of stability that matters. Philos Sci 68(1):1–20
Zeigler BP, Praehofer H, Kim TG (2000) Theory of modeling and simulation, 2nd edn. Academic, San Diego
Zwietering MH et al (1990) Modeling of the bacterial growth curve. Appl Environ Microbiol 56(6):1875–1881
Acknowledgements
I am grateful to the participants in the 2015 Algorithms and Complexity in Mathematics, Epistemology and Science (ACMES) conference for insightful discussion of an early algorithm for discovering dynamical kinds, to Cosmo Grant for pointing out a physical inconsistency in the first version of one of my examples, and to Nicolas Fillion for helpful comments on a previous draft of this paper. The work presented here was supported by the National Science Foundation under Grant No. 1454190.
Copyright information
© 2019 Springer Science+Business Media, LLC, part of Springer Nature
About this paper
Cite this paper
Jantzen, B.C. (2019). Dynamical Symmetries and Model Validation. In: Fillion, N., Corless, R., Kotsireas, I. (eds) Algorithms and Complexity in Mathematics, Epistemology, and Science. Fields Institute Communications, vol 82. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-9051-1_6
DOI: https://doi.org/10.1007/978-1-4939-9051-1_6
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-9050-4
Online ISBN: 978-1-4939-9051-1
eBook Packages: Mathematics and Statistics (R0)