Abstract
Testing large-scale systems is expensive in terms of both time and money. Running simulations early in the process is a proven method of finding the design faults likely to lead to critical system failures, but determining the exact cause of those errors is still time-consuming and requires access to a limited number of domain experts. It is therefore desirable to have an automated method that explores the large space of input combinations and isolates likely fault points.
Treatment learning is a subset of minimal contrast-set learning that, rather than classifying data into distinct categories, focuses on finding the unique factors that lead to a particular classification. That is, it finds the smallest change to the data that causes the largest change in the class distribution. These treatments, when imposed, identify the factors most likely to cause a mission-critical failure. The goal of this research is to comparatively assess treatment learning against state-of-the-art numerical optimization techniques. To achieve this, this paper benchmarks the TAR3 and TAR4.1 treatment learners against optimization techniques across three complex systems, including two projects from the Robust Software Engineering (RSE) group within the National Aeronautics and Space Administration (NASA) Ames Research Center. The results clearly show that treatment learning is both faster and more accurate than traditional optimization methods.
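The core idea above can be illustrated with a minimal sketch. This is not the TAR3/TAR4.1 algorithms themselves, which grow conjunctions of attribute ranges and use a weighted class-distribution "lift" measure; it is a simplified, hypothetical version that scores single attribute=value treatments by how much they shift the mean class score away from the baseline. All names and data below are invented for illustration.

```python
# Toy illustration of treatment learning: find the single attribute=value
# constraint ("treatment") whose matching rows have the highest mean class
# score relative to the baseline mean over all rows.

def lift(rows, scores, attr, value):
    """Mean score of rows matching attr == value, divided by the overall mean."""
    subset = [scores[i] for i, r in enumerate(rows) if r[attr] == value]
    if not subset:
        return 0.0
    baseline = sum(scores) / len(scores)
    return (sum(subset) / len(subset)) / baseline

def best_treatment(rows, scores):
    """Return the (attribute, value) pair with the highest lift."""
    candidates = {(a, r[a]) for r in rows for a in r}
    return max(candidates, key=lambda av: lift(rows, scores, *av))

# Hypothetical simulation results: each row is one parameter setting,
# each score grades the outcome (1.0 = success, near 0.0 = critical failure).
rows = [
    {"gain": "high", "mode": "a"},
    {"gain": "high", "mode": "b"},
    {"gain": "low",  "mode": "a"},
    {"gain": "low",  "mode": "b"},
]
scores = [1.0, 1.0, 0.2, 0.4]

print(best_treatment(rows, scores))  # → ('gain', 'high')
```

Here `gain=high` is selected because constraining the data to those rows lifts the mean score from 0.65 to 1.0, the largest shift in the class distribution among the candidate treatments. Inverting the scores would instead isolate the setting most associated with failure, which is how such learners point at likely fault points.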
Gay, G., Menzies, T., Davies, M. et al. Automatically finding the control variables for complex system behavior. Autom Softw Eng 17, 439–468 (2010). https://doi.org/10.1007/s10515-010-0072-x