Abstract
Modern science is turning to progressively more complex and data-rich subjects, which challenges existing methods of data analysis and interpretation. Consequently, there is a pressing need to develop ever more powerful methods of extracting order from complex data and to automate all steps of the scientific process. Virtual Scientist is a set of computational procedures that automate the method of inductive inference to derive a theory from observational data dominated by nonlinear regularities. The procedures utilize SINBAD, a novel computational method of nonlinear factor analysis based on the principle of maximization of mutual information among non-overlapping sources, which yields higher-order features of the data that reveal hidden causal factors controlling the observed phenomena. The procedures build a theory of the studied subject by finding inferentially useful hidden factors, learning interdependencies among its variables, reconstructing its functional organization, and describing it by a concise graph of inferential relations among its variables. The graph is a quantitative model of the studied subject, capable of performing elaborate deductive inferences and of explaining the behaviors of observed variables in terms of the behaviors of other observed variables and of the discovered hidden factors. The set of Virtual Scientist procedures is a powerful analytical and theory-building tool designed for research on complex scientific problems characterized by multivariate and nonlinear relations.
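The principle named in the abstract, finding functions of non-overlapping sets of variables that agree with each other, can be illustrated with a toy example. The sketch below is not the SINBAD algorithm: it substitutes canonical correlation over fixed quadratic features for mutual-information maximization, and the synthetic data, the noise level, and the function names (`make_data`, `quadratic_features`, `first_canonical_pair`) are all assumptions made for illustration. A hidden factor `h` drives two disjoint pairs of observed variables through a product relation, so no linear combination of either pair reveals it.

```python
import numpy as np

def make_data(n=5000, noise=0.05, seed=0):
    """Toy data (assumed): hidden factor h equals x1 * x2 in each source."""
    rng = np.random.default_rng(seed)
    h = rng.uniform(-1.0, 1.0, n)

    def source():
        # Signed magnitude in [0.5, 2]; the random sign removes any
        # linear correlation between the observed variables and h.
        u = rng.choice([-1.0, 1.0], n) * rng.uniform(0.5, 2.0, n)
        x = np.column_stack([u, h / u])
        return x + noise * rng.standard_normal(x.shape)

    return h, source(), source()

def quadratic_features(x):
    """Fixed nonlinear expansion of a two-variable source."""
    x1, x2 = x[:, 0], x[:, 1]
    return np.column_stack([x1, x2, x1**2, x1 * x2, x2**2])

def first_canonical_pair(fa, fb):
    """Largest canonical correlation between two feature sets, and the
    corresponding canonical variate of the first set (QR + SVD method)."""
    qa, _ = np.linalg.qr(fa - fa.mean(axis=0))
    qb, _ = np.linalg.qr(fb - fb.mean(axis=0))
    u, s, _ = np.linalg.svd(qa.T @ qb)
    return s[0], qa @ u[:, 0]

h, a, b = make_data()

# Linear functions of the two sources barely agree with each other...
linear_corr, _ = first_canonical_pair(a, b)

# ...whereas nonlinear functions of the two sources agree strongly,
# and the agreed-upon feature tracks the hidden factor.
nonlinear_corr, factor = first_canonical_pair(
    quadratic_features(a), quadratic_features(b)
)
recovery = abs(np.corrcoef(factor, h)[0, 1])

print(f"linear agreement:    {linear_corr:.3f}")
print(f"nonlinear agreement: {nonlinear_corr:.3f}")
print(f"|corr(feature, h)|:  {recovery:.3f}")
```

In this toy setting the two sources share nothing except the hidden factor, so the feature on which they agree most strongly is, up to sign and scale, an estimate of that factor. The fixed quadratic expansion works here only because the assumed relation is a product; a learned nonlinear mapping would be needed for regularities of unknown form.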
Cite this article
Kurşun, O., Favorov, O.V. SINBAD automation of scientific discovery: From factor analysis to theory synthesis. Natural Computing 3, 207–233 (2004). https://doi.org/10.1023/B:NACO.0000027756.50327.26
DOI: https://doi.org/10.1023/B:NACO.0000027756.50327.26