Abstract
Genetic programming (GP), in conjunction with advanced analytical instruments, is a novel tool for the investigation of complex biological systems at the whole-tissue level. In this study, samples from tomato fruit grown hydroponically under both high- and low-salt conditions were analysed using Fourier-transform infrared spectroscopy (FTIR), with the aim of identifying spectral and biochemical features linked to salinity in the growth environment. FTIR spectra of whole-tissue extracts are not amenable to direct visual analysis, so numerical modelling methods were used to generate models capable of classifying the samples based on their spectral characteristics. GP provided models with better prediction accuracy than the conventional data modelling methods used, whilst being much easier to interpret in terms of the variables used. Examination of the GP-derived models showed that a small number of spectral regions were used consistently. In particular, the spectral region containing absorbances potentially due to a cyanide/nitrile functional group was identified as discriminatory. The explanatory power of the GP models enabled a chemical interpretation of the biochemical differences to be proposed. The combination of FTIR and GP is therefore a powerful and novel analytical tool that, in this study, improves our understanding of the biochemistry of salt tolerance in tomato plants.
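As a purely illustrative sketch of what a simple, interpretable rule over spectral variables might look like, the Python below classifies a sample from the ratio of two absorbances and counts how often each variable recurs across several rules. Everything in it is an assumption made for illustration: the ratio-and-threshold rule form, the variable indices, the thresholds, the absorbance values and the direction of the classification are invented, and none of it reproduces the models reported in the study.

```python
from collections import Counter

# Hypothetical rules of the kind a GP run might evolve. Each rule is
# (numerator_variable, denominator_variable, threshold); the indices stand
# for positions in an FTIR spectrum. All values are invented.
rules = [(120, 45, 1.5), (120, 300, 2.0), (121, 45, 1.2)]


def classify(spectrum, rule):
    """Label a sample 'high-salt' if the absorbance ratio exceeds the threshold.

    `spectrum` maps a variable index to an absorbance value. Which side of
    the threshold means 'high-salt' is an arbitrary choice for this sketch.
    """
    num, den, threshold = rule
    return "high-salt" if spectrum[num] / spectrum[den] > threshold else "low-salt"


def variable_usage(rules):
    """Count how often each spectral variable appears across a set of rules."""
    return Counter(v for num, den, _ in rules for v in (num, den))


# Invented absorbances for one sample, keyed by variable index.
sample = {45: 0.3, 120: 0.9, 121: 0.4, 300: 0.5}
labels = [classify(sample, rule) for rule in rules]
usage = variable_usage(rules)
```

Because each rule names only two variables, the usage counts show directly which spectral positions recur across rules, which is the kind of inspection the abstract describes for the GP-derived models.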
Cite this article
Johnson, H.E., Gilbert, R.J., Winson, M.K. et al. Explanatory Analysis of the Metabolome Using Genetic Programming of Simple, Interpretable Rules. Genetic Programming and Evolvable Machines 1, 243–258 (2000). https://doi.org/10.1023/A:1010014314078
DOI: https://doi.org/10.1023/A:1010014314078