An Inductive Logic Programming Approach to Validate Hexose Binding Biochemical Knowledge

  • Conference paper
Inductive Logic Programming (ILP 2009)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5989)

Abstract

Hexoses are simple sugars that play a key role in many cellular pathways and in the regulation of development and disease mechanisms. Current computational models of protein-sugar binding are based, at least partially, on prior biochemical findings and knowledge, and they incorporate different parts of these findings into predictive black-box models. We investigate the empirical support for biochemical findings by comparing rules induced by Inductive Logic Programming (ILP) to actual biochemical results. We mine the Protein Data Bank for a representative data set of hexose binding sites, non-hexose binding sites and surface grooves. We build an ILP model of hexose-binding sites and evaluate our results against several baseline machine learning classifiers. Our method achieves an accuracy similar to that of the black-box classifiers while providing insight into the discriminating process. In addition, it confirms wet-lab findings and reveals a previously unreported dependency between the amino acids Trp and Glu.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Nassif, H., Al-Ali, H., Khuri, S., Keirouz, W., Page, D. (2010). An Inductive Logic Programming Approach to Validate Hexose Binding Biochemical Knowledge. In: De Raedt, L. (ed.) Inductive Logic Programming. ILP 2009. Lecture Notes in Computer Science, vol 5989. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-13840-9_14

  • DOI: https://doi.org/10.1007/978-3-642-13840-9_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-13839-3

  • Online ISBN: 978-3-642-13840-9

  • eBook Packages: Computer Science, Computer Science (R0)
