Case Study: Structure and Function Prediction of a Protein with No Functionally Characterized Homolog

Computational Intelligence in Biomedicine and Bioinformatics

Part of the book series: Studies in Computational Intelligence (SCI, volume 151)

Summary

The post-genomic era has seen a significant increase in the use of computational prediction methods to gain insight into the structure and function of proteins. For known proteins, prediction tools guide the design of experiments that test hypotheses about structure and function. These tools are particularly useful, however, for putative protein sequences with no known function. The genomic era produced a large number of sequences annotated either as hypothetical proteins or as proteins of unknown function, and current molecular biology techniques cannot study this vast reservoir of genetic information efficiently. Computer algorithms, by contrast, can process large amounts of sequence data to predict structure and function. These knowledge-based tools draw on available experimental data and are regularly updated to improve their predictive power. The simplest form of function prediction compares the query sequence against all available sequences using BLAST. If the query sequence is highly similar to previously characterized proteins, it is likely to have similar functions. If the query sequence has no homolog of known function, more sophisticated computational tools are needed to gain insight into its structure and function. Various methods have been developed to search for known domains, motifs, patterns, or profiles. The quality of the predictions depends on the type of tool used and is limited by how closely the query sequence resembles known proteins.
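To make the idea of searching a sequence for a known pattern concrete, here is a minimal sketch that scans a protein sequence for a motif written in PROSITE-style pattern syntax. It is an illustration under stated assumptions, not the procedure used in the chapter: the function names `prosite_to_regex` and `scan` are hypothetical, the example sequence is invented and is not Msa, and only the basic syntax elements are handled (`x`, `[ABC]`, `{ABC}`, repetition counts, and the `<` and `>` anchors). The pattern used, `N-{P}-[ST]-{P}`, is the N-glycosylation site signature.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):      # anchored at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):        # anchored at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate an optional repetition such as x(2) or x(2,4).
        m = re.fullmatch(r"(.+?)(?:\((\d+(?:,\d+)?)\))?", element)
        core, repeat = m.group(1), m.group(2)
        if core == "x":                  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # excluded residues
        # [ABC] and single letters are already valid regex.
        count = "{" + repeat + "}" if repeat else ""
        parts.append(prefix + core + count + suffix)
    return "".join(parts)


def scan(sequence: str, pattern: str):
    """Return (1-based start, matched residues) for every hit, overlaps included."""
    # A lookahead lets overlapping matches be reported.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence)]


if __name__ == "__main__":
    toy_sequence = "MKNASANPTNGTL"       # invented sequence, for illustration only
    for start, hit in scan(toy_sequence, "N-{P}-[ST]-{P}"):
        print(start, hit)
```

A short pattern like this one matches by chance in many unrelated proteins, so a hit is best treated as a hypothesis to test rather than as evidence of function, which is consistent with the limitation noted above.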

In this chapter, we describe and discuss the methods and tools we used to predict the structure and function of a putative protein sequence (Msa) with unknown function. We address the advantages and limitations of these approaches, using the Msa protein from the human pathogen Staphylococcus aureus as a case study. Msa is a novel protein involved in the regulation of virulence. Because Msa has no known homolog, computational tools are being used to predict its structure and mechanism of action. These predictions are used to design experiments to study Msa and to explore its use as a therapeutic target against antibiotic-resistant infections.

Copyright information

© 2008 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Nagarajan, V., Elasri, M.O. (2008). Case Study: Structure and Function Prediction of a Protein with No Functionally Characterized Homolog. In: Smolinski, T.G., Milanova, M.G., Hassanien, AE. (eds) Computational Intelligence in Biomedicine and Bioinformatics. Studies in Computational Intelligence, vol 151. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70778-3_16

  • DOI: https://doi.org/10.1007/978-3-540-70778-3_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-70776-9

  • Online ISBN: 978-3-540-70778-3

  • eBook Packages: Engineering, Engineering (R0)
