Case Study: Structure and Function Prediction of a Protein with No Functionally Characterized Homolog

Nagarajan, Vijayaraj; Elasri, Mohamed O.

doi:10.1007/978-3-540-70778-3_16

Part of the book series: Studies in Computational Intelligence (SCI, volume 151)

Summary

The post-genomic era has seen a significant increase in the use of computational prediction methods to gain insight into the structure and function of proteins. Prediction tools guide the design of experiments that test hypotheses about the structure and function of known proteins, but they are particularly useful for studying putative protein sequences with no known function. The genomic era produced a large number of sequences that are annotated either as hypothetical proteins or as proteins of unknown function. Current molecular biology techniques cannot efficiently study this vast reservoir of genetic information, whereas computer algorithms can process large amounts of sequence data to predict structure and function. These knowledge-based computational tools draw on available experimental data and are regularly updated to improve their predictive power. The simplest form of function prediction compares the query sequence to all available sequences using BLAST. If the query sequence is highly similar to previously characterized proteins, it is likely to have similar functions. If, however, the query sequence has no homolog of known function, more sophisticated computational tools are needed to gain insight into its structure and function. Various methods have been developed to search for known domains, motifs, patterns, or profiles. The quality of the predictions depends on the type of tools used and is limited by how closely the query sequence resembles known proteins.
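The pattern searches mentioned in the summary can be illustrated with a minimal Python sketch. It is not the chapter's own tooling: it only assumes the widely used PROSITE pattern syntax (elements joined by `-`, `x` for any residue, `[...]` for allowed residues, `{...}` for excluded residues, `(n)` or `(n,m)` for repeats) and translates a pattern into a regular expression. The function names `prosite_to_regex` and `scan` are hypothetical, and the toy sequence is invented for illustration; it is not the Msa sequence.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    regex_parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # '<' anchors the match at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # '>' anchors the match at the C-terminus
            suffix, element = "$", element[:-1]
        # Separate the residue specification from an optional repeat, e.g. x(2,4).
        parsed = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        if parsed is None:
            raise ValueError(f"cannot parse pattern element: {element!r}")
        core, low, high = parsed.groups()
        if core == "x":  # 'x' stands for any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):  # excluded residues
            core = "[^" + core[1:-1] + "]"
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        regex_parts.append(prefix + core + repeat + suffix)
    return "".join(regex_parts)


def scan(sequence: str, pattern: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched residues) for every hit, overlaps included."""
    # A lookahead with a capture group reports overlapping matches as well.
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(sequence.upper())]


if __name__ == "__main__":
    toy_sequence = "MNGTAANPSKNQSPL"  # invented example, NOT the Msa sequence
    # N-{P}-[ST]-{P} is used here only as a familiar example of the syntax.
    print(scan(toy_sequence, "N-{P}-[ST]-{P}."))
```

A short pattern such as this one matches many unrelated sequences by chance, which is one reason hits from pattern searches are treated as hypotheses for experimental testing and not as established function.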

In this chapter, we describe and discuss the methods and tools we used to predict the structure and function of a putative protein sequence (Msa) of unknown function. We address the advantages and limitations of these approaches using the Msa protein from the human pathogen Staphylococcus aureus as a case study. Msa is a novel protein that is involved in the regulation of virulence. Because Msa has no known homolog, computational tools are being used to predict its structure and mechanism of action. These predictions are used to design experiments to study Msa and to explore its use as a therapeutic target to combat antibiotic-resistant infections.

Author information

Authors and Affiliations

Department of Biological Sciences, The University of Southern Mississippi, Hattiesburg, MS 39406, USA
Vijayaraj Nagarajan & Mohamed O. Elasri

Editor information

Editors and Affiliations

Department of Biology, Emory University, 1510 Clifton Rd.NE, 30322, Atlanta, Georgia, USA
Tomasz G. Smolinski
Department of Computer Science, University of Arkansas at Little Rock, 2801 S.University Ave., 72204, Little Rock, Arkansas, USA
Mariofanna G. Milanova
Department of Quantitative Methods and Information Systems College of Business and Administration, Kuwait University, P.O. Box 5486, 13055, Safat, Kuwait
Aboul-Ella Hassanien

About this chapter

Cite this chapter

Nagarajan, V., Elasri, M.O. (2008). Case Study: Structure and Function Prediction of a Protein with No Functionally Characterized Homolog. In: Smolinski, T.G., Milanova, M.G., Hassanien, AE. (eds) Computational Intelligence in Biomedicine and Bioinformatics. Studies in Computational Intelligence, vol 151. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-70778-3_16

DOI: https://doi.org/10.1007/978-3-540-70778-3_16
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-70776-9
Online ISBN: 978-3-540-70778-3
eBook Packages: Engineering, Engineering (R0)
