The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors

Semi-supervised classification of class C GPCRs

  • Original Article
Medical & Biological Engineering & Computing

Abstract

G protein-coupled receptors (GPCRs) are integral cell membrane proteins of relevance to pharmacology. The tertiary structure of the transmembrane domain, a gateway to the study of protein functionality, is unknown for almost all members of class C GPCRs, which are the target of the current study. As a result, their investigation must often rely on alignments of their amino acid sequences. Sequence alignment, however, entails the risk of missing relevant information. Various approaches have attempted to circumvent this risk through alignment-free transformations of the sequences based on different physicochemical properties of the amino acids. In this paper, we use several of these alignment-free methods, as well as a basic amino acid composition representation, to transform the available sequences. Novel semi-supervised statistical machine learning methods are then used to discriminate between the different class C GPCR types from the transformed data. This approach is relevant because of the existence of orphan proteins, to which type labels should be assigned in a process of deorphanization or reverse pharmacology. The reported experiments show that the proposed techniques provide accurate classification even in settings of extreme class-label scarcity, and that fair accuracy can be achieved even with very simple transformation strategies that ignore the sequence ordering.
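To make the simplest of these transformations concrete, the following is a minimal sketch of an amino acid composition representation: each sequence is mapped to a 20-dimensional vector of relative residue frequencies, which by construction ignores the sequence ordering. It is an illustration under stated assumptions, not the authors' implementation; the function name `amino_acid_composition`, the choice to skip non-standard residue codes, and the toy fragment are all hypothetical.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (fixed order defines the vector layout).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return the relative frequency of each standard amino acid in `sequence`.

    Assumption for this sketch: characters outside the 20 standard codes
    (e.g. 'X' or gaps) are skipped rather than counted.
    """
    counts = Counter(res for res in sequence.upper() if res in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Hypothetical toy fragment, not a real receptor sequence.
toy = "MKTLLA"
vector = amino_acid_composition(toy)
```

Because only counts are used, any permutation of the residues yields the same vector, and sequences of different lengths map to vectors of the same dimension without any alignment step.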

Notes

  1. http://www.uniprot.org/.

Acknowledgments

R. Cruz-Barbosa acknowledges the Mexican National Council for Science and Technology for his postdoctoral fellowship. This research is partially funded by Spanish research projects TIN2012-31377, SAF2010-19257, Fundació La Marató de TV3 (110230), RecerCaixa 2010ACUP 00378 and ERA-NET NEURON PCIN-2013-018-C03-02.

Author information

Corresponding author

Correspondence to Raúl Cruz-Barbosa.

Cite this article

Cruz-Barbosa, R., Vellido, A. & Giraldo, J. The influence of alignment-free sequence representations on the semi-supervised classification of class C G protein-coupled receptors. Med Biol Eng Comput 53, 137–149 (2015). https://doi.org/10.1007/s11517-014-1218-y

  • DOI: https://doi.org/10.1007/s11517-014-1218-y
