Domain-Specific Proteogenomic Analysis of Collagens to Evaluate De Novo Sequencing Results and Database Information

  • Original Article
Journal of Molecular Evolution

Abstract

Collagen is an important structural protein and the most abundant protein in mammals, and its structure is analysed in several research fields. Fibrillar collagens consist almost entirely of continuous repeats of GXY, where G is glycine, X is often proline or alanine and Y is often hydroxyproline or alanine. In the present study, collagen structure was investigated in detail at the nucleotide, codon group, amino acid and target peptide level using sequence analyses. One of the most important findings was that a selection of codon groups is predominantly involved in amino acid changes between closely related collagens, and that other change routes appear when collagens are less closely related. The findings of the sequence analyses were used to evaluate reported sequences of non-avian dinosaur species and database entries of duck and chicken collagen. The duck assessment was supported by an experimental data set obtained by collagen extraction from duck skin, subsequent digestion and LC–MS analysis. Database entries of chicken and duck collagen 3α1 were found to contain unreliable features, such as missing parts, interruptions of the continuous GXY pattern and too many interspecies differences. As an example, the erroneous nature of one of these unreliable features was confirmed experimentally using LC–MS. Finally, dinosaur and bird collagen 1α1 were compared. The presented results show that a domain-specific proteogenomic analysis provides very useful information for assessing de novo sequencing results and database information of collagens. Furthermore, it offers deeper insight into the functional restrictions and routes of evolutionary divergence.
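One of the unreliable features named above, the absence of a continuous GXY pattern, can be illustrated with a small check. The following Python sketch is not the authors' method. It assumes a one-letter amino acid sequence whose first residue is in the G position of the repeat, and the function names and the toy peptides are hypothetical. Hydroxyproline has no standard one-letter code, so it appears as P here, as it would in a translated database entry.

```python
def gxy_breaks(sequence: str) -> list:
    """Return 0-based positions where a glycine is expected but absent.

    Assumption: the repeat frame starts at the first residue, so glycine
    is expected at positions 0, 3, 6, ...
    """
    seq = sequence.upper()
    return [i for i in range(0, len(seq), 3) if seq[i] != "G"]


def is_continuous_gxy(sequence: str) -> bool:
    """True if the length is a multiple of three and every triplet starts with G."""
    return len(sequence) % 3 == 0 and not gxy_breaks(sequence)


# Toy peptides invented for illustration, not taken from the study.
regular = "GPAGPPGAA"
interrupted = "GPAGPPAAAGPP"

print(is_continuous_gxy(regular))   # True
print(gxy_breaks(interrupted))      # [6]
```

A reported break is only a flag for closer inspection. Whether it reflects a database or sequencing error has to be judged against related sequences or experimental data, as the study does with LC–MS.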

Acknowledgements

The research was performed in Triskelion study 20959 and was financed by Triskelion. Anne Schulp (Naturalis, Leiden, the Netherlands) provided helpful feedback on an earlier version of the manuscript.

Author information

Corresponding author

Correspondence to Anne J. Kleinnijenhuis.

Ethics declarations

Conflict of interest

The authors declare no conflicts of interest.

Animal Rights

The animal material used during the study (skin from duck) was purchased at a local supermarket.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 15 KB)

Cite this article

Kleinnijenhuis, A.J., van Holthoon, F.L. Domain-Specific Proteogenomic Analysis of Collagens to Evaluate De Novo Sequencing Results and Database Information. J Mol Evol 86, 293–302 (2018). https://doi.org/10.1007/s00239-018-9844-x

  • DOI: https://doi.org/10.1007/s00239-018-9844-x
