Computational Methods of Identification of Pseudogenes Based on Functionality: Entropy and GC Content

Pseudogenes

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1167))

Abstract

Spectral entropy and GC content analyses reveal comprehensive structural features of DNA sequences. To illustrate the significance of these features, we analyze the β-esterase gene cluster, including the Est-6 gene and the ψEst-6 putative pseudogene, in seven species of the Drosophila melanogaster subgroup. The spectral entropies show distinctly lower structural ordering for ψEst-6 than for Est-6 in all species studied. However, entropy accumulation is not a completely random process for either gene, and it proves to be nucleotide dependent. Furthermore, GC content in synonymous positions is uniformly higher in Est-6 than in ψEst-6, in agreement with the reduced GC content generally observed in pseudogenes and nonfunctional sequences. The observed differences in entropy and GC content reflect an evolutionary shift associated with the process of pseudogenization and subsequent functional divergence of ψEst-6 and Est-6 after the duplication event. The data obtained show the relevance and significance of entropy and GC content analyses for pseudogene identification and for the comparative study of gene–pseudogene evolution.
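
The two quantities named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on two assumptions that the abstract does not confirm: GC content at third codon positions (often called GC3) is used as a simple proxy for GC content in synonymous positions, and "spectral entropy" is taken to be the Shannon entropy of the normalized Fourier power spectrum of a single-nucleotide indicator sequence. The chapter's own definition may differ. The function names `gc3` and `spectral_entropy` are hypothetical.

```python
import cmath
import math


def gc3(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame coding sequence.

    Assumption: third codon positions stand in for synonymous positions.
    """
    thirds = cds.upper()[2::3]  # positions 3, 6, 9, ... of the reading frame
    if not thirds:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


def spectral_entropy(seq: str, base: str) -> float:
    """Shannon entropy (in nats) of the normalized power spectrum of one nucleotide.

    Assumption: this is a generic spectral entropy, not necessarily the
    definition used in the chapter. The sequence is encoded as a 0/1
    indicator for `base`; the zero-frequency term is excluded so that
    base composition alone does not dominate the spectrum.
    """
    seq = seq.upper()
    n = len(seq)
    indicator = [1.0 if ch == base else 0.0 for ch in seq]
    powers = []
    for k in range(1, n // 2 + 1):
        # Plain discrete Fourier transform; adequate for short sequences.
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * m / n)
            for m, x in enumerate(indicator)
        )
        powers.append(abs(coeff) ** 2)
    total = sum(powers)
    if total == 0:
        return 0.0
    return -sum((p / total) * math.log(p / total) for p in powers if p > 0)
```

Under this definition, a strictly periodic sequence concentrates its power in a single harmonic and yields an entropy close to zero, whereas a less ordered sequence spreads power over many harmonics and yields a higher value. For sequences of realistic length, a fast Fourier transform would be the more practical choice.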

Acknowledgment

We are grateful to Elena Balakireva for encouragement and help.

Corresponding author

Correspondence to Evgeniy S. Balakirev, Ph.D.

Copyright information

© 2014 Springer Science+Business Media New York

About this protocol

Cite this protocol

Balakirev, E.S., Chechetkin, V.R., Lobzin, V.V., Ayala, F.J. (2014). Computational Methods of Identification of Pseudogenes Based on Functionality: Entropy and GC Content. In: Poliseno, L. (eds) Pseudogenes. Methods in Molecular Biology, vol 1167. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0835-6_4

  • DOI: https://doi.org/10.1007/978-1-4939-0835-6_4

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-0834-9

  • Online ISBN: 978-1-4939-0835-6

  • eBook Packages: Springer Protocols
