Comparative Genomics in Drosophila

  • Protocol
Comparative Genomics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1704))

Abstract

Since the pioneering studies of Thomas Hunt Morgan and coworkers at the dawn of the twentieth century, Drosophila melanogaster and its sister species have contributed tremendously to unveiling the rules underlying animal genetics, development, behavior, evolution, and human disease. Recent advances in DNA sequencing technologies launched Drosophila into the post-genomic era and paved the way for unprecedented comparative genomics investigations. The complete sequencing and systematic comparison of the genomes of 12 Drosophila species represents a milestone in modern biology, enabling studies that range from the annotation of known and novel genomic features to the evolution of chromosomes and, ultimately, of entire genomes. Despite the efforts of countless laboratories worldwide, the vast amount of data produced over the past 15 years is far from fully explored.

In this chapter, we review some of the bioinformatic approaches that were developed to interrogate the genomes of the 12 Drosophila species. Starting from alignments of the entire genomic sequences, the degree of conservation can be evaluated separately for every region of the genome, providing first hints about elements that are under purifying selection and therefore likely functional. Furthermore, the careful analysis of repeated sequences sheds light on the evolutionary dynamics of transposons, an enigmatic and fascinating class of mobile elements housed in the genomes of animals and plants. Comparative genomics also aids in the computational identification of the transcriptionally active part of the genome, first and foremost protein-coding loci, but also transcribed yet apparently noncoding regions, which were once considered "junk" DNA. Finally, the synergy between functional and comparative genomics facilitates in silico and in vivo studies of cis-acting regulatory elements, such as transcription factor binding sites, which pose greater challenges for bioinformatic approaches because of their high degree of sequence variability.
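To make the idea of region-wise conservation concrete, the following is a minimal sketch, not the method used in the chapter. It assumes a toy multiple alignment given as equal-length strings and scores each column by the fraction of sequences that share the most common base, then averages the scores over a sliding window. The function names and the toy alignment are hypothetical, and the naive identity score ignores the phylogeny, which dedicated conservation models take into account.

```python
from collections import Counter


def column_conservation(alignment):
    """Per-column score: fraction of all sequences sharing the most common base.

    Gap characters ('-') never count as agreement, so gappy columns score lower.
    """
    n_seqs = len(alignment)
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for i in range(length):
        bases = [seq[i].upper() for seq in alignment if seq[i] != "-"]
        if not bases:
            scores.append(0.0)  # column consists of gaps only
            continue
        most_common_count = Counter(bases).most_common(1)[0][1]
        scores.append(most_common_count / n_seqs)
    return scores


def windowed_conservation(scores, window):
    """Mean column score for every full window of the given width."""
    if window <= 0 or window > len(scores):
        raise ValueError("window must be between 1 and the alignment length")
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical toy alignment of four species (illustration only).
toy_alignment = [
    "ACGTACGT",
    "ACGTACGA",
    "ACGTTCG-",
    "ACGAACGT",
]
toy_scores = column_conservation(toy_alignment)
toy_windows = windowed_conservation(toy_scores, window=4)
```

Windows with consistently high scores would be candidates for elements under purifying selection. In a real analysis the columns would come from a whole-genome alignment of the 12 species rather than from hand-written strings.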

Author information

Corresponding author

Correspondence to Michael Sammeth.

Copyright information

© 2018 Springer Science+Business Media LLC

About this protocol

Cite this protocol

Oti, M., Pane, A., Sammeth, M. (2018). Comparative Genomics in Drosophila. In: Setubal, J., Stoye, J., Stadler, P. (eds) Comparative Genomics. Methods in Molecular Biology, vol 1704. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7463-4_17

  • DOI: https://doi.org/10.1007/978-1-4939-7463-4_17

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-7461-0

  • Online ISBN: 978-1-4939-7463-4

  • eBook Packages: Springer Protocols
