Abstract
The human microbiome, the collection of microbes residing in or on the human body, has a profound influence on human health. Advances in DNA sequencing technology have made large-scale human microbiome studies possible through shotgun metagenomic sequencing. An important step in analyzing such data is quantifying bacterial abundances from the sequencing reads. Existing methods almost always quantify abundances one sample at a time, which ignores systematic differences in read coverage along the genomes due to GC content, copy number variation, and the bacterial origin of replication. To account for these differences in read counts, we propose a multi-sample Poisson model that quantifies microbial abundances from read counts assigned to species-specific taxonomic markers. The model accounts for marker-specific effects when normalizing the sequencing count data, yielding more accurate quantification of species abundances. Compared with currently available methods on simulated and real data sets, our method quantifies bacterial abundances more accurately, which leads to more biologically interesting results in downstream data analysis.
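As a rough illustration of why pooling samples helps, the sketch below fits a simplified multiplicative Poisson model, E[y_ij] = a_i * t_j, to the marker-by-sample read counts of a single species. This is not the paper's model specification: the rank-one form, the function name `fit_poisson_marker_model`, the `observed` mask, and the constraint that marker effects average to 1 are all assumptions made for the example. Here a_i plays the role of a marker-specific effect and t_j the abundance of the species in sample j.

```python
import numpy as np

def fit_poisson_marker_model(counts, observed=None, n_iter=200, tol=1e-10):
    """Fit E[y_ij] = a_i * t_j by alternating Poisson maximum-likelihood updates.

    counts   : (markers x samples) array of read counts for one species
    observed : optional boolean mask of the same shape; False cells are ignored
    Returns (a, t): marker effects (scaled to mean 1) and per-sample abundances.
    All names and the model form are illustrative assumptions.
    """
    y = np.asarray(counts, dtype=float)
    w = np.ones_like(y) if observed is None else np.asarray(observed, dtype=float)
    y = np.where(w > 0, y, 0.0)          # drop counts in unobserved cells
    a = np.ones(y.shape[0])
    t = np.ones(y.shape[1])
    for _ in range(n_iter):
        t_old = t.copy()
        # Sample abundance given marker effects: observed reads / expected per unit abundance
        t = y.sum(axis=0) / np.maximum(w.T @ a, 1e-12)
        # Marker effect given sample abundances, pooled over all samples
        a = y.sum(axis=1) / np.maximum(w @ t, 1e-12)
        # Identifiability: rescale so the marker effects average to 1
        scale = a.mean()
        a = a / scale
        t = t * scale
        if np.max(np.abs(t - t_old)) < tol * (1.0 + np.max(np.abs(t_old))):
            break
    return a, t

# Toy data: three markers with different coverage effects, four samples.
a_true = np.array([0.5, 1.0, 1.5])
t_true = np.array([10.0, 20.0, 40.0, 80.0])
counts = np.outer(a_true, t_true)
observed = np.ones_like(counts, dtype=bool)
observed[2, 0] = False                   # high-coverage marker unusable in sample 0

a_hat, t_hat = fit_poisson_marker_model(counts, observed)
naive = (counts * observed).sum(axis=0) / observed.sum(axis=0)  # per-sample mean
print("model estimates:", np.round(t_hat, 3))
print("per-sample means:", np.round(naive, 3))
```

In this toy setting the per-sample mean for the first sample is pulled down because its high-coverage marker is unusable, whereas the joint fit borrows that marker's effect from the other samples. With a complete count matrix the two estimates coincide under this simplified form, so the benefit here comes entirely from the masked cell.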
Acknowledgments
Supported by NIH Grants CA127334 and GM097505.
Cite this article
Chen, E.Z., Bushman, F.D. & Li, H. A Model-Based Approach for Species Abundance Quantification Based on Shotgun Metagenomic Data. Stat Biosci 9, 13–27 (2017). https://doi.org/10.1007/s12561-016-9148-x
DOI: https://doi.org/10.1007/s12561-016-9148-x