Abstract
Nucleotide and protein sequences are the foundation for all bioinformatics tools and resources. Researchers can analyze these sequences to discover genes or predict the function of their products. The INSDC (International Nucleotide Sequence Database Collaboration: DDBJ/ENA/GenBank + SRA) maintains an international, centralized primary sequence resource that is freely available on the Internet. This database contains all publicly available nucleotide and derived protein sequences. This chapter discusses the structure and history of the nucleotide sequence database resources built at NCBI, provides information on how to submit sequences to the databases, and explains how to access the sequence data.
Acknowledgement
This research was supported by the Intramural Research Program of the NIH, NLM, NCBI.
Copyright information
© 2017 Springer Science+Business Media New York
About this protocol
Cite this protocol
O’Sullivan, C., Busby, B., Mizrachi, I.K. (2017). Managing Sequence Data. In: Keith, J. (eds) Bioinformatics. Methods in Molecular Biology, vol 1525. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-6622-6_4
DOI: https://doi.org/10.1007/978-1-4939-6622-6_4
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-6620-2
Online ISBN: 978-1-4939-6622-6
eBook Packages: Springer Protocols