Production and Analytic Bioinformatics for Next-Generation DNA Sequencing

Clinical Bioinformatics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 1168))

Abstract

Bioinformatics requirements in the clinical environment are very specific: analytic techniques need to be fit for purpose, robust, and predictable. At the same time, the bewildering amount of information produced during these analyses needs to be carefully managed, used, and interpreted correctly. The challenge for clinical laboratories now is to implement production analytical processes that can handle different experimental approaches on current equipment, and to build in ways for these systems to evolve as developments likely to have an impact in the near future arrive. This is complicated by the many options available at each of the critical processing steps, so a clear method is needed for assembling appropriate pipelines. Here, I discuss the issues relevant to developing an informatics pipeline that meets these criteria, which should allow individual laboratories to assess their proposed strategies.

Abbreviations

BAM: Binary version of SAM file

CNV: Copy number variant

NGS: Next-generation sequencing

Q: Quality score

QC: Quality control

SNV: Single nucleotide variant

SSV: Variant instance

VCF: Variant call format

WES: Whole-exome sequencing

WGS: Whole-genome sequencing

Acknowledgments

I would like to acknowledge the long-standing and ongoing support of Mr Neill Hodgen and the Department of Clinical Immunology, Royal Perth Hospital.

Author information

Corresponding author

Correspondence to Richard James Nigel Allcock.

Copyright information

© 2014 Springer Science+Business Media New York

About this protocol

Cite this protocol

Allcock, R.J.N. (2014). Production and Analytic Bioinformatics for Next-Generation DNA Sequencing. In: Trent, R. (ed) Clinical Bioinformatics. Methods in Molecular Biology, vol 1168. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-0847-9_2

  • DOI: https://doi.org/10.1007/978-1-4939-0847-9_2

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-0846-2

  • Online ISBN: 978-1-4939-0847-9

  • eBook Packages: Springer Protocols