Abstract
Sequencing by synthesis is the underlying technology for many next-generation DNA sequencing platforms. We developed a new model, the fixed flow cycle model, to derive the distributions of sequence length for a given number of flow cycles under general conditions in which nucleotide incorporation is probabilistic and may be incomplete, as in some single-molecule sequencing technologies. Unlike the previous model, the new model yields the probability distribution of the sequence length. Explicit closed-form formulas are derived for the mean and variance of the distribution.
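As an informal illustration of the quantity the abstract describes, the following Monte Carlo sketch estimates the mean and variance of the sequence length after a fixed number of flow cycles. It is not the analytical derivation of the paper, and every modelling detail is an assumption made for illustration: the flow order `ACGT`, a template whose bases are drawn independently with probabilities `base_probs`, a single incorporation probability `incorporation_prob` shared by all nucleotides, and continued extension within one flow when the next template base matches again. The function names `simulate_length` and `estimate_moments` are hypothetical.

```python
import random
from statistics import mean, pvariance

FLOW_ORDER = "ACGT"  # assumed order of nucleotide flows within one cycle


def simulate_length(num_cycles, incorporation_prob, base_probs, rng):
    """Simulate one template and return the length synthesized after num_cycles cycles.

    Assumptions (not taken from the paper): template bases are i.i.d. draws
    from base_probs, and each matching base is incorporated with probability
    incorporation_prob. At least two bases should have positive probability,
    otherwise the loop below cannot terminate when incorporation_prob is 1.
    """
    if not 0.0 <= incorporation_prob <= 1.0:
        raise ValueError("incorporation_prob must lie in [0, 1]")
    bases = list(base_probs)
    weights = [base_probs[b] for b in bases]
    length = 0
    next_base = rng.choices(bases, weights)[0]  # next template base to be read
    for _ in range(num_cycles):
        for flowed in FLOW_ORDER:
            # Extend while the template matches the flowed nucleotide and the
            # (possibly incomplete) incorporation succeeds.
            while next_base == flowed and rng.random() < incorporation_prob:
                length += 1
                next_base = rng.choices(bases, weights)[0]
    return length


def estimate_moments(num_cycles, incorporation_prob, base_probs, trials=2000, seed=0):
    """Return the empirical (mean, variance) of the length over independent trials."""
    rng = random.Random(seed)
    lengths = [
        simulate_length(num_cycles, incorporation_prob, base_probs, rng)
        for _ in range(trials)
    ]
    return mean(lengths), pvariance(lengths)


if __name__ == "__main__":
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    m, v = estimate_moments(num_cycles=10, incorporation_prob=0.9, base_probs=uniform)
    print(f"estimated mean length: {m:.2f}, estimated variance: {v:.2f}")
```

Estimates of this kind can serve as a rough numerical check on closed-form expressions for the mean and variance, provided the simulation assumptions are aligned with those of the model being checked.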
References
Harris TD, Buzby PR, Babcock H, Beer E, Bowers J, Braslavsky I, Causey M, Colonell J, Dimeo J, Efcavitch WJ, Giladi E, Gill J, Healy J, Jarosz M, Lapen D, Moulton K, Quake SR, Steinmann K, Thayer E, Tyurina A, Ward R, Weiss H, Xie Z (2008) Single-molecule DNA sequencing of a viral genome. Science 320(5872): 106–109. doi:10.1126/science.1150427
Kong Y (2009a) Statistical distributions of pyrosequencing. J Comput Biol 16: 31–42. doi:10.1089/cmb.2008.0106
Kong Y (2009b) Statistical distributions of sequencing by synthesis with probabilistic nucleotide incorporation. J Comput Biol 16: 817–827. doi:10.1089/cmb.2008.0215
Wilf HS (2006) Generatingfunctionology. A. K. Peters Ltd., Natick
Cite this article
Kong, Y. Length distribution of sequencing by synthesis: fixed flow cycle model. J. Math. Biol. 67, 389–410 (2013). https://doi.org/10.1007/s00285-012-0556-3
DOI: https://doi.org/10.1007/s00285-012-0556-3
Keywords
- Sequencing by synthesis
- Next-generation sequencing
- Sequence analysis
- Generating function
- Probability
- Combinatorics