
Normalization of Single-Cell RNA-Seq Data

RNA Bioinformatics

Part of the book series: Methods in Molecular Biology ((MIMB,volume 2284))

Abstract

Normalization is an important step in the analysis of single-cell RNA-seq data. While no single method outperforms all others on all datasets, the choice of normalization can have a profound impact on the results. Data-driven metrics can be used to rank normalization methods and select the best performers. Here, we show how to use R/Bioconductor to calculate normalization factors, apply them to compute normalized data, and compare several normalization approaches. Finally, we briefly show how to perform downstream analysis steps on the normalized data.
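To make "calculate normalization factors, apply them to compute normalized data" concrete, here is a minimal sketch of one of the simplest scaling approaches, library-size normalization. It is not the chapter's R/Bioconductor workflow: it is written in plain Python only to show the arithmetic, and the function names, the toy count matrix, and the pseudocount of 1 are all assumptions made for illustration. It also assumes that cells with zero total counts have already been filtered out.

```python
import math

def library_size_factors(counts):
    """Per-cell scaling factors from a genes x cells count matrix.

    Each factor is the cell's total count divided by the mean total
    across cells, so the factors average to 1. Assumes no cell has a
    total count of zero (hypothetical pre-filtering step).
    """
    n_cells = len(counts[0])
    totals = [sum(gene[j] for gene in counts) for j in range(n_cells)]
    mean_total = sum(totals) / n_cells
    return [t / mean_total for t in totals]

def normalize(counts, factors):
    """Divide every count by the scaling factor of its cell."""
    return [[c / f for c, f in zip(gene, factors)] for gene in counts]

def log_transform(normalized, pseudocount=1.0):
    """log2(x + pseudocount); the pseudocount value is an assumption."""
    return [[math.log2(x + pseudocount) for x in gene] for gene in normalized]

# Toy matrix: 3 genes (rows) x 3 cells (columns), illustrative values only.
counts = [
    [10, 20, 5],
    [0, 4, 1],
    [30, 56, 14],
]

factors = library_size_factors(counts)
normalized = normalize(counts, factors)
logcounts = log_transform(normalized)
```

After scaling, every cell in the toy matrix has the same total count, which is the defining property of this particular method. Other normalization approaches estimate the factors differently, but they are applied to the counts in the same way.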




Correspondence to Davide Risso.


Copyright information

© 2021 Springer Science+Business Media, LLC, part of Springer Nature


Cite this protocol

Risso, D. (2021). Normalization of Single-Cell RNA-Seq Data. In: Picardi, E. (eds) RNA Bioinformatics. Methods in Molecular Biology, vol 2284. Humana, New York, NY. https://doi.org/10.1007/978-1-0716-1307-8_17


  • DOI: https://doi.org/10.1007/978-1-0716-1307-8_17


  • Publisher Name: Humana, New York, NY

  • Print ISBN: 978-1-0716-1306-1

  • Online ISBN: 978-1-0716-1307-8

  • eBook Packages: Springer Protocols
