Abstract
Dimensionality reduction is an important first step in the analysis of single-cell RNA-seq (scRNA-seq) data. In addition to enabling the visualization of the profiled cells, such representations are used by many downstream analysis methods, ranging from pseudo-time reconstruction to clustering to alignment of scRNA-seq data from different experiments, platforms, and labs. Both supervised and unsupervised methods have been proposed to reduce the dimension of scRNA-seq data. However, all methods to date are sensitive to batch effects. When batches correlate with cell types, as is often the case, their impact can lead to representations that are batch-specific rather than cell-type-specific. To overcome this, we developed a domain adversarial neural network model for learning a reduced-dimension representation of scRNA-seq data. The adversarial model tries to simultaneously optimize two objectives: the first is the accuracy of cell type assignment, and the second is the inability to distinguish the batch (domain). We tested the method by using the resulting representation to align several different datasets. As we show, by overcoming batch effects, our method was able to correctly separate cell types, improving on several prior methods suggested for this task. Analysis of the top features used by the network indicates that, by taking the batch impact into account, the reduced representation is much better able to focus on key genes for each cell type.
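The two competing objectives can be illustrated with a minimal sketch. It assumes the common domain-adversarial formulation, in which the shared representation is trained to minimize a cell-type classification loss minus a weighted batch classification loss; the abstract does not state the exact loss, so the function names, the trade-off weight `lam`, and the use of cross-entropy for both terms are illustrative assumptions rather than the authors' implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, target):
    """Negative log-probability assigned to the true class index."""
    return -math.log(softmax(logits)[target])

def adversarial_objective(label_logits, cell_type, domain_logits, batch, lam=1.0):
    """Objective seen by the shared representation (hypothetical helper).

    label_loss  : low when the cell type is predicted correctly.
    domain_loss : high when the batch cannot be told apart.
    The representation minimizes label_loss - lam * domain_loss, while
    the batch (domain) classifier itself is trained to minimize domain_loss.
    """
    label_loss = cross_entropy(label_logits, cell_type)
    domain_loss = cross_entropy(domain_logits, batch)
    return label_loss - lam * domain_loss, label_loss, domain_loss

# One cell, three candidate cell types, two batches.
# The batch classifier is maximally confused (uniform logits).
total, l_y, l_d = adversarial_objective([2.0, 0.0, 0.0], 0, [0.0, 0.0], 1, lam=0.5)
print(f"label loss={l_y:.3f}  domain loss={l_d:.3f}  objective={total:.3f}")
```

Under this sketch, a representation that leaves the batch classifier at chance level scores a lower (better) objective than one from which the batch is easy to predict, given the same cell-type accuracy. In practice such objectives are usually optimized with a gradient reversal step between the representation and the batch classifier.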
Acknowledgements
This work was partially supported by National Institutes of Health grants 1R01GM122096 and OT2OD026682 to Z.B.J. and by a Scholars Award in Studying Complex Systems from the James S. McDonnell Foundation to Z.B.J. H.W. was supported by National Institutes of Health grants R01-GM093156 and P30-DA035778.
Copyright information
© 2020 Springer Nature Switzerland AG
About this paper
Cite this paper
Ge, S., Wang, H., Alavi, A., Xing, E., Bar-Joseph, Z. (2020). Supervised Adversarial Alignment of Single-Cell RNA-seq Data. In: Schwartz, R. (ed.) Research in Computational Molecular Biology. RECOMB 2020. Lecture Notes in Computer Science, vol 12074. Springer, Cham. https://doi.org/10.1007/978-3-030-45257-5_5
DOI: https://doi.org/10.1007/978-3-030-45257-5_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-45256-8
Online ISBN: 978-3-030-45257-5
eBook Packages: Computer Science, Computer Science (R0)