Genome-Wide Canonical Correlation Analysis-Based Computational Methods for Mining Information from Microbiome and Gene Expression Data

  • Conference paper
Advances in Artificial Intelligence (Canadian AI 2019)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11489)

Abstract

Multi-omics datasets are high-dimensional and typically contain far fewer samples than features. Canonical correlation analysis (CCA)-based methods are commonly used to reduce the dimensions of such multi-view (multi-omics) datasets, both to test associations among features from different views and to make the data suitable for downstream analyses (classification, clustering, etc.). However, most CCA approaches lack interpretability and perform poorly in downstream analyses. At present, there is no thorough comparison of CCA methods applied to multi-omics datasets (such as microbiome and gene expression datasets). In this study, we address this gap with a detailed comparison of three popular CCA approaches: regularized canonical correlation analysis (RCC), deep canonical correlation analysis (DCCA), and sparse canonical correlation analysis (SCCA), using a multi-omics dataset consisting of microbiome and gene expression profiles. We evaluated the methods in terms of total correlation score and classification performance. We found that SCCA provides reasonable correlation scores in the reduced space, enables interpretability, and provides the best classification performance among the three methods.
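To make the evaluation criterion concrete, the following is a minimal sketch of a regularized CCA and a total correlation score in Python with NumPy. It is not the authors' implementation: the function names (`inv_sqrt`, `regularized_cca`), the ridge penalties `reg_x` and `reg_y`, the synthetic data, and the choice to define the total correlation score as the sum of the top canonical correlations are all assumptions made for illustration.

```python
import numpy as np


def inv_sqrt(mat):
    """Inverse square root of a symmetric positive-definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(mat)
    return eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T


def regularized_cca(X, Y, n_components=2, reg_x=0.1, reg_y=0.1):
    """Ridge-regularized CCA via an SVD of the whitened cross-covariance.

    reg_x and reg_y are hypothetical ridge penalties added to the diagonal
    of each within-view covariance so it stays invertible when there are
    more features than samples.
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    cxx = X.T @ X / (n - 1) + reg_x * np.eye(X.shape[1])
    cyy = Y.T @ Y / (n - 1) + reg_y * np.eye(Y.shape[1])
    cxy = X.T @ Y / (n - 1)
    cxx_is, cyy_is = inv_sqrt(cxx), inv_sqrt(cyy)
    # Singular values are the (regularized) canonical correlations.
    U, s, Vt = np.linalg.svd(cxx_is @ cxy @ cyy_is, full_matrices=False)
    wx = cxx_is @ U[:, :n_components]     # projection weights, view 1
    wy = cyy_is @ Vt.T[:, :n_components]  # projection weights, view 2
    return wx, wy, s[:n_components]


# Synthetic stand-in for two omics views: 40 samples, 200 and 150 features,
# sharing a 2-dimensional latent signal (assumed data, not from the paper).
rng = np.random.default_rng(0)
latent = rng.normal(size=(40, 2))
X = latent @ rng.normal(size=(2, 200)) + 0.5 * rng.normal(size=(40, 200))
Y = latent @ rng.normal(size=(2, 150)) + 0.5 * rng.normal(size=(40, 150))

wx, wy, corrs = regularized_cca(X, Y, n_components=2)
total = corrs.sum()  # assumed definition of the total correlation score
print("canonical correlations:", corrs)
print("total correlation score:", total)
```

In this sketch the ridge terms shrink the reported correlations below the unregularized values, which would otherwise be trivially close to 1 whenever the number of features exceeds the number of samples.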


Acknowledgements

This work was supported in part by the Natural Sciences and Engineering Research Council of Canada, the Manitoba Health Research Council, and the University of Manitoba.

Author information

Corresponding author

Correspondence to Pingzhao Hu.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Shikder, R., Irani, P., Hu, P. (2019). Genome-Wide Canonical Correlation Analysis-Based Computational Methods for Mining Information from Microbiome and Gene Expression Data. In: Meurs, MJ., Rudzicz, F. (eds) Advances in Artificial Intelligence. Canadian AI 2019. Lecture Notes in Computer Science, vol 11489. Springer, Cham. https://doi.org/10.1007/978-3-030-18305-9_53

  • DOI: https://doi.org/10.1007/978-3-030-18305-9_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-18304-2

  • Online ISBN: 978-3-030-18305-9

  • eBook Packages: Computer Science, Computer Science (R0)
