Computational Analysis Workflows for Omics Data Interpretation

  • Protocol
Bioinformatics for Omics Data

Part of the book series: Methods in Molecular Biology (MIMB, volume 719)

Abstract

Progress in experimental procedures has led to rapid availability of Omics profiles. Various open-access as well as commercial tools have been developed for storage, analysis, and interpretation of transcriptomics, proteomics, and metabolomics data. Generally, major analysis steps include data storage, retrieval, preprocessing, and normalization, followed by identification of differentially expressed features, functional annotation on the level of biological processes and molecular pathways, as well as interpretation of gene lists in the context of protein–protein interaction networks. In this chapter, we discuss a sequential transcriptomics data analysis workflow utilizing open-source tools, specifically exemplified on a gene expression dataset on familial hypercholesterolemia.
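To make the steps named above more concrete, the following is a minimal sketch of three of them (normalization, identification of differentially expressed features, and multiple-testing correction) in plain Python. It is not the chapter's workflow or tooling: the function names, the choice of log2 median centering, the exact permutation test, and the Benjamini-Hochberg adjustment are all illustrative assumptions, picked because they fit in a few lines without external libraries.

```python
import math
from itertools import combinations
from statistics import mean, median


def log2_median_center(sample):
    """Log2-transform one sample's intensities and subtract the sample median.

    Hypothetical stand-in for a real normalization method; assumes all
    intensities are strictly positive.
    """
    logged = [math.log2(v) for v in sample]
    center = median(logged)
    return [v - center for v in logged]


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for a difference in group means.

    Enumerates every way of relabeling the pooled values, so it is only
    practical for small groups.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    total = 0
    hits = 0
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in range(len(pooled)) if i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        total += 1
        # Small tolerance so ties with the observed difference are counted.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            hits += 1
    return hits / total


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted
```

In use, each sample would first pass through `log2_median_center`, each feature's case and control values would then go to `permutation_p_value`, and the resulting list of p-values would be adjusted with `benjamini_hochberg` before selecting features. Functional annotation and network interpretation depend on external databases and are not sketched here.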

Author information

Corresponding author

Correspondence to Paul Perco.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Mühlberger, I., Wilflingseder, J., Bernthaler, A., Fechete, R., Lukas, A., Perco, P. (2011). Computational Analysis Workflows for Omics Data Interpretation. In: Mayer, B. (eds) Bioinformatics for Omics Data. Methods in Molecular Biology, vol 719. Humana Press. https://doi.org/10.1007/978-1-61779-027-0_17

  • DOI: https://doi.org/10.1007/978-1-61779-027-0_17

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-61779-026-3

  • Online ISBN: 978-1-61779-027-0

  • eBook Packages: Springer Protocols
