High-Throughput Approaches to Biomarker Discovery and Challenges of Subsequent Validation

General Methods in Biomarker Research and their Applications

Abstract

Recently introduced high-throughput technologies are producing unprecedented volumes of biomedical data available for mining and analysis. However, the early predictions of imminent breakthroughs in our understanding of human diseases, and of easy predictive diagnostics, turned out to be largely overoptimistic.

We argue that this situation is not coincidental but is caused by the statistical properties of the data collected. A typical high-throughput biological dataset is deeply imbalanced: the data matrix includes many measured quantities, or “levels,” in a relatively small number of subjects. Thus, any attempt to analyze these datasets is undermined by the so-called “curse of dimensionality,” which may be addressed by removing the majority of the variables. Feature selection aimed at increasing classification power may be done using data mining or correlation-based approaches. In this chapter, both theory-driven and data-driven approaches to dealing with complexity in biological systems are discussed in detail.
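The imbalance described above can be illustrated with a short simulation. The following is a minimal sketch, not taken from the chapter: the function names, the sample sizes, and the seed are all illustrative assumptions. It generates features that are pure noise, with no relation to a random binary outcome, and reports the strongest correlation found by chance. With few subjects, the best spurious correlation grows as more variables are measured.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def max_spurious_correlation(n_subjects, n_features, seed=0):
    """Largest |r| between a random binary outcome and pure-noise features.

    Hypothetical helper for illustration only: every feature is drawn
    independently of the outcome, so any correlation found is spurious.
    """
    rng = random.Random(seed)
    # Roughly balanced case/control labels, assigned at random.
    labels = [0] * (n_subjects // 2) + [1] * (n_subjects - n_subjects // 2)
    rng.shuffle(labels)
    best = 0.0
    for _ in range(n_features):
        feature = [rng.gauss(0.0, 1.0) for _ in range(n_subjects)]
        best = max(best, abs(pearson(feature, labels)))
    return best


# Same 20 subjects, increasing numbers of measured (noise) variables.
few = max_spurious_correlation(n_subjects=20, n_features=10)
many = max_spurious_correlation(n_subjects=20, n_features=5000)
print(f"best |r| among 10 noise features:   {few:.2f}")
print(f"best |r| among 5000 noise features: {many:.2f}")
```

Because both calls share a seed, the 10 features of the first run are a subset of the 5000 in the second, so the second maximum can only be larger or equal. A feature selected purely for its strong correlation in such a dataset may therefore carry no real signal, which is one reason selected signatures need validation in independent subjects.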

Acknowledgment

The authors gratefully acknowledge the general support provided by the College of Science, George Mason University; by State Contract 14.607.21.0098, dated November 27, 2014 (Ministry of Science and Education, Russia); and by the Human Proteome Scientific Program of the Federal Agency of Scientific Organizations, Russia.

Author information

Corresponding author

Correspondence to Ancha Baranova.

Copyright information

© 2015 Springer Science+Business Media Dordrecht

About this entry

Cite this entry

Veytsman, B., Baranova, A. (2015). High-Throughput Approaches to Biomarker Discovery and Challenges of Subsequent Validation. In: Preedy, V., Patel, V. (eds) General Methods in Biomarker Research and their Applications. Biomarkers in Disease: Methods, Discoveries and Applications. Springer, Dordrecht. https://doi.org/10.1007/978-94-007-7696-8_20
