Feature Selection and Assessment of Lung Cancer Sub-types by Applying Predictive Models

  • Conference paper
Advances in Computational Intelligence (IWANN 2019)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11507)

Abstract

The main goal of this study is to identify a robust set of genes capable of discriminating among the sub-types of lung cancer: Small Cell Lung Carcinoma (SCLC), Adenocarcinoma (ACC), Squamous Cell Carcinoma (SCC) and Large Cell Lung Carcinoma (LCLC). To this end, an overall differential expression analysis was performed on gene expression microarray data publicly available in the NCBI GEO repository. From this analysis, a total of 60 Differentially Expressed Genes (DEGs) were selected and used to develop predictive models that combine supervised machine learning with feature selection algorithms. This yielded a reduced, specific gene signature that allows the lung cancer sub-type of new samples to be identified. The predictive models are assessed in terms of accuracy, F1-score, sensitivity and specificity. Finally, a set of public web platforms providing biological information on genes was used to determine the relation between the final subset of genes and the addressed sub-types of lung cancer.
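
The four assessment metrics named in the abstract can be illustrated for a four-class problem with a short sketch. The abstract does not state how the metrics were computed or averaged across classes, so the one-vs-rest treatment, the function names and the confusion-matrix counts below are assumptions made purely for illustration; they are not the authors' code or results.

```python
from typing import Dict, List

# Class labels as abbreviated in the abstract.
LABELS = ["SCLC", "ACC", "SCC", "LCLC"]

# Hypothetical confusion matrix (invented counts, not results from the paper):
# EXAMPLE[i][j] = number of samples of true class i predicted as class j.
EXAMPLE = [
    [18, 0, 1, 1],
    [0, 25, 3, 2],
    [1, 2, 24, 3],
    [1, 3, 2, 14],
]


def accuracy(confusion: List[List[int]]) -> float:
    """Fraction of samples on the diagonal (correctly classified)."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total if total else 0.0


def per_class_metrics(
    confusion: List[List[int]], labels: List[str]
) -> Dict[str, Dict[str, float]]:
    """One-vs-rest sensitivity, specificity and F1-score for each class."""
    total = sum(sum(row) for row in confusion)
    metrics: Dict[str, Dict[str, float]] = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]                           # true positives
        fn = sum(confusion[k]) - tp                    # missed samples of class k
        fp = sum(row[k] for row in confusion) - tp     # other classes called k
        tn = total - tp - fn - fp                      # everything else
        sensitivity = tp / (tp + fn) if tp + fn else 0.0
        specificity = tn / (tn + fp) if tn + fp else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + sensitivity
        f1 = 2 * precision * sensitivity / denom if denom else 0.0
        metrics[label] = {
            "sensitivity": sensitivity,
            "specificity": specificity,
            "f1": f1,
        }
    return metrics


def macro_average(metrics: Dict[str, Dict[str, float]], name: str) -> float:
    """Unweighted mean of one metric over all classes."""
    return sum(m[name] for m in metrics.values()) / len(metrics)


if __name__ == "__main__":
    scores = per_class_metrics(EXAMPLE, LABELS)
    print("accuracy:", accuracy(EXAMPLE))
    for label, values in scores.items():
        print(label, values)
    print("macro F1:", macro_average(scores, "f1"))
```

Macro-averaging weights every sub-type equally, which may be preferable when some sub-types have few samples; a sample-weighted average is an equally plausible choice that the abstract neither confirms nor rules out.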

Acknowledgements

This research was made possible by the support of project TIN2015-71873-R (Spanish Ministry of Economy and Competitiveness, MINECO, and the European Regional Development Fund, ERDF).

Author information

Corresponding author

Correspondence to Daniel Castillo.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

González, S., Castillo, D., Galvez, J.M., Rojas, I., Herrera, L.J. (2019). Feature Selection and Assessment of Lung Cancer Sub-types by Applying Predictive Models. In: Rojas, I., Joya, G., Catala, A. (eds) Advances in Computational Intelligence. IWANN 2019. Lecture Notes in Computer Science, vol 11507. Springer, Cham. https://doi.org/10.1007/978-3-030-20518-8_73

  • DOI: https://doi.org/10.1007/978-3-030-20518-8_73

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-20517-1

  • Online ISBN: 978-3-030-20518-8

  • eBook Packages: Computer Science, Computer Science (R0)
