On Stability of Feature Selection Based on MALDI Mass Spectrometry Imaging Data and Simulated Biopsy

  • Conference paper
Current Trends in Biomedical Engineering and Bioimages Analysis (PCBEE 2019)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 1033)

Abstract

In this work we analyse MALDI mass spectrometry imaging data for thyroid cancer samples. Such data, which contain information about the spatial distribution of proteins/peptides, make it possible to analyse virtually how fine needle aspiration (FNA) biopsy, a routine diagnostic procedure for the thyroid, influences the outcome, i.e. the set of features that discriminate between cancerous and normal tissue. We hypothesised that an impure dataset (consisting of cancer samples contaminated with normal cells) would be beneficial in terms of stable feature selection. We compared several methods of predictor selection on different datasets to perform an in-depth feature ranking stability analysis for thyroid cancer mass spectrometry data. Furthermore, we examined the impact of sample contamination level on the selection.
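The idea of measuring feature selection stability under sample contamination can be illustrated with a minimal sketch. It is not the authors' pipeline and does not reproduce their results: the data are synthetic Gaussian values, contamination is modelled as a simple linear mixture of cancer and normal signal, features are ranked by an absolute Welch t-statistic, and stability is taken as the mean pairwise Jaccard overlap of top-k lists over bootstrap resamples. All function names and parameters (`simulate`, `stability`, `n_informative`, and so on) are hypothetical choices made for this illustration.

```python
import random
import statistics
from itertools import combinations


def welch_t(a, b):
    """Absolute Welch t-statistic for two lists of values."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def top_k_features(cancer, normal, k):
    """Rank features by |t| and return the indices of the k best as a set."""
    n_features = len(cancer[0])
    scores = [
        welch_t([s[j] for s in cancer], [s[j] for s in normal])
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def mean_jaccard(feature_sets):
    """Mean pairwise Jaccard index over a list of selected-feature sets."""
    pairs = list(combinations(feature_sets, 2))
    return sum(len(a & b) / len(a | b) for a, b in pairs) / len(pairs)


def simulate(contamination, rng, n_samples=30, n_features=50, n_informative=5):
    """Synthetic data (assumption): the first n_informative features are
    shifted in cancer tissue; each cancer sample is a linear mixture of a
    cancer spectrum and a normal spectrum."""
    def spectrum(shift):
        return [rng.gauss(shift if j < n_informative else 0.0, 1.0)
                for j in range(n_features)]

    normal = [spectrum(0.0) for _ in range(n_samples)]
    cancer = []
    for _ in range(n_samples):
        pure, admixed = spectrum(2.0), spectrum(0.0)
        cancer.append([(1 - contamination) * p + contamination * c
                       for p, c in zip(pure, admixed)])
    return cancer, normal


def stability(contamination, k=5, n_resamples=20, seed=0):
    """Stability of the top-k list across bootstrap resamples of the samples."""
    rng = random.Random(seed)
    cancer, normal = simulate(contamination, rng)
    selections = []
    for _ in range(n_resamples):
        boot_cancer = [rng.choice(cancer) for _ in cancer]
        boot_normal = [rng.choice(normal) for _ in normal]
        selections.append(top_k_features(boot_cancer, boot_normal, k))
    return mean_jaccard(selections)


for level in (0.0, 0.3, 0.6):
    print(f"contamination={level:.1f}  mean Jaccard={stability(level):.2f}")
```

A mean Jaccard index of 1 means that every resample selected the same k features, and values near 0 mean that the lists barely overlap. Other rankers (for example a Wilcoxon-based score) or other stability measures could be swapped in without changing the structure of the sketch.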

References

  1. Aha, D.W., Bankert, R.L.: A Comparative Evaluation of Sequential Feature Selection Algorithms, pp. 199–206. Springer, New York (1996)

  2. Bensz, W., Borys, D., Fujarewicz, K., Herok, K., Jaksik, R., Krasucki, M., Kurczyk, A., Matusik, K., Mrozek, D., Ochab, M., et al.: Integrated system supporting research on environment related cancers. In: Król, D., Madeyski, L., Nguyen, N. (eds.) Recent Developments in Intelligent Information and Database Systems, pp. 399–409. Springer, Cham (2016)

  3. Filipczuk, P., Fevens, T., Krzyzak, A., Monczak, R.: Computer-aided breast cancer diagnosis based on the analysis of cytological images of fine needle biopsies. IEEE Trans. Med. Imaging 32(12), 2169–2178 (2013)

  4. Fujarewicz, K., Student, S., Zielański, T., Jakubczak, M., Pieter, J., Pojda, K., Świerniak, A.: Large-scale data classification system based on galaxy server and protected from information leak. In: ACIIDS 2017, pp. 765–773. Springer, Cham (2017)

  5. Gaweł, D., Fujarewicz, K.: On the sensitivity of feature ranked lists for large-scale biological data. Math. Biosci. Eng. MBE 10(3), 677–690 (2013)

  6. Hand, D.J.: Data mining. Based in part on the article ‘Data mining’ by David Hand, which appeared in the Encyclopedia of Environmetrics. American Cancer Society (2013)

  7. Haury, A.-C., Gestraud, P., Vert, J.-P.: The influence of feature selection methods on accuracy, stability and interpretability of molecular signatures. PLOS ONE 6(12), 1–12 (2011)

  8. Kalousis, A., Prados, J., Hilario, M.: Stability of feature selection algorithms. In: Fifth IEEE International Conference on Data Mining (ICDM 2005), p. 8, November 2005

  9. Kim, Y., Jeon, J., Mejia, S., Yao, C.Q., Ignatchenko, V., Nyalwidhe, J.O., Gramolini, A.O., Lance, R.S., Troyer, D.A., Drake, R.R., Boutros, P.C., Semmes, O.J., Kislinger, T.: Targeted proteomics identifies liquid-biopsy signatures for extracapsular prostate cancer. Nat. Commun. 7, 11906 (2016)

  10. MathWorks. Two sample t-test, 23 March 2019

  11. Nakamura, T., Furukawa, Y., Nakagawa, H., Tsunoda, T., Ohigashi, H., Murata, K., Ishikawa, O., Ohgaki, K., Kashimura, N., Miyamoto, M., Hirano, S., Kondo, S., Katoh, H., Nakamura, Y., Katagiri, T.: Genome-wide cDNA microarray analysis of gene expression profiles in pancreatic cancers using populations of tumor cells and normal ductal epithelial cells selected for purity by laser microdissection. Oncogene 23(13), 2385–2400 (2004)

  12. Oreski, D., Oreski, S., Klicek, B.: Effects of dataset characteristics on the performance of feature selection techniques. Appl. Soft Comput. 52, 109–119 (2017)

  13. Pankratz, D.G., Choi, Y., Imtiaz, U., Fedorowicz, G.M., Anderson, J.D., Colby, T.V., Myers, J.L., Lynch, D.A., Brown, K.K., Flaherty, K.R., Steele, M.P., Groshong, S.D., Raghu, G., Barth, N.M., Walsh, P.S., Huang, J., Kennedy, G.C., Martinez, F.J.: Usual interstitial pneumonia can be detected in transbronchial biopsies using machine learning. Ann. Am. Thoracic Soc. 14(11), 1646–1654 (2017). PMID: 28640655

  14. Pietrowska, M., Diehl, H.C., Mrukwa, G., Kalinowska-Herok, M., Gawin, M., Chekan, M., Elm, J., Drazek, G., Krawczyk, A., Lange, D., Meyer, H.E., Polanska, J., Henkel, C., Widlak, P.: Molecular profiles of thyroid cancer subtypes: classification based on features of tissue revealed by mass spectrometry imaging. Biochimica et Biophysica Acta (BBA) Proteins Proteomics 1865(7), 837–845 (2017). MALDI Imaging

  15. Polanski, A., Marczyk, M., Pietrowska, M., Widlak, P., Polanska, J.: Signal partitioning algorithm for highly efficient gaussian mixture modeling in mass spectrometry. PLOS ONE 10(7), 1–19 (2015)

  16. Psiuk-Maksymowicz, K., Płaczek, A., Jaksik, R., Student, S., Borys, D., Mrozek, D., Fujarewicz, K., Świerniak, A.: A holistic approach to testing biomedical hypotheses and analysis of biomedical data. In: Kozielski, S., Mrozek, D., Kasprowski, P., Małysiak-Mrozek, B., Kostrzewa, D. (eds.) BDAS 2015, pp. 449–462. Springer, Cham (2015)

  17. Quon, G., Haider, S., Deshwar, A.G., Cui, A., Boutros, P.C., Morris, Q.: Computational purification of individual tumor gene expression profiles leads to significant improvements in prognostic prediction. Genome Med. 5(3), 29 (2013)

  18. Saeys, Y., Inza, I., Larrañaga, P.: A review of feature selection techniques in bioinformatics. Bioinformatics 23(19), 2507–2517 (2007)

  19. Student, S., Fujarewicz, K.: Stable feature selection and classification algorithms for multiclass microarray data. Biol. Direct 7, 33 (2012). PMID: 23031190

  20. Student, S., Fujarewicz, K.: Stable feature selection and classification algorithms for multiclass microarray data. Biol. Direct 7(1), 33 (2012)

  21. Türeci, Ö., Ding, J., Hilton, H., Bian, H., Ohkawa, H., Braxenthaler, M., Seitz, G., Raddrizzani, L., Friess, H., Buchler, M., Sahin, U., Hammer, J.: Computational dissection of tissue contamination for identification of colon cancer-specific expression profiles. FASEB J. 17(3), 376–385 (2003). PMID: 12631577

  22. Wilcoxon, F.: Individual comparisons by ranking methods. Biometrics Bull. 1(6), 80–83 (1945)

  23. Zhang, S., Zhang, C., Yang, Q.: Data preparation for data mining. Appl. Artif. Intell. 17(5–6), 375–381 (2003)

Acknowledgement

This work was supported by the Polish National Centre for Research and Development under Grant Strategmed2/267398/4/NCBR/2015 and by Silesian University of Technology Grant 02/010/BK-18/0102. Data analysis was partially carried out using the Biotest Platform developed within Project No. PBS3/B3/32/2015, financed by the Polish National Centre for Research and Development (NCBiR) and described in [2, 4, 16]. Calculations were performed using the infrastructure of the computer cluster Ziemowit (www.ziemowit.hpc.polsl.pl), funded by the Silesian BIO-FARMA project No. POIG.02.01.00-00-166/08 and expanded in project POIG.02.03.01-00-040/13, in the Computational Biology and Bioinformatics Laboratory of the Biotechnology Centre at the Silesian University of Technology.

Author information

Corresponding author

Correspondence to Krzysztof Fujarewicz.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Wilk, A., Gawin, M., Frątczak, K., Widłak, P., Fujarewicz, K. (2020). On Stability of Feature Selection Based on MALDI Mass Spectrometry Imaging Data and Simulated Biopsy. In: Korbicz, J., Maniewski, R., Patan, K., Kowal, M. (eds) Current Trends in Biomedical Engineering and Bioimages Analysis. PCBEE 2019. Advances in Intelligent Systems and Computing, vol 1033. Springer, Cham. https://doi.org/10.1007/978-3-030-29885-2_8

  • DOI: https://doi.org/10.1007/978-3-030-29885-2_8

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-29884-5

  • Online ISBN: 978-3-030-29885-2

  • eBook Packages: Engineering, Engineering (R0)
