Effective cancer subtyping by employing density peaks clustering by using gene expression microarray

  • Original Article
  • Personal and Ubiquitous Computing

Abstract

Discovering groups of similar samples, which cannot be identified manually, is a common first step in the analysis of biomedical data. Many supervised and unsupervised machine learning and statistical approaches have been developed to solve this problem. Clustering is an unsupervised learning approach that organizes data into groups of similar items and is used to discover the intrinsic hidden structure of the data. In this paper, we use the clustering by fast search and find of density peaks (CDP) approach for cancer subtyping and for distinguishing normal tissues from tumor tissues. In addition, we address the impact of preprocessing and of the underlying distance matrix on the final groups. We performed extensive experiments on real-world and synthetic cancer gene expression microarray data sets and compared the results with those of state-of-the-art clustering approaches.
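
As a rough illustration of the density peaks idea named in the abstract, the sketch below computes a local density and a distance to the nearest denser point for every sample, picks the samples where both are large as cluster centers, and assigns each remaining sample to the cluster of its nearest denser neighbor. It is a minimal sketch under stated assumptions, not the procedure evaluated in the paper: the function name `density_peaks`, the cutoff distance `d_c`, the fixed number of clusters `n_clusters`, and the use of plain Euclidean distance are all illustrative choices, and no preprocessing is applied.

```python
import numpy as np

def density_peaks(X, d_c, n_clusters):
    """Minimal density peaks clustering sketch (cutoff kernel, Euclidean distance).

    X, d_c and n_clusters are illustrative assumptions, not values from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distance matrix.
    dist = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    # Local density: number of other samples closer than the cutoff d_c.
    rho = (dist < d_c).sum(axis=1) - 1
    # Visit samples from densest to sparsest; the stable sort breaks ties by index.
    order = np.argsort(-rho, kind="stable")
    delta = np.zeros(n)
    nearest_denser = np.full(n, -1)
    for rank, i in enumerate(order):
        if rank == 0:
            # The densest sample has no denser neighbor: use its largest distance.
            delta[i] = dist[i].max()
            continue
        denser = order[:rank]
        j = denser[np.argmin(dist[i, denser])]
        delta[i] = dist[i, j]
        nearest_denser[i] = j
    # Centers combine high density with a large distance to any denser sample.
    gamma = rho * delta
    gamma[order[0]] = np.inf  # the densest sample is always taken as a center
    centers = np.argsort(-gamma, kind="stable")[:n_clusters]
    labels = np.full(n, -1)
    labels[centers] = np.arange(n_clusters)
    # Each remaining sample inherits the label of its nearest denser neighbor,
    # which has already been labeled because of the visiting order.
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[nearest_denser[i]]
    return labels, centers
```

On expression data, each row of `X` would be one tissue sample and each column one gene. Because the method only needs pairwise distances, the distance computation can be swapped for another measure, which is where the choice of preprocessing and distance matrix discussed in the abstract would enter.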

Funding

This research was sponsored by the National Natural Science Foundation of China (Nos. 61571049, 61371185, 61401029, 61472044, and 61472403), the Fundamental Research Funds for the Central Universities (Nos. 2014KJJCB32 and 2013NT57), and SRF for ROCS, SEM.

Author information

Corresponding author

Correspondence to Yunchuan Sun.

Ethics declarations

Conflict of interest

The authors declare that they have no competing interests.

About this article

Cite this article

Mehmood, R., El-Ashram, S., Bie, R. et al. Effective cancer subtyping by employing density peaks clustering by using gene expression microarray. Pers Ubiquit Comput 22, 615–619 (2018). https://doi.org/10.1007/s00779-018-1112-y

  • DOI: https://doi.org/10.1007/s00779-018-1112-y
