Hybrid Method for Cluster Analysis of Big Data

  • Conference paper
Intelligent Computing Techniques for Smart Energy Systems

Part of the book series: Lecture Notes in Electrical Engineering (LNEE, volume 607)

Abstract

Interest in computer-mediated communication has grown within big data analytics. Traditional techniques struggle to handle data at this scale, and existing methods do not suit every kind of situation, so improved methods are needed. Handling big data typically involves several steps (acquisition, preprocessing, processing, and analysis) in order to retrieve meaningful semantics from the data. In this context, clustering has become a popular approach for organizing and analyzing big data. This work proposes a hybrid method for the analysis of big data that blends K-means and Ward hierarchical clustering with an interpolation technique. The proposed approach is evaluated and validated on the city dataset in the R language, varying both the number of clusters and the size of the data. The results show favorable execution times for the proposed method compared with existing ones. The proposed method also offers possible recommendations for extracting specific semantics to provide insights for business recommendations.
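
The abstract does not spell out how the three components are combined, so the following is only a minimal sketch of one common way to pair K-means with Ward's criterion: K-means first compresses the data into a small number of micro-clusters, and Ward's minimum-variance merging is then applied to the micro-cluster centroids, weighted by cluster size. It is written in Python rather than the R used in the paper, the function names (`kmeans`, `ward_merge`) and the synthetic data are hypothetical, and the interpolation step is omitted because the abstract gives no detail about it.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def kmeans(points, k, iters=20, seed=0):
    """Plain Lloyd's K-means. Returns (centroids, sizes) of non-empty clusters."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumption: random initial centroids
    for _ in range(iters):
        groups = [[] for _ in centroids]
        for p in points:
            j = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
            groups[j].append(p)
        groups = [g for g in groups if g]  # drop clusters that became empty
        centroids = [mean(g) for g in groups]
    return centroids, [len(g) for g in groups]

def ward_merge(centroids, sizes, k):
    """Agglomerate weighted centroids down to k clusters using Ward's criterion.

    Merging clusters A and B increases the within-cluster sum of squares by
    n_A * n_B / (n_A + n_B) * ||c_A - c_B||^2; the cheapest pair is merged first.
    Each result is (centroid, size, indices of the merged input centroids).
    """
    clusters = [(c, n, [i]) for i, (c, n) in enumerate(zip(centroids, sizes))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                (ca, na, _), (cb, nb, _) = clusters[i], clusters[j]
                cost = na * nb / (na + nb) * sq_dist(ca, cb)
                if best is None or cost < best[0]:
                    best = (cost, i, j)
        _, i, j = best
        (ca, na, ma), (cb, nb, mb) = clusters[i], clusters[j]
        centroid = tuple((na * x + nb * y) / (na + nb) for x, y in zip(ca, cb))
        merged = (centroid, na + nb, ma + mb)
        clusters = [c for t, c in enumerate(clusters) if t not in (i, j)] + [merged]
    return clusters

# Synthetic data (not the paper's city dataset): three well-separated 2-D blobs.
rng = random.Random(42)
blobs = [(0, 0), (10, 10), (20, 0)]
points = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
          for cx, cy in blobs for _ in range(30)]

micro_centroids, micro_sizes = kmeans(points, k=9)      # stage 1: micro-clusters
final = ward_merge(micro_centroids, micro_sizes, k=3)   # stage 2: Ward merging
for centroid, size, members in final:
    print(size, [round(v, 2) for v in centroid])
```

The pairwise search in `ward_merge` is quadratic in the number of clusters at each step, which is why this sketch runs it only on the handful of K-means centroids rather than on every data point.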

Corresponding author

Correspondence to Gaurav Kumar Nigam.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Dabas, C., Nigam, G.K. (2020). Hybrid Method for Cluster Analysis of Big Data. In: Kalam, A., Niazi, K., Soni, A., Siddiqui, S., Mundra, A. (eds) Intelligent Computing Techniques for Smart Energy Systems. Lecture Notes in Electrical Engineering, vol 607. Springer, Singapore. https://doi.org/10.1007/978-981-15-0214-9_17

  • DOI: https://doi.org/10.1007/978-981-15-0214-9_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-15-0213-2

  • Online ISBN: 978-981-15-0214-9

  • eBook Packages: Engineering, Engineering (R0)
