Entropy and sigmoid based K-means clustering and AGWO for effective big data handling

Vankdothu, Ramdas; Hameed, Mohd Abdul; Bhukya, Raju; Garg, Gaurav

doi:10.1007/s11042-022-13929-2

Published: 03 October 2022

Volume 82, pages 15287–15304, (2023)
Multimedia Tools and Applications

Ramdas Vankdothu ORCID: orcid.org/0000-0002-8478-1291¹,
Mohd Abdul Hameed¹,
Raju Bhukya² &
…
Gaurav Garg³

165 Accesses
2 Citations

This article has been updated

Abstract

In this article, we present an approach to handling big data effectively using adaptive clustering and optimization techniques. First, heterogeneous data is collected from multiple sources and transformed into the desired network graphs. Then, by finding patterns in the graphs, the module distributes the data into the right data blocks using entropy and sigmoid based K-means clustering. Subsequently, an adaptive grey wolf optimization (AGWO) algorithm in the Hadoop distributed file system (HDFS) distributes the data blocks to the right machines. This optimized HDFS serves as a data source for services that execute queries, and it provides a platform for applying graph algorithms efficiently while reducing resource usage. As a result, the approach handles a broad range of data types while managing query time and resource usage. In the experiments, the proposed work gives better results than existing methods such as GWO and PSO in terms of algorithm run time, loading time, resource usage, query time, query execution time, and convergence.
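
The abstract compares AGWO against the grey wolf optimizer (GWO) but does not describe the adaptive variant or the block-placement cost function. As a point of reference only, the following is a minimal sketch of the standard GWO update rule that such a baseline would typically follow. It is not the authors' AGWO; the function name `gwo_minimize`, its parameters, and the `sphere` test objective are assumptions made for illustration.

```python
import random


def sphere(x):
    """Toy objective (assumption for illustration): sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def gwo_minimize(objective, dim, bounds, n_wolves=20, n_iters=200, seed=0):
    """Minimize `objective` over a box with the standard grey wolf optimizer.

    This is the baseline GWO, not the adaptive variant named in the abstract.
    `bounds` is a (low, high) pair applied to every dimension.
    """
    rng = random.Random(seed)
    low, high = bounds
    # Random initial pack inside the search box.
    wolves = [[rng.uniform(low, high) for _ in range(dim)] for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")

    for t in range(n_iters):
        # The three fittest wolves (alpha, beta, delta) lead the hunt.
        ranked = sorted(wolves, key=objective)
        leaders = [list(w) for w in ranked[:3]]
        alpha_val = objective(leaders[0])
        if alpha_val < best_val:
            best_pos, best_val = list(leaders[0]), alpha_val

        # Control parameter a decays linearly from 2 to 0 (exploration to exploitation).
        a = 2.0 * (1.0 - t / n_iters)

        new_wolves = []
        for w in wolves:
            new_w = []
            for j in range(dim):
                total = 0.0
                for leader in leaders:
                    A = 2.0 * a * rng.random() - a
                    C = 2.0 * rng.random()
                    D = abs(C * leader[j] - w[j])
                    total += leader[j] - A * D
                # New coordinate is the mean of the three leader-guided moves, clipped to the box.
                new_w.append(min(high, max(low, total / 3.0)))
            new_wolves.append(new_w)
        wolves = new_wolves

    # Account for the pack produced by the final update.
    final = min(wolves, key=objective)
    if objective(final) < best_val:
        best_pos, best_val = list(final), objective(final)
    return best_pos, best_val


if __name__ == "__main__":
    position, value = gwo_minimize(sphere, dim=3, bounds=(-10.0, 10.0))
    print(position, value)
```

In a block-placement setting, the objective would score a candidate assignment of data blocks to machines (for example by estimated load or query cost); that mapping is not specified on this page, so the sketch uses a generic continuous objective instead.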

Data availability

Data sharing is not applicable to this article, as no datasets were generated or analyzed during the current study.

Change history

20 October 2022
Dr Gaurav Garg's affiliation has been updated. 'Himachal Pradesh' has been inserted after 'Baddi'.

Author information

Authors and Affiliations

Department of Computer Science & Engineering, Osmania University Hyderabad, Hyderabad, India
Ramdas Vankdothu & Mohd Abdul Hameed
Department of Computer Science & Engineering, National Institute of Technology, Trichy, India
Raju Bhukya
School of Engineering and Technology-Computer Science and Engineering, Chitkara University, Baddi, Himachal Pradesh, India
Gaurav Garg

Corresponding author

Correspondence to Ramdas Vankdothu.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Informed consent

Informed consent is not applicable to this article.

Ethical approval

Ethical approval is not applicable to this paper. This article does not contain any studies with human participants performed by any of the authors.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Vankdothu, R., Hameed, M.A., Bhukya, R. et al. Entropy and sigmoid based K-means clustering and AGWO for effective big data handling. Multimed Tools Appl 82, 15287–15304 (2023). https://doi.org/10.1007/s11042-022-13929-2

Received: 29 December 2021
Revised: 03 August 2022
Accepted: 12 September 2022
Published: 03 October 2022
Issue Date: April 2023
DOI: https://doi.org/10.1007/s11042-022-13929-2
