Abstract
Contemporary techniques for processing soil analytical databases do not optimize the placement of databases and tables on storage grids, and they are limited to a single database instance when handling transactions over large volumes of soil analytical data. These limitations increase data processing overheads in private agricultural cloud services. In this paper, we propose a Predictive Scalability Generator (PS-Gen) technique that optimizes and partitions large data sets by dynamically creating indexed databases and tables on clustered storage grids. The design is informed by a study of the k-means clustering algorithm, which is widely used in cloud database processing systems. Our approach creates databases and tables dynamically within a cluster by monitoring the cluster balancer that the system defines for handling large data sets. In addition, it enables quick, dynamic movement of databases and tables within clusters to manage load actively through row-specific tuple management, which is achieved by integrating horizontal sharding into the proposed method. The experimental results show that the approach manages large agricultural data sets effectively in private cloud systems by balancing load across clusters. Further, the approach is flexible enough to accommodate network subsystems and to support the development of efficient cloud-based application systems.
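The combination of dynamic table creation, a balancer, and row-level movement described above can be illustrated with a small in-memory sketch. This is not the PS-Gen implementation: the `Shard` and `Cluster` classes, the `max_rows_per_shard` threshold, and the sample record fields (`ph`, `site`) are all hypothetical names chosen for illustration, and the balancing rule (move rows from the fullest shard to the emptiest until sizes differ by at most one) is an assumed stand-in for whatever policy the cluster balancer actually applies.

```python
from dataclasses import dataclass, field


@dataclass
class Shard:
    """One horizontal partition: a subset of the rows, keyed by row id."""
    name: str
    rows: dict = field(default_factory=dict)


class Cluster:
    """Hypothetical cluster that creates shards on demand and rebalances rows."""

    def __init__(self, max_rows_per_shard):
        self.max_rows = max_rows_per_shard
        self.shards = [Shard("shard_0")]

    def insert(self, key, record):
        # Place the row on the least-loaded shard.
        target = min(self.shards, key=lambda s: len(s.rows))
        if len(target.rows) >= self.max_rows:
            # Every shard is full, so create a new one dynamically.
            target = Shard(f"shard_{len(self.shards)}")
            self.shards.append(target)
        target.rows[key] = record

    def find(self, key):
        # Linear scan over shards; a real system would keep a routing index.
        for shard in self.shards:
            if key in shard.rows:
                return shard.rows[key]
        return None

    def sizes(self):
        return [len(s.rows) for s in self.shards]

    def rebalance(self):
        """Move single rows from the fullest shard to the emptiest one
        until shard sizes differ by at most one. Returns rows moved."""
        moved = 0
        while True:
            big = max(self.shards, key=lambda s: len(s.rows))
            small = min(self.shards, key=lambda s: len(s.rows))
            if len(big.rows) - len(small.rows) <= 1:
                return moved
            key, record = big.rows.popitem()
            small.rows[key] = record
            moved += 1


# Usage: seven hypothetical soil-sample rows with a capacity of three per shard.
cluster = Cluster(max_rows_per_shard=3)
for i in range(7):
    cluster.insert(i, {"site": f"plot-{i}", "ph": 6.0 + 0.1 * i})

print(cluster.sizes())      # [3, 3, 1]
print(cluster.rebalance())  # 1
print(cluster.sizes())      # [2, 3, 2]
```

The sketch keeps rows whole when moving them, which is the defining property of horizontal sharding: each shard holds complete tuples for a subset of keys rather than a subset of columns.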
Acknowledgements
This work is supported and funded by the team of AICRP on STCR (Soil Test Crop Response), University of Agricultural Sciences (UAS), Gandhi Krishi Vignana Kendra (GKVK), Bangalore, Karnataka, India. It is also financially supported by the New Age Incubation Network (NAIN), ICT Skill Development Society, Department of IT, BT and S & T, Govt. of Karnataka (Ref No: ICTSDS/CEO/17/2014-15), and by the Vision Group on Science and Technology (VGST) scheme of RFTT, Govt. of Karnataka (Ref No: KSTePS/VGST-RFTT/2016-17/279/6).
Cite this article
Leena, H.U., Premasudha, B.G. & Basavaraja, P.K. Data optimisation and partitioning in private cloud using dynamic clusters for agricultural datasets. Int. J. Dynam. Control 8, 1027–1039 (2020). https://doi.org/10.1007/s40435-019-00596-9
DOI: https://doi.org/10.1007/s40435-019-00596-9