Adding Big Value to Big Businesses: A Present State of the Art of Big Data, Frameworks and Algorithms

Radhika, D.; Aruna Kumari, D.

doi:10.1007/978-981-10-6602-3_17


Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 653)


Abstract

Data plays a pivotal role in business growth and is widely regarded as an organizational asset. This is most evident in enterprises that preserve and mine their data to discover knowledge. Data that grows exponentially and is characterized by volume, velocity, and variety is termed big data. Mining such voluminous data can yield comprehensive business intelligence for strategic decision-making. The emergence of cloud computing, the parallel processing power of servers, and distributed programming frameworks such as Hadoop, with its “MapReduce” programming paradigm, pave the way for mining massive-scale data. The data mining domain is rich in algorithms for discovering trends in data. However, mining big data is beyond the capability of conventional data mining techniques: the unprecedented exponential growth of data calls for a platform that supports effective real-time analysis with fast response. In this paper, we present an overview of big data, the mechanisms or algorithms used to mine it, and the environments or tools needed to execute them. The rationale behind this paper is that big data mining is urgently needed in sectors such as finance, biology, healthcare, banking, insurance, and environmental research, to name a few. A review of the various aspects of big data mining can help readers gain know-how in the context of globalization and business collaborations, where mining cross-organization data is essential. This paper also sheds light on the relationships among big data, cloud computing, Hadoop, and big data storage systems. In the future, we intend to propose and implement algorithms for big data mining.
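The abstract names MapReduce as the programming paradigm behind frameworks such as Hadoop. As a rough illustration only, the following single-process Python sketch shows the general map, shuffle, and reduce steps on a word-count task. It is not Hadoop code and is not taken from the chapter; every function name here (map_phase, shuffle, reduce_phase, word_mapper, sum_reducer, word_count) is a hypothetical name chosen for this example, and a real cluster would distribute each phase across many machines.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every input record and yield (key, value) pairs."""
    for record in records:
        yield from mapper(record)


def shuffle(pairs):
    """Group values by key, mimicking the shuffle step between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups, reducer):
    """Collapse each key's list of values into a single result."""
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Hypothetical mapper: emit (word, 1) for each whitespace-separated word."""
    for word in line.lower().split():
        yield word, 1


def sum_reducer(key, values):
    """Hypothetical reducer: total the counts emitted for one word."""
    return sum(values)


def word_count(lines):
    """Run the three phases in sequence inside one process."""
    return reduce_phase(shuffle(map_phase(lines, word_mapper)), sum_reducer)


if __name__ == "__main__":
    sample = ["big data needs big tools", "data mining at scale"]
    print(word_count(sample))
```

The mapper and reducer are passed in as plain functions, so the same three phases could in principle be reused for other aggregations, which is the property that lets such frameworks parallelize user-supplied logic.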



Author information

Authors and Affiliations

Computer Science Engineering, K L University, Guntur, AP, India
D. Radhika
Department ECM, K L University, Guntur, AP, India
D. Aruna Kumari


Corresponding author

Correspondence to D. Radhika.

Editor information

Editors and Affiliations

USMS, GGSIP University, New Delhi, Delhi, India
A. K. Saini
Indian Institute of Business Management, Patna, Bihar, India
A. K. Nayak
Institute of Life Long Learning (ILLL), New Delhi, Delhi, India
Ram Krishna Vyas


Cite this paper

Radhika, D., Aruna Kumari, D. (2018). Adding Big Value to Big Businesses: A Present State of the Art of Big Data, Frameworks and Algorithms. In: Saini, A., Nayak, A., Vyas, R. (eds) ICT Based Innovations. Advances in Intelligent Systems and Computing, vol 653. Springer, Singapore. https://doi.org/10.1007/978-981-10-6602-3_17


DOI: https://doi.org/10.1007/978-981-10-6602-3_17
Published: 01 October 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-6601-6
Online ISBN: 978-981-10-6602-3
eBook Packages: Engineering, Engineering (R0)
