Adding Big Value to Big Businesses: A Present State of the Art of Big Data, Frameworks and Algorithms

  • Conference paper
ICT Based Innovations

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 653))

Abstract

Data plays a pivotal role in business growth and is widely regarded as an organizational asset. This is most evident in enterprises that preserve and mine their data to discover knowledge. Data that grows exponentially and is characterized by volume, velocity, and variety is termed big data. Mining such voluminous data can yield comprehensive business intelligence for strategic decision-making. The emergence of cloud computing, the parallel processing power of servers, and distributed programming frameworks such as Hadoop, with its “MapReduce” programming paradigm, pave the way for mining massive-scale data. The data mining domain is rich in algorithms for discovering trends. However, the era of big data has arrived, and mining such data is beyond the capability of conventional data mining techniques. The unprecedented growth of data calls for a platform that supports effective real-time analysis with fast response. In this paper, we present an overview of big data, the mechanisms and algorithms used to mine it, and the environments and tools needed to execute them. The rationale is that big data mining is needed across sectors such as finance, biology, healthcare, banking, insurance, and environmental research, to name a few. A review of the various aspects of big data mining can help readers gain know-how in the context of globalization and business collaborations, where mining cross-organization data is essential. This paper also sheds light on the relationship among big data, cloud computing, Hadoop, and big data storage systems. In future work, we intend to propose and implement algorithms for big data mining.

Corresponding author

Correspondence to D. Radhika.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Radhika, D., Aruna Kumari, D. (2018). Adding Big Value to Big Businesses: A Present State of the Art of Big Data, Frameworks and Algorithms. In: Saini, A., Nayak, A., Vyas, R. (eds) ICT Based Innovations. Advances in Intelligent Systems and Computing, vol 653. Springer, Singapore. https://doi.org/10.1007/978-981-10-6602-3_17

  • DOI: https://doi.org/10.1007/978-981-10-6602-3_17

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-6601-6

  • Online ISBN: 978-981-10-6602-3

  • eBook Packages: Engineering, Engineering (R0)
