Managing Lifecycle of Big Data Applications

  • Conference paper
Knowledge Engineering and Semantic Web (KESW 2017)

Abstract

The growing digitization and networking of our society has a large influence on all aspects of everyday life. Large amounts of data are being produced continuously, and when these are analyzed and interlinked they have the potential to create new knowledge and intelligent solutions for the economy and society. To process this data, we developed the Big Data Integrator (BDI) Platform, which provides various Big Data components out-of-the-box. Integrating the components inside the BDI Platform requires homogenizing them, which in turn standardizes the development process. To support these activities we created the BDI Stack Lifecycle (SL), which consists of development, packaging, composition, enhancement, deployment and monitoring steps. In this paper, we show how we support the BDI SL with the enhancement applications developed in the Big Data Europe (BDE) project. As an evaluation, we demonstrate the applicability of the BDI SL on three pilots in the domains of transport, social sciences and security.

Notes

  1. http://hadoop.apache.org/.
  2. https://spark.apache.org/.
  3. https://flink.apache.org/.
  4. https://hive.apache.org/.
  5. https://www.big-data-europe.eu/bdi-components/.
  6. At the time of writing, more than 30 components are available in the BDE Components Library.
  7. https://hortonworks.com/.
  8. https://www.cloudera.com/.
  9. https://mapr.com.
  10. http://bigtop.apache.org/.
  11. https://dcos.io/.
  12. https://jujucharms.com/store.
  13. https://github.com/big-data-europe/.
  14. https://github.com/big-data-europe/app-stack-builder.
  15. https://kafka.apache.org/.
  16. https://github.com/big-data-europe/app-pipeline-builder.
  17. https://github.com/big-data-europe/mu-pipeline-service.
  18. https://github.com/big-data-europe/mu-bde-logging.
  19. https://github.com/big-data-europe/app-integrator-ui.
  20. https://github.com/big-data-europe/app-swarm-ui.
  21. https://www.big-data-europe.eu/pilot-transport/.
  22. https://www.elastic.co/products/elasticsearch.
  23. https://www.elastic.co/products/kibana.
  24. https://github.com/big-data-europe/pilot-sc4-fcd-applications.
  25. https://www.big-data-europe.eu/pilot-social-sciences/.
  26. http://linkedeconomy.org/en.
  27. https://github.com/LinkedEcon/LinkedEconomyOntology-ELOD.
  28. http://www.accountingverse.com/managerial-accounting/fs-analysis/financial-ratios.html.
  29. http://flume.apache.org.
  30. https://docs.oracle.com/javase/tutorial/sound/SPI-intro.html.
  31. https://virtuoso.openlinksw.com/.
  32. https://en.wikipedia.org/wiki/Financial_ratio.
  33. https://www.w3.org/TR/sparql11-query/.
  34. https://www.poolparty.biz/.
  35. https://www.poolparty.biz/poolparty-semantic-graph-search-server/.
  36. https://github.com/big-data-europe/pilot-sc6-cycle2.
  37. https://docs.docker.com/engine/reference/builder/#healthcheck.
  38. https://github.com/big-data-europe/mu-bde-logging.
  39. https://www.big-data-europe.eu/security/.
  40. http://cassandra.apache.org/.
  41. http://www.gadm.org/.
  42. https://scihub.copernicus.eu/.
  43. The typical loading time for a set of images: 400 s.
  44. The typical Spark job execution time for Change Detector: 1000 s.
  45. https://github.com/big-data-europe/pilot-sc7-change-detector.

Acknowledgments

This work was supported by a grant from the European Union’s Horizon 2020 research and innovation program for the project Big Data Europe (GA no. 644564).

Author information

Corresponding author

Correspondence to Ivan Ermilov.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Ermilov, I. et al. (2017). Managing Lifecycle of Big Data Applications. In: Różewski, P., Lange, C. (eds) Knowledge Engineering and Semantic Web. KESW 2017. Communications in Computer and Information Science, vol 786. Springer, Cham. https://doi.org/10.1007/978-3-319-69548-8_18

  • DOI: https://doi.org/10.1007/978-3-319-69548-8_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-69547-1

  • Online ISBN: 978-3-319-69548-8

  • eBook Packages: Computer Science, Computer Science (R0)
