Abstract
Cloud-based data stream processing has recently emerged as a way to process huge amounts of data. During such processing, the actual characteristics of the data streams, e.g., volume or velocity, may vary. For example, in the financial domain, hectic markets can cause bursty event streams whose characteristics change by several orders of magnitude. To handle such situations, it is desirable to adapt the data processing at runtime. While several techniques for changing data stream processing at runtime exist, one specific challenge is to minimize the impact of runtime adaptation on the data processing, in particular for real-time data analytics.
In this research work, we aim to perform runtime adaptation in cloud-based data stream processing, namely by dynamically switching between alternative distributed algorithms that offer similar functionality but exhibit different characteristics (tradeoffs). The goal of this work is to provide a generic approach that can automatically determine the algorithm switch with minimized impact on the data processing. To achieve this goal, we introduce the concept of a “safe” (transparent, gap-free) switch, which takes the characteristics of the alternative algorithms into account. For the actual switch, we combine stream re-routing with buffering and stream synchronization, along with support for dynamic deployment of alternative stream processing algorithms into the cloud.
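The combination of re-routing and buffering described above can be illustrated with a small single-process sketch. It is not the authors' implementation: the class `SwitchableOperator`, its methods, and the two toy algorithms are hypothetical names chosen for illustration, and distribution, state transfer, and cloud deployment are left out entirely. The sketch only shows the gap-free idea: tuples arriving during a switch are buffered, then replayed in arrival order to the target algorithm before new tuples are admitted.

```python
from collections import deque


class SwitchableOperator:
    """Hypothetical operator that routes tuples to one of several algorithms."""

    def __init__(self, algorithms, active):
        self.algorithms = algorithms  # name -> callable(item) -> result
        self.active = active          # name of the algorithm currently in use
        self.switching = False        # True while a switch is in progress
        self.buffer = deque()         # tuples held back during the switch
        self.output = []              # results in emission order

    def process(self, item):
        # During a switch, hold the tuple back instead of dropping it.
        if self.switching:
            self.buffer.append(item)
            return
        self.output.append(self.algorithms[self.active](item))

    def begin_switch(self):
        # From now on, incoming tuples are buffered rather than processed.
        self.switching = True

    def complete_switch(self, target):
        # Re-route to the target algorithm and replay buffered tuples in
        # arrival order, so the output has no gap and no reordering.
        self.active = target
        while self.buffer:
            self.output.append(self.algorithms[target](self.buffer.popleft()))
        self.switching = False


def demo():
    # Two toy algorithms with the same functionality (doubling a value);
    # the tag shows which one produced each result.
    op = SwitchableOperator(
        {"exact": lambda x: ("exact", x * 2),
         "approx": lambda x: ("approx", x * 2)},
        active="exact",
    )
    op.process(1)
    op.process(2)
    op.begin_switch()
    op.process(3)  # buffered
    op.process(4)  # buffered
    op.complete_switch("approx")
    op.process(5)
    return op.output
```

In this sketch every input tuple yields exactly one result and the results keep their arrival order, which is one simple reading of "gap-free". A distributed setting would additionally have to synchronize the outputs of the old and new algorithm, which the sketch does not attempt.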
Supervisors: Klaus Schmid, Holger Eichelberger.
Acknowledgments
This work is partially supported by the European Commission in the 7th framework programme through the QualiMaster project (grant 619525).
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Qin, C. (2018). Impact-Minimizing Runtime Adaptation in Cloud-Based Data Stream Processing. In: Lazovik, A., Schulte, S. (eds) Advances in Service-Oriented and Cloud Computing. ESOCC 2016. Communications in Computer and Information Science, vol 707. Springer, Cham. https://doi.org/10.1007/978-3-319-72125-5_22
DOI: https://doi.org/10.1007/978-3-319-72125-5_22
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-72124-8
Online ISBN: 978-3-319-72125-5
eBook Packages: Computer Science (R0)