Abstract
Processing streams of Linked Data, rather than static files, has gained increasing importance in the Web of Data. When processing data streams, system builders face the conundrum of guaranteeing a constant maximum response time with limited resources and, possibly, no prior information on the data arrival frequency. One approach to address this issue is to delete data from a cache during processing, a process we call eviction. The goal of this paper is to show that data-driven eviction outperforms today’s dominant data-agnostic approaches such as first-in-first-out or random deletion.
Specifically, we first introduce a method called Clock that evicts data from a join cache based on an estimate of the likelihood that the data will contribute to a join in the future. Second, using the well-established SRBench benchmark as well as a data set from the IPTV domain, we show that Clock outperforms data-agnostic approaches, indicating its usefulness for resource-limited Linked Data stream processing.
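To make the contrast between data-agnostic and data-aware eviction concrete, the following is a minimal sketch and not the Clock algorithm itself. It assumes a bounded join cache in which each entry carries a score that is raised whenever the entry contributes to a join. The class name `ScoredJoinCache`, the hit-counter score, and the sample keys are all hypothetical stand-ins for a real likelihood estimate.

```python
from collections import OrderedDict


class ScoredJoinCache:
    """Bounded join cache that evicts the entry with the lowest score.

    Illustrative only: the score is a simple hit counter used as a
    stand-in for a likelihood estimate; it is not the paper's estimator.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        # key -> [value, score]; insertion order breaks ties (oldest first).
        self.entries = OrderedDict()

    def insert(self, key, value):
        """Add an entry, evicting one first if the cache is full."""
        if key not in self.entries and len(self.entries) >= self.capacity:
            self._evict()
        self.entries[key] = [value, 0]

    def probe(self, key):
        """Return the cached value if it joins with `key`, rewarding the entry."""
        entry = self.entries.get(key)
        if entry is None:
            return None
        entry[1] += 1  # a successful join raises the estimated usefulness
        return entry[0]

    def _evict(self):
        # min() returns the first minimal element, so ties go to the oldest.
        victim = min(self.entries, key=lambda k: self.entries[k][1])
        del self.entries[victim]


cache = ScoredJoinCache(capacity=2)
cache.insert("sensor:A", "reading-1")
cache.insert("sensor:B", "reading-2")
cache.probe("sensor:A")                # A contributes to a join
cache.insert("sensor:C", "reading-3")  # cache is full: B (score 0) is evicted
print(sorted(cache.entries))           # ['sensor:A', 'sensor:C']
```

A first-in-first-out policy would have evicted `sensor:A` here because it is the oldest entry, even though it is the only one that has joined so far. The score-based policy keeps it and drops the entry with no observed joins.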
The research leading to these results has received funding from the European Union Seventh Framework Program FP7/2007-2011 under grant agreement No. 296126.
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Gao, S., Scharrenbach, T., Bernstein, A. (2014). The CLOCK Data-Aware Eviction Approach: Towards Processing Linked Data Streams with Limited Resources. In: Presutti, V., d’Amato, C., Gandon, F., d’Aquin, M., Staab, S., Tordai, A. (eds) The Semantic Web: Trends and Challenges. ESWC 2014. Lecture Notes in Computer Science, vol 8465. Springer, Cham. https://doi.org/10.1007/978-3-319-07443-6_2
DOI: https://doi.org/10.1007/978-3-319-07443-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-07442-9
Online ISBN: 978-3-319-07443-6
eBook Packages: Computer Science, Computer Science (R0)