Abstract
Graph data are increasingly important in many applications, and numerous systems for graph processing and analysis have been proposed over the last decade. Unfortunately, with the exception of RDF stores, each system is evaluated with its own datasets and queries, which makes a meaningful comparison of scalability and efficiency challenging and sometimes impossible. We aim to close this gap by introducing the Waterloo Graph Benchmark (WGB), a benchmark for graph processing systems. WGB offers an efficient generator that creates dynamic graphs with properties similar to those of real-life graphs, and it includes the basic graph queries that are used to build graph applications.
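To make the two ingredients named in the abstract concrete, the following is a minimal sketch and not WGB's actual generator or query set. It assumes a preferential-attachment process as a stand-in for a generator with a skewed, real-life-like degree distribution, models a dynamic graph as a timestamped stream of edge insertions, and uses a shortest-path query as one example of a basic graph query. All function names and parameters here are hypothetical.

```python
import random
from collections import deque


def generate_dynamic_graph(num_vertices, edges_per_vertex, seed=0):
    """Return a list of (timestamp, u, v) edge insertions.

    Assumption: preferential attachment stands in for a realistic
    generator; this is NOT the generator described in the paper.
    """
    rng = random.Random(seed)
    stream = []
    endpoints = [0]  # a vertex appears here once per incident edge
    timestamp = 0
    for v in range(1, num_vertices):
        want = min(edges_per_vertex, v)
        targets = set()
        while len(targets) < want:
            # High-degree vertices occur more often, so they are chosen more often.
            targets.add(rng.choice(endpoints))
        for u in sorted(targets):
            stream.append((timestamp, u, v))
            endpoints.extend((u, v))
            timestamp += 1
    return stream


def snapshot(stream, until):
    """Build an undirected adjacency map from edges with timestamp <= until."""
    adj = {}
    for t, u, v in stream:
        if t > until:
            break  # the stream is ordered by timestamp
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def shortest_path_length(adj, source, target):
    """Breadth-first search; returns the hop count, or None if unreachable."""
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


# Usage: the same query evaluated on an early and on the final snapshot.
stream = generate_dynamic_graph(200, 3, seed=42)
early = snapshot(stream, until=len(stream) // 10)
final = snapshot(stream, until=len(stream))
print(shortest_path_length(early, 0, 199), shortest_path_length(final, 0, 199))
```

Running the same query against snapshots taken at different timestamps is one simple way to exercise a system on a graph that changes over time; a real benchmark would also fix the workload mix and measure latency and throughput.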
Acknowledgments
This research was partially supported by a fellowship from IBM Centre for Advanced Studies (CAS), Toronto.
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Ammar, K., Özsu, M.T. (2014). WGB: Towards a Universal Graph Benchmark. In: Rabl, T., Raghunath, N., Poess, M., Bhandarkar, M., Jacobsen, HA., Baru, C. (eds) Advancing Big Data Benchmarks. WBDB 2013. Lecture Notes in Computer Science, vol 8585. Springer, Cham. https://doi.org/10.1007/978-3-319-10596-3_6
DOI: https://doi.org/10.1007/978-3-319-10596-3_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10595-6
Online ISBN: 978-3-319-10596-3
eBook Packages: Computer Science, Computer Science (R0)