Plan Before You Execute: A Cost-Based Query Optimizer for Attributed Graph Databases

  • Conference paper
Big Data Analytics and Knowledge Discovery (DaWaK 2016)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9829)

Abstract

The proliferation of NoSQL and graph databases indicates a move towards alternate forms of data representation beyond the traditional relational data model. This raises the question of how to process queries efficiently over these representations. Graphs have become one of the preferred ways to represent and store data related to social networks and other domains where relationships and their labels need to be captured explicitly. Currently, to query graph databases, users must either learn a new graph query language (e.g., Metaweb Query Language or MQL [6]) or use customized searches for specific substructures [14]. Hence, there is a clear need to pose queries using the same representation as the graph database, generate and evaluate alternate plans, develop cost metrics for evaluating plans, and prune the search space to converge on a good plan that can be evaluated directly over the graph database.

In this paper, we propose an approach for effective evaluation of queries specified over graph databases. The proposed optimizer generates query plans systematically and evaluates them using appropriate cost metrics gleaned from the graph database. For the time being, a graph mining algorithm has been modified to evaluate a given query plan using constrained expansion. Relevant metadata pertaining to the graph database is collected and used to evaluate query plans with a branch and bound algorithm. Experiments on different types of queries over two graph databases (Internet Movie Database or IMDB, and DBLP) are performed to validate our approach. Experimental results show that the query plan generated by our system explores significantly less of the graph than any other query plan for the same query.
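
As a rough illustration of the idea of choosing a plan by branch and bound over database metadata, the following Python sketch may help. It is not the authors' optimizer: the cost model (the sum of estimated partial matches produced at each expansion step), the function name `cheapest_plan`, the label names, and all counts and fan-outs are assumptions made up for this example, and the query is assumed to be tree-shaped.

```python
from typing import Dict, List, Tuple

QueryEdge = Tuple[str, str, str]  # (query vertex, edge label, query vertex)


def cheapest_plan(
    labels: Dict[str, str],                      # query vertex -> vertex label
    edges: List[QueryEdge],                      # edges of the (tree-shaped) query
    vertex_count: Dict[str, int],                # metadata: vertices per label
    fanout: Dict[Tuple[str, str, str], float],   # metadata: avg. neighbours per
                                                 # (from label, edge label, to label)
):
    """Return (estimated cost, plan) for the cheapest expansion order found."""
    best = {"cost": float("inf"), "plan": None}

    def expand(covered, remaining, instances, cost, plan):
        # Bound: costs only grow, so a partial plan that already costs as much
        # as the best complete plan cannot improve on it.
        if cost >= best["cost"]:
            return
        if not remaining:
            best["cost"], best["plan"] = cost, plan
            return
        # Branch: try every remaining edge that touches a covered vertex.
        for i, (u, e, v) in enumerate(remaining):
            for src, dst in ((u, v), (v, u)):
                if src in covered and dst not in covered:
                    grown = instances * fanout[(labels[src], e, labels[dst])]
                    expand(
                        covered | {dst},
                        remaining[:i] + remaining[i + 1:],
                        grown,
                        cost + grown,
                        plan + [(src, e, dst)],
                    )

    # Every query vertex is a candidate starting point.
    for start in labels:
        n = vertex_count[labels[start]]
        expand(frozenset([start]), list(edges), n, n, [start])
    return best["cost"], best["plan"]


# Hypothetical movie-style query and metadata (all numbers invented).
labels = {"m": "movie", "a": "actor", "g": "genre"}
edges = [("a", "acted_in", "m"), ("m", "has_genre", "g")]
vertex_count = {"movie": 1000, "actor": 5000, "genre": 20}
fanout = {
    ("actor", "acted_in", "movie"): 3,
    ("movie", "acted_in", "actor"): 15,
    ("movie", "has_genre", "genre"): 2,
    ("genre", "has_genre", "movie"): 100,
}

cost, plan = cheapest_plan(labels, edges, vertex_count, fanout)
print(cost, plan)
```

With these invented numbers, starting from the rare `genre` label gives the lowest estimated cost, and the plan that starts from `actor` is cut off before it is fully expanded because its partial cost already exceeds the best complete plan.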

References

  1. http://www.informatik.uni-trier.de

  2. http://www.imdb.com/

  3. http://neo4j.com/

  4. http://ailab.wsu.edu/subdue

  5. Batra, S., Tyagi, C.: Comparative analysis of relational and graph databases. Int. J. Soft Comput. Eng. (IJSCE) 2(2), 509–512 (2012)

  6. Bollacker, K., Evans, C., Paritosh, P., Sturge, T., Taylor, J.: Freebase: a collaboratively created graph database for structuring human knowledge. In: SIGMOD 2008, pp. 1247–1250. ACM, New York (2008)

  7. Cheng, J., Ke, Y., Ng, W., Lu, A.: Fg-index: towards verification-free query processing on graph databases. In: SIGMOD, pp. 857–872 (2007)

  8. Garey, M.R., Johnson, D.S.: Computers and Intractability: A Guide to the Theory of NP-Completeness. W. H. Freeman & Co., New York (1979)

  9. Giugno, R., Shasha, D.: Graphgrep: a fast and universal method for querying graphs. In: 16th International Conference on Pattern Recognition, ICPR 2002, Quebec, Canada, 11–15 August 2002, pp. 112–115 (2002)

  10. Goyal, A.: QP-SUBDUE: Processing Queries Over Graph Databases. Master’s thesis, The University of Texas at Arlington, December 2015

  11. Holder, L.B., Cook, D.J., Djoko, S.: Substructure discovery in the SUBDUE system. In: Knowledge Discovery and Data Mining, pp. 169–180 (1994)

  12. Holzschuher, F., Peinl, R.: Performance of graph query languages: comparison of Cypher, Gremlin and native access in Neo4j. In: Proceedings of the Joint EDBT/ICDT 2013 Workshops, pp. 195–204. ACM (2013)

  13. Jarke, M., Koch, J.: Query optimization in database systems. ACM Comput. Surv. (CSUR) 16(2), 111–152 (1984)

  14. Suri, S., Vassilvitskii, S.: Counting triangles and the curse of the last reducer. In: World Wide Web Conference Series, pp. 607–614 (2011)

  15. Tian, Y., Patel, J.M.: TALE: a tool for approximate large graph matching. In: ICDE 2008, 7–12 April 2008, Cancún, México (2008)

  16. Tong, H., Faloutsos, C., Gallagher, B., Eliassi-Rad, T.: Fast best-effort pattern matching in large attributed graphs. In: SIGKDD, pp. 737–746 (2007)

Author information

Corresponding author

Correspondence to Soumyava Das.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Das, S., Goyal, A., Chakravarthy, S. (2016). Plan Before You Execute: A Cost-Based Query Optimizer for Attributed Graph Databases. In: Madria, S., Hara, T. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2016. Lecture Notes in Computer Science, vol 9829. Springer, Cham. https://doi.org/10.1007/978-3-319-43946-4_21

  • DOI: https://doi.org/10.1007/978-3-319-43946-4_21

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-43945-7

  • Online ISBN: 978-3-319-43946-4

  • eBook Packages: Computer Science, Computer Science (R0)
