Abstract

The emergence of the Web has increased interest in XML data. XML query languages such as XQuery and XPath use label paths to traverse irregularly structured data. Without a structural summary and an efficient index, query processing can be quite inefficient because it requires an exhaustive traversal of the XML data. To overcome this inefficiency, several path indexes have been proposed in the research community. DataGuides and the 1-Index can be viewed as covering indexes for simple path expressions over tree- or graph-structured XML data. Alternatively, when both XML documents and XML queries are represented as structure-encoded sequences, querying XML data becomes equivalent to finding subsequence matches. This chapter introduces these index structures.
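
To make the idea of a covering path index concrete, the following is a minimal sketch in Python, not the chapter's own algorithm. It assumes tree-structured data only, where a path summary in the spirit of a DataGuide records each distinct root-to-node label path once, together with the elements that path reaches. The names `build_path_index` and `query`, and the sample document, are hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(root):
    """Map every distinct root-to-node label path to the elements it reaches.

    Assumption: the input is a tree, so each element has exactly one label path.
    """
    index = defaultdict(list)
    stack = [(root, (root.tag,))]
    while stack:
        node, path = stack.pop()
        index[path].append(node)
        for child in node:
            stack.append((child, path + (child.tag,)))
    return index


def query(index, path_expr):
    """Answer a simple path expression such as '/lib/book/title' by one lookup."""
    key = tuple(step for step in path_expr.split("/") if step)
    return index.get(key, [])


# Hypothetical sample document used only for illustration.
doc = ET.fromstring(
    "<lib><book><title>A</title></book>"
    "<book><title>B</title><author>X</author></book></lib>"
)
index = build_path_index(doc)
titles = sorted(e.text for e in query(index, "/lib/book/title"))
print(titles)
```

Once the index is built, a simple path expression is answered by a single dictionary lookup instead of a traversal of the document, which is the sense in which such an index "covers" the query. Graph-structured data and branching queries need the more elaborate structures the chapter discusses.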

Copyright information

© 2013 Tsinghua University Press, Beijing and Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Lu, J. (2013). XML Data Indexing. In: An Introduction to XML Query Processing and Keyword Search. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-34555-5_3

  • DOI: https://doi.org/10.1007/978-3-642-34555-5_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-34554-8

  • Online ISBN: 978-3-642-34555-5

  • eBook Packages: Computer Science
