XML Indexing

Dong, Xin Luna; Srivastava, Divesh

doi:10.1007/978-1-4614-8265-9_779


Definition

XML employs an ordered, tree-structured model for representing data. Queries in XML languages like XQuery employ twig queries to match relevant portions of data in an XML database. An XML Index is a data structure that is used to efficiently look up all matches of a fragment of the twig query, where some of the twig query fragment nodes may have been mapped to specific nodes in the XML database.
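As an illustration of the lookup described above, the following is a minimal sketch of one simple kind of XML index: a map from each root-to-node label path to the nodes reached by that path. It is not the specific structure of any system cited in this entry. The sample document and the names `build_path_index` and `lookup` are assumptions made for the example, and the sketch covers only linear path fragments, not branching twigs.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_path_index(root):
    """Map each root-to-node label path (a tuple of tags) to the
    list of nodes reached by that path, in document order."""
    index = defaultdict(list)

    def visit(node, prefix):
        path = prefix + (node.tag,)
        index[path].append(node)
        for child in node:  # ElementTree yields children in document order
            visit(child, path)

    visit(root, ())
    return index


def lookup(index, path):
    """Return all nodes matching a linear root-to-node path fragment."""
    return index.get(tuple(path), [])


# Hypothetical sample document, used only for this illustration.
DOC = """
<book>
  <title>XML</title>
  <author><fn>jane</fn><ln>doe</ln></author>
  <author><fn>john</fn><ln>roe</ln></author>
  <chapter><title>Indexing</title></chapter>
</book>
""".strip()

root = ET.fromstring(DOC)
index = build_path_index(root)

# All matches of the path fragment book/author/fn, found without
# traversing the document again.
first_names = [n.text for n in lookup(index, ["book", "author", "fn"])]
chapter_titles = [n.text for n in lookup(index, ["book", "chapter", "title"])]
```

Because the index is keyed by the full label path, `book/title` and `book/chapter/title` are kept apart even though both end in a `title` node. Matching a branching twig would require combining the results of several such lookups, which this sketch does not attempt.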

Historical Background

XML path indexing is related to the problem of join indexing in relational database systems [15] and path indexing in object-oriented database systems (see, e.g., [1, 9]). These index structures assume that the schema is homogeneous and known; these assumptions do not hold in general for XML data. The DataGuide [7] was the first path index designed specifically for XML data, where the schema may be heterogeneous and may not even be known.

Foundations

Notation

An XML document d is a rooted, ordered, node-labeled tree, where (i) each node corresponds to an XML...


Recommended Reading

1. Bertino E, Kim W. Indexing techniques for queries on nested objects. IEEE Trans Knowl Data Eng. 1989;1(2):196–214.
2. Bruno N, Koudas N, Srivastava D. Holistic twig joins: optimal XML pattern matching. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2002. p. 310–21.
3. Chen Z, Gehrke J, Korn F, Koudas N, Shanmugasundaram J, Srivastava D. Index structures for matching XML twigs using relational query processors. Data Knowl Eng. 2007;60(2):283–302.
4. Chung C-W, Min J-K, Shim K. APEX: an adaptive path index for XML data. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2002. p. 121–32.
5. Cohen E, Kaplan H, Milo T. Labeling dynamic XML trees. In: Proceedings of the ACM SIGACT-SIGMOD Symposium on Principles of Database Systems; 2002. p. 271–81.
6. Cooper BF, Sample N, Franklin MJ, Hjaltason GR, Shadmon M. A fast index for semistructured data. In: Proceedings of the 27th International Conference on Very Large Data Bases; 2001. p. 341–50.
7. Goldman R, Widom J. DataGuides: enabling query formulation and optimization in semistructured databases. In: Proceedings of the 23rd International Conference on Very Large Data Bases; 1997. p. 436–45.
8. Grust T. Accelerating XPath location steps. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2002. p. 109–20.
9. Kemper A, Moerkotte G. Access support in object bases. ACM SIGMOD Rec. 1990;19(2):364–74.
10. Kha DD, Yoshikawa M, Uemura S. An XML indexing structure with relative region coordinate. In: Proceedings of the 17th International Conference on Data Engineering; 2001. p. 313–20.
11. McHugh J, Widom J. Query optimization for XML. In: Proceedings of the 25th International Conference on Very Large Data Bases; 1999. p. 315–26.
12. Milo T, Suciu D. Index structures for path expressions. In: Proceedings of the 7th International Conference on Database Theory; 1999. p. 277–95.
13. Rao P, Moon B. PRIX: indexing and querying XML using Prüfer sequences. In: Proceedings of the 20th International Conference on Data Engineering; 2004. p. 288.
14. Tatarinov I, Viglas S, Beyer K, Shanmugasundaram J, Shekita E, Zhang C. Storing and querying ordered XML using a relational database system. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2002. p. 204–15.
15. Valduriez P. Join indices. ACM Trans Database Syst. 1987;12(2):218–46.
16. Wang H, Park S, Fan W, Yu P. ViST: a dynamic index method for querying XML data by tree structures. In: Proceedings of the ACM SIGMOD International Conference on Management of Data; 2003. p. 110–21.
17. Yoshikawa M, Amagasa T, Shimura T, Uemura S. XRel: a path-based approach to storage and retrieval of XML documents using relational databases. ACM Trans Internet Technol. 2001;1(1):110–41.


Author information

Authors and Affiliations

Amazon, Seattle, WA, USA
Xin Luna Dong
AT&T Labs – Research, AT&T, Bedminster, NJ, USA
Divesh Srivastava


Corresponding author

Correspondence to Xin Luna Dong.

Editor information

Editors and Affiliations

Georgia Institute of Technology College of Computing, Atlanta, GA, USA
Ling Liu
University of Waterloo School of Computer Science, Waterloo, ON, Canada
M. Tamer Özsu

Section Editor information

Laboratoire d'Informatique de Grenoble, CNRS and LIG, Grenoble, France
Sihem Amer-Yahia


Cite this entry

Dong, X.L., Srivastava, D. (2018). XML Indexing. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_779


DOI: https://doi.org/10.1007/978-1-4614-8265-9_779
Published: 07 December 2018
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science; Reference Module Computer Science and Engineering
