Abstract
Information Integration is the problem of providing uniform access to multiple heterogeneous data sources. The most common approach to this task, called global-as-view, consists in providing a global schema of the data in which each relation is defined as a view over a set of data sources. Recent work addresses this problem in the case of limited source capabilities, where, in general, sources can be accessed only in ways that respect certain binding patterns for their attributes. In this case, the answer to a user query over the global schema cannot be computed by simply substituting the concepts appearing in the query with their definitions. Instead, it may require the evaluation of a suitable recursive Datalog program.
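To illustrate why binding patterns can force a recursive evaluation, the following minimal Python sketch repeatedly accesses sources with the constants known so far until no new values appear. It is not the algorithm of this chapter: the `Source` class, the `extract_reachable` function, the `cites` and `author` relations, and the single untyped pool of constants are all assumptions made for illustration.

```python
from itertools import product
from typing import Dict, Iterable, List, Set, Tuple

Row = Tuple[str, ...]


class Source:
    """Hypothetical source: a relation reachable only via its bound positions."""

    def __init__(self, name: str, pattern: str, rows: Iterable[Row]) -> None:
        self.name = name
        self.pattern = pattern  # e.g. "bf": first attribute bound, second free
        self._rows = set(rows)
        self.accesses = 0

    def access(self, bound_values: Tuple[str, ...]) -> Set[Row]:
        """Return the rows that agree with the values given for bound positions."""
        self.accesses += 1
        bound_pos = [i for i, p in enumerate(self.pattern) if p == "b"]
        return {r for r in self._rows
                if all(r[i] == v for i, v in zip(bound_pos, bound_values))}


def extract_reachable(sources: List[Source],
                      constants: Iterable[str]) -> Dict[str, Set[Row]]:
    """Naive fixpoint: keep accessing sources with every known constant."""
    known = set(constants)
    cache: Dict[str, Set[Row]] = {s.name: set() for s in sources}
    tried = set()
    grew = True
    while grew:  # a new constant may enable further accesses, hence the loop
        grew = False
        for s in sources:
            n_bound = s.pattern.count("b")
            for binding in product(sorted(known), repeat=n_bound):
                if (s.name, binding) in tried:
                    continue
                tried.add((s.name, binding))
                for row in s.access(binding):
                    cache[s.name].add(row)
                    for value in row:
                        if value not in known:
                            known.add(value)
                            grew = True
    return cache


# Query (illustrative): q(A) :- cites("p1", Y), cites(Y, Z), author(Z, A)
cites = Source("cites", "bf", [("p1", "p2"), ("p2", "p3"), ("p9", "p8")])
author = Source("author", "bf", [("p3", "alice"), ("p8", "bob")])
cache = extract_reachable([cites, author], {"p1"})
answers = {a
           for (x, y) in cache["cites"] if x == "p1"
           for (y2, z) in cache["cites"] if y2 == y
           for (z2, a) in cache["author"] if z2 == z}
```

The tuple `("p9", "p8")` is never retrieved because no known constant can be bound to reach it. The sketch also makes wasteful accesses, such as probing `cites` with an author name, which is the kind of cost a plan that accounts for query structure and binding patterns aims to reduce.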
In this paper we study the evaluation of conjunctive queries in the global-as-view approach with limited source capabilities. We first present an algorithm for optimizing query answering that takes into account the structure of the query together with the binding patterns in order to compute an optimized query plan. This optimization excludes from the query plan the sources that are not relevant to the answer. We then study online optimization of query answering by taking into account full inclusion and functional dependencies between sources. At a given step of the answering process, this optimization uses the dependencies together with the data retrieved so far to avoid unnecessary accesses to the sources.
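One way dependencies could make an access unnecessary is sketched below, under stated assumptions rather than as the chapter's method. Suppose a full inclusion dependency s1 ⊆ s2 holds and s2 satisfies the functional dependency A -> B. If a tuple (a, b) was already retrieved from s1, then the answer of s2 for A = a must be exactly {(a, b)}, so s2 need not be contacted. The names `CountingSource`, `infer_without_access`, and `lookup_s2` are hypothetical.

```python
from typing import Iterable, Optional, Set, Tuple

Row = Tuple[str, str]


class CountingSource:
    """Hypothetical binary source s2(A, B) that must be accessed with A bound."""

    def __init__(self, rows: Iterable[Row]) -> None:
        self._rows = set(rows)
        self.accesses = 0

    def access(self, key: str) -> Set[Row]:
        self.accesses += 1
        return {r for r in self._rows if r[0] == key}


def infer_without_access(key: str, retrieved_s1: Set[Row]) -> Optional[Row]:
    """Assumes s1 is included in s2 and A -> B holds on s2.

    A tuple (key, b) already seen in s1 is then the only s2 tuple with A = key.
    """
    for row in retrieved_s1:
        if row[0] == key:
            return row
    return None


def lookup_s2(key: str, s2: CountingSource, retrieved_s1: Set[Row]) -> Set[Row]:
    """Answer s2 for A = key, contacting the source only when inference fails."""
    known = infer_without_access(key, retrieved_s1)
    if known is not None:
        return {known}
    return s2.access(key)


s2 = CountingSource([("k1", "v1"), ("k2", "v2")])
retrieved_s1 = {("k1", "v1")}  # data obtained earlier in the answering process
first = lookup_s2("k1", s2, retrieved_s1)
accesses_after_first = s2.accesses
second = lookup_s2("k2", s2, retrieved_s1)
```

In this run the lookup for `k1` is answered from the data retrieved so far, and only the lookup for `k2` reaches the source.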
The updated original online version for this book can be found at DOI: 10.1007/978-0-387-35614-3_21
Copyright information
© 2002 IFIP International Federation for Information Processing
About this chapter
Cite this chapter
Calì, A., Calvanese, D. (2002). Optimized Querying of Integrated Data over the Web. In: Rolland, C., Brinkkemper, S., Saeki, M. (eds) Engineering Information Systems in the Internet Context. IFIP — The International Federation for Information Processing, vol 103. Springer, Boston, MA. https://doi.org/10.1007/978-0-387-35614-3_17
DOI: https://doi.org/10.1007/978-0-387-35614-3_17
Publisher Name: Springer, Boston, MA
Print ISBN: 978-1-4757-5149-9
Online ISBN: 978-0-387-35614-3
eBook Packages: Springer Book Archive