Abstract
Research in molecular biology continuously produces an immense amount of data, but this information is spread over numerous heterogeneous data repositories. Integrating these repositories into a federated information system would drastically reduce the time a biologist has to spend browsing different WWW sites or databases in search of a particular piece of information.
In this study we point out the specific problems that molecular biology poses for data integration and present our approach to coping with them. It is based on a mediator architecture and uses query correspondence assertions (QCAs) to describe sources in a flexible yet expressive manner. QCAs capture both the content and the query capabilities of arbitrary data sources with respect to a federated schema. Based on such QCAs, a mediator can answer queries against the federated schema by constructing semantically equivalent combinations of source queries.
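To make the idea of combining source queries more concrete, the following is a minimal toy sketch and not the formalism of the paper. It assumes that a QCA can be reduced to a source query paired with the set of federated-schema relations it answers, and that a query against the federated schema is just a set of requested relations. The class `QCA`, the function `plans`, the source names (`SourceA`, `SourceB`, `SourceC`), and the relation names (`gene`, `map_position`, `clone`) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class QCA:
    """Toy query correspondence assertion (hypothetical structure)."""
    source: str          # name of the data source (assumed)
    source_query: str    # query the source is able to execute (assumed)
    covers: frozenset    # federated-schema relations this source query answers


def plans(requested, qcas):
    """Enumerate minimal combinations of QCAs that exactly cover `requested`."""
    requested = frozenset(requested)
    # Toy rule: only QCAs that answer nothing beyond the requested
    # relations are considered, so a plan never over-restricts the result.
    useful = [q for q in qcas if q.covers and q.covers <= requested]
    found = []
    for size in range(1, len(useful) + 1):
        for combo in combinations(useful, size):
            covered = frozenset().union(*(q.covers for q in combo))
            if covered != requested:
                continue
            # Skip combinations that only extend a smaller plan already found.
            if any(set(p) <= set(combo) for p in found):
                continue
            found.append(combo)
    return found


# Hypothetical source descriptions against a hypothetical federated schema.
QCAS = [
    QCA("SourceA", "genes_with_location()", frozenset({"gene", "map_position"})),
    QCA("SourceB", "all_genes()", frozenset({"gene"})),
    QCA("SourceC", "all_positions()", frozenset({"map_position"})),
    QCA("SourceC", "clones()", frozenset({"clone"})),
]

result = plans({"gene", "map_position"}, QCAS)
for plan in result:
    print(" + ".join(f"{q.source}.{q.source_query}" for q in plan))
```

In this sketch the request can be answered either by one query to `SourceA` or by combining queries to `SourceB` and `SourceC`. A real mediator would additionally have to check semantic equivalence of the combined queries, which this set-based toy model deliberately leaves out.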
This research was supported by the German Research Society, Berlin-Brandenburg Graduate School in Distributed Information Systems (DFG grant no. GRK 316).
Copyright information
© 1999 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Leser, U. (1999). Designing a Global Information Resource for Molecular Biology (Short Paper). In: Buchmann, A.P. (eds) Datenbanksysteme in Büro, Technik und Wissenschaft. Informatik aktuell. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-60119-4_22
DOI: https://doi.org/10.1007/978-3-642-60119-4_22
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65606-7
Online ISBN: 978-3-642-60119-4
eBook Packages: Springer Book Archive