Abstract
Nowadays, huge volumes of molecular biological data are available from different biological research projects. These data often cover overlapping and complementary domains. For instance, the Swiss-Prot database contains only protein sequences along with their annotations, whereas the KEGG database incorporates enzymes, metabolic pathways and genome data. Because these data sources complement one another, it is desirable to gain a global view of the integrated databases instead of browsing each data source individually.
Unfortunately, most data sources are queried through proprietary interfaces with restricted access and typically support only a small set of simple query operations. With few exceptions, there is no common data model or presentation standard for the query results. Consequently, the integration of numerous heterogeneous, distributed databases has become a typical yet challenging task in bioinformatics. In this paper, we introduce our own approach, called “BioDataServer”, which is a user-adaptable integration, storage, analysis and query service for molecular biological data targeted at commercial customers.
This work was supported by the German Ministry of Education and Research (BMBF) under grant number 0310621.
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Balko, S., Lange, M., Schnee, R., Scholz, U. (2004). BioDataServer: an Applied Molecular Biological Data Integration Service. In: Rahm, E. (ed.) Data Integration in the Life Sciences. DILS 2004. Lecture Notes in Computer Science, vol 2994. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24745-6_10
DOI: https://doi.org/10.1007/978-3-540-24745-6_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-21300-0
Online ISBN: 978-3-540-24745-6
eBook Packages: Springer Book Archive