Abstract
Data warehousing is a collection of concepts and tools for providing and maintaining a set of integrated data (the data warehouse, DW) to support business decisions within an organization. These tools extract data from different operational data sources; after cleansing and transformation, the data are integrated and loaded into a central repository to enable analysis and mining. Data lineage and metadata lineage are both important for data analysis. Data lineage allows users to trace a warehouse data item back to the original source items from which it was derived, while metadata lineage shows which operations were performed to produce that target data. This work proposes integrating the metadata captured during transformation processes, using the Common Warehouse Metamodel (CWM) metadata standard, in order to enable data and metadata lineage. It also presents a tool developed specifically for this task.
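The distinction between the two kinds of lineage can be illustrated with a minimal Python sketch. It is not the tool presented in this work and does not use the CWM metamodel: the `LineageStore` class, its methods, and the item identifiers are all hypothetical names chosen for illustration. The sketch assumes that each transformation step records its target item, the operation name, and its input items.

```python
from dataclasses import dataclass, field


@dataclass
class LineageStore:
    """Hypothetical in-memory store of transformation metadata (not the CWM API)."""

    # Maps a target item id to (operation name, list of input item ids).
    derivations: dict = field(default_factory=dict)

    def record(self, target, operation, inputs):
        """Register that `target` was produced by `operation` from `inputs`."""
        self.derivations[target] = (operation, list(inputs))

    def data_lineage(self, item):
        """Return the set of original source items that `item` derives from."""
        if item not in self.derivations:
            return {item}  # no recorded derivation: treat it as a source item
        _, inputs = self.derivations[item]
        sources = set()
        for parent in inputs:
            sources |= self.data_lineage(parent)
        return sources

    def metadata_lineage(self, item):
        """Return the operations applied to reach `item`, sources first."""
        if item not in self.derivations:
            return []
        operation, inputs = self.derivations[item]
        ops = []
        for parent in inputs:
            for op in self.metadata_lineage(parent):
                if op not in ops:  # skip operations already listed
                    ops.append(op)
        ops.append(operation)
        return ops


# Illustrative identifiers only; they do not come from the paper.
store = LineageStore()
store.record("clean.customer:17", "cleanse_names", ["crm.customer:17"])
store.record("clean.order:903", "convert_currency", ["erp.order:903"])
store.record("dw.sales_fact:5", "join_and_load",
             ["clean.customer:17", "clean.order:903"])

sources = store.data_lineage("dw.sales_fact:5")
operations = store.metadata_lineage("dw.sales_fact:5")
```

Here `sources` answers the data lineage question (which source items contributed to the warehouse item) and `operations` answers the metadata lineage question (which transformations were applied to reach it). A real system would persist this metadata in a repository rather than in a dictionary.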
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Santana, A.S., de Carvalho Moura, A.M. (2004). Metadata to Support Transformations and Data & Metadata Lineage in a Warehousing Environment. In: Kambayashi, Y., Mohania, M., Wöß, W. (eds) Data Warehousing and Knowledge Discovery. DaWaK 2004. Lecture Notes in Computer Science, vol 3181. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30076-2_25
DOI: https://doi.org/10.1007/978-3-540-30076-2_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22937-7
Online ISBN: 978-3-540-30076-2
eBook Packages: Springer Book Archive