
Dependency-Preserving Normalization of Relational and XML Data

  • Conference paper
Database Programming Languages (DBPL 2005)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 3774)

Abstract

The main goal of normalization techniques is a database design that avoids redundant information and update anomalies. Ideally, both data and constraints should be preserved. However, this is not always achievable: BCNF eliminates all redundancies but may not preserve constraints, whereas 3NF achieves dependency preservation but may not eliminate all redundancies.
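The trade-off can be illustrated with the textbook schema R(A, B, C) under the dependencies AB → C and C → B. The following is a minimal sketch, not taken from the paper: the schema, the helper names (`closure`, `candidate_keys`, `is_bcnf`, `is_3nf`), and the chosen decomposition are all illustrative assumptions.

```python
from itertools import combinations

def closure(attrs, fds):
    """Attribute closure of attrs under fds, a list of (lhs, rhs) frozensets."""
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if lhs <= result and not rhs <= result:
                result |= rhs
                changed = True
    return frozenset(result)

def candidate_keys(schema, fds):
    """Minimal attribute sets whose closure is the whole schema (brute force)."""
    keys = []
    for size in range(1, len(schema) + 1):
        for combo in combinations(sorted(schema), size):
            s = frozenset(combo)
            if closure(s, fds) == schema and not any(k <= s for k in keys):
                keys.append(s)
    return keys

def is_bcnf(schema, fds):
    """Every nontrivial dependency must have a superkey on its left-hand side."""
    return all(rhs <= lhs or closure(lhs, fds) == schema for lhs, rhs in fds)

def is_3nf(schema, fds):
    """Like BCNF, but a violation is tolerated if the determined attributes are prime."""
    prime = frozenset().union(*candidate_keys(schema, fds))
    return all(
        closure(lhs, fds) == schema or (rhs - lhs) <= prime
        for lhs, rhs in fds
    )

# Hypothetical example schema: R(A, B, C) with AB -> C and C -> B.
schema = frozenset("ABC")
fds = [(frozenset("AB"), frozenset("C")), (frozenset("C"), frozenset("B"))]

in_3nf = is_3nf(schema, fds)
in_bcnf = is_bcnf(schema, fds)

# Assumed BCNF decomposition: R1(C, B) and R2(A, C). The only nontrivial
# dependency that projects onto these fragments is C -> B.
projected = [(frozenset("C"), frozenset("B"))]
ab_to_c_preserved = "C" in closure(frozenset("AB"), projected)

print(in_3nf, in_bcnf, ab_to_c_preserved)
```

Under these assumptions R is in 3NF but not in BCNF, and splitting it into R1(C, B) and R2(A, C) removes the violation at the cost of AB → C, which no longer follows from the dependencies that hold on the fragments.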

Our first goal is to investigate how much redundancy 3NF tolerates in order to achieve dependency preservation. We apply an information-theoretic measure and show that only prime attributes admit redundant information in 3NF, but their information content may be arbitrarily low.

Then we study the possibility of achieving both redundancy elimination and dependency preservation by a hierarchical representation of relational data in XML. We provide a characterization of cases when an XML normal form called XNF guarantees both.

Finally, we deal with dependency preservation in XML and show that, as in the relational case, normalizing XML documents to achieve non-redundant data can result in losing constraints. By modifying the definition of XNF, we define another normal form for XML documents, X3NF, that generalizes 3NF to XML and achieves dependency preservation.

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kolahi, S. (2005). Dependency-Preserving Normalization of Relational and XML Data. In: Bierman, G., Koch, C. (eds) Database Programming Languages. DBPL 2005. Lecture Notes in Computer Science, vol 3774. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11601524_16

  • DOI: https://doi.org/10.1007/11601524_16

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-30951-2

  • Online ISBN: 978-3-540-31445-5

  • eBook Packages: Computer Science, Computer Science (R0)
