Document Representations (Inclusive Native and Relational)

  • Living reference work entry
Encyclopedia of Database Systems

Synonyms

Documents; Markup languages; Page representations; Semi-structured data

Definition

Native document representations are file formats designed for documents. They can be roughly divided into three types: page-oriented, stream-oriented, and tree-structured. Hybrid types can also be found. Within each type, document representations range from the simple to the complex. All native representations assume an implicit order of the document’s information, reflecting the linear reading order of conventional documents. The most important document representation is the Extensible Markup Language (XML), which is tree-structured and can have any level of complexity. It is seeing widespread use on the Web and in business and is also popular for non-document applications.
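The tree structure and implicit ordering described above can be illustrated with a minimal sketch. It assumes a small, hypothetical XML sample (the element names `article`, `title`, `section`, and `para` are invented for illustration) and uses Python's standard `xml.etree.ElementTree` module to list elements in document order, which corresponds to a pre-order traversal of the tree.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; element names are illustrative only.
SAMPLE = (
    "<article>"
    "<title>Document Representations</title>"
    "<section>"
    "<para>Native formats.</para>"
    "<para>Relational mappings.</para>"
    "</section>"
    "</article>"
)


def document_order(root):
    """Return (depth, tag, text) triples in document order (pre-order)."""
    rows = []

    def visit(node, depth):
        # A parent is recorded before its children; children keep source order.
        rows.append((depth, node.tag, (node.text or "").strip()))
        for child in node:
            visit(child, depth + 1)

    visit(root, 0)
    return rows


root = ET.fromstring(SAMPLE)
rows = document_order(root)
for depth, tag, text in rows:
    # Indentation shows nesting depth; line order shows reading order.
    print("  " * depth + tag + (": " + text if text else ""))
```

Swapping the two `para` elements in the sample would change the output, which is one way to see that sibling order carries meaning in a document even though a set-based data model would treat the two arrangements as equivalent.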

Relational databases use a variety of document representations that map to a native representation. Page-oriented and stream-oriented documents are best stored in a coarse-grained manner and do not appear to have stimulated...

Recommended Reading

  1. Adobe Systems Incorporated. PDF reference. 6th edn. 2006.

  2. Boag S, Chamberlin D, Fernández MF, Florescu D, Robie J, Siméon J. XQuery 1.0: an XML query language. Tokyo: World Wide Web Consortium (W3C); 2007.

  3. Bray T, Paoli J, Sperberg-McQueen CM, Maler E, Yergeau F. Extensible markup language (XML) 1.0. World Wide Web Consortium (W3C). 4th edn. 2006.

  4. Draper D. Mapping between XML and relational data. In: XQuery from the experts: a guide to the W3C XML query language. Chap. 6. Addison Wesley; 2003.

  5. Fallside DC, Walmsley P. XML schema part 0: primer. World Wide Web Consortium (W3C). 2nd edn. 2004.

  6. Furuta R, Scofield J, Shaw A. Document formatting systems: survey, concepts, and issues. ACM Comput Surv. 1982;14(3):417–72.

  7. Goldfarb CF, editor. Information processing – text and office systems – Standard Generalized Markup Language (SGML), International Standard ISO 8879. Geneva: International Organization for Standardization; 1986.

  8. Kay M. XSL transformations (XSLT) version 2.0. World Wide Web Consortium (W3C). 2007.

  9. Knuth DE, Plass MF. Breaking paragraphs into lines. Softw Pract Exp. 1981;11(11):1119–84.

  10. Microsoft Office Word. 2007 Rich Text Format (RTF) specification. 2007. Version 1.9. Downloaded from microsoft.com, November 2007.

  11. OASIS. Open document format for office applications (OpenDocument) v1.1. 2007. http://docs.oasis-open.org/office/v1.1/OS/.

  12. Shanmugasundaram J, Shekita E, Barr R, Carey M, Lindsay B, Pirahesh H, Reinwald B. Efficiently publishing relational data as XML documents. VLDB J. 2001;10(2–3).

  13. Simske SJ, Baggs SC. Digital capture for automated scanner workflows. Proc. 2004 ACM Symposium on Document Engineering; 2004. p. 171–7.

  14. Tatarinov I, Viglas SD, Beyer K, Shanmugasundaram J, Shekita E, Zhang C. Storing and querying ordered XML using a relational database system. Proc. ACM SIGMOD International Conference on Management of Data; 2002. p. 204–15.

Author information

Corresponding author

Correspondence to Ethan V. Munson.

Copyright information

© 2017 Springer Science+Business Media LLC

About this entry

Cite this entry

Munson, E.V. (2017). Document Representations (Inclusive Native and Relational). In: Liu, L., Özsu, M. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4899-7993-3_138-2

  • DOI: https://doi.org/10.1007/978-1-4899-7993-3_138-2

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4899-7993-3

  • Online ISBN: 978-1-4899-7993-3

  • eBook Packages: Springer Reference Computer Sciences, Reference Module Computer Science and Engineering
