Friday, September 26, 2008

Week 5 Readings

Bryan's Introduction to XML
  • XML helps join documents, adds editorial comments, places images within text files
  • XML is not a predefined set of tags; it is a standard for defining your own tag sets
  • Documents are made up of entities, which contain elements, which can carry attributes
  • Unique identifiers provide cross references between two document points
  • Text entities are shorthand for a full name; this makes for more efficient document editing (I assume it saves typing time, too)
  • XML documents are best stored in databases
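
To make the points about text entities and cross references concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The tag names, the "id" and "ref" attributes, and the entity name "xml" are all made up for illustration; they do not come from Bryan's article.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: tag names, the id/ref attributes and the
# entity name "xml" are invented for this example.
DOC = """<!DOCTYPE notes [
  <!ENTITY xml "Extensible Markup Language">
]>
<notes>
  <section id="intro">&xml; lets authors define their own tags.</section>
  <section id="storage">See <xref ref="intro"/> for background.</section>
</notes>"""

root = ET.fromstring(DOC)

# The parser expands the text entity &xml; into its full replacement text,
# so the author only has to type the short form.
intro = root.find("section[@id='intro']")
print(intro.text)

# Resolve the cross reference: find the element whose id matches the ref.
xref = root.find(".//xref")
target = root.find(f".//section[@id='{xref.get('ref')}']")
print(target.get("id"))
```

The unique identifier is just an attribute value here; a validating parser with a DTD could additionally enforce that each id is unique, which this sketch does not do.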

Some information in this article about tag sets is confusing.

Extending Your Markup: An XML Tutorial
  • XML: tells about content, "a semantic language that lets you meaningfully annotate text"
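  • Ideal XML document starts with a prolog and has exactly one root element
  • Prolog: XML declaration (version + encoding + standalone, yes or no) + DTD declaration
  • Root element = the single top-level element of the document; elements can be nonterminal (containing other elements) or terminal (containing only text)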
  • DTDs: define document structure, specify tag sets, specify tag order, specify tag attributes, can be in XML document or separate
  • Element attributes: an element does not need any; each declared attribute can be optional, required or fixed
  • Namespace: avoids confusion between names
  • XML Schema and DTDs are still being perfected
After reading this article, information about DTD attributes, XML Schema, and extending capabilities is still unclear to me.
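
As a rough illustration of the prolog, DTD, attribute and namespace notes above, here is a small sketch in Python. The memo and catalog documents, their tag names and the example.org namespace URIs are hypothetical. Note that the standard library parser reads the DTD but does not validate against it, so the required-attribute check is done by hand.

```python
import xml.etree.ElementTree as ET

# Prolog = XML declaration + DTD; then exactly one root element (memo).
# All names here are invented for illustration.
MEMO = """<?xml version="1.0" standalone="yes"?>
<!DOCTYPE memo [
  <!ELEMENT memo (to, body)>
  <!ELEMENT to (#PCDATA)>
  <!ELEMENT body (#PCDATA)>
  <!ATTLIST memo priority (low|high) #REQUIRED
                 lang     CDATA      #IMPLIED>
]>
<memo priority="high">
  <to>Class</to>
  <body>Week 5 readings</body>
</memo>"""

memo = ET.fromstring(MEMO)
child_tags = [child.tag for child in memo]   # memo is nonterminal
to_is_terminal = len(memo.find("to")) == 0   # to holds only text

# ElementTree does not validate, so check the required attribute manually.
has_required = memo.get("priority") in ("low", "high")
optional_lang = memo.get("lang")             # optional attribute, absent here

# Namespaces: two different "title" elements that would otherwise collide.
CATALOG = """<catalog xmlns:bk="http://example.org/book"
                      xmlns:p="http://example.org/person">
  <bk:title>XML Basics</bk:title>
  <p:title>Dr.</p:title>
</catalog>"""

ns = {"bk": "http://example.org/book", "p": "http://example.org/person"}
catalog = ET.fromstring(CATALOG)
book_title = catalog.find("bk:title", ns).text
person_title = catalog.find("p:title", ns).text
print(child_tags, has_required, book_title, person_title)
```

The DTD here sits inside the document; the same declarations could live in a separate file referenced from the DOCTYPE line.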

W3Schools XML Schema Tutorial
  • XML Schema (XSD) can be used instead of DTDs and describes XML document structure
  • XSD defines elements, child elements, and attributes
  • Why is XSD preferable to DTDs? Schemas are extensible to future additions, support data types and support namespaces.
  • XSD supports cross-cultural communication because it enforces standard data types (e.g., the date type requires the format YYYY-MM-DD)
  • When elements or attributes have defined data types, invalid types will not be accepted
  • Facets = restrictions on XML elements (e.g., an initials field can contain only 3 uppercase letters)
  • Seven indicators control how elements are used: order (all, choice, sequence), occurrence (maxOccurs, minOccurs) and group (group, attributeGroup)
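
The facet and date examples above can be sketched as follows. This is not a real XSD validator (Python's standard library does not include one); it only reads a pattern facet out of a small, hypothetical schema and mimics the check with Python's re and datetime modules. The element names "initials" and "start" are assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET
from datetime import date

XS = "{http://www.w3.org/2001/XMLSchema}"

# A schema is itself an XML document. This one is a made-up example.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="initials">
    <xs:simpleType>
      <xs:restriction base="xs:string">
        <xs:pattern value="[A-Z][A-Z][A-Z]"/>
      </xs:restriction>
    </xs:simpleType>
  </xs:element>
  <xs:element name="start" type="xs:date"/>
</xs:schema>"""

schema = ET.fromstring(SCHEMA)
pattern = schema.find(f".//{XS}pattern").get("value")

def initials_ok(text):
    # Mimic the pattern facet: the whole value must match the pattern.
    # (XSD and Python regex syntax differ in general; this simple
    # pattern means the same thing in both.)
    return re.fullmatch(pattern, text) is not None

def date_ok(text):
    # Mimic the date type: accept YYYY-MM-DD, reject other layouts.
    try:
        date.fromisoformat(text)
        return True
    except ValueError:
        return False

print(initials_ok("ABC"), initials_ok("abc"))
print(date_ok("2008-09-26"), date_ok("09/26/2008"))
```

A real validator would apply these rules automatically to every element the schema describes, which is the benefit the tutorial points to.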

This tutorial mentions several data types. I understand date, time and decimal types, but I would like more clarification on string types. Does string just refer to basic text (not numbers, etc.)?
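
On the string question, a small sketch may help. As far as I can tell, an XSD string can hold any characters, digits included; what changes with the declared type is how the text is interpreted. The order document and its tag names below are hypothetical, and Python's Decimal stands in for the XSD decimal type only loosely.

```python
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

# Hypothetical document: every value arrives from the parser as text.
doc = ET.fromstring("<order><code>007</code><price>9.50</price></order>")

# Treated as a string, "007" keeps its leading zeros.
code = doc.findtext("code")

# Treated as a decimal, the same kind of text becomes a number.
price = Decimal(doc.findtext("price"))

def is_decimal(text):
    # Rough stand-in for a decimal type check.
    try:
        Decimal(text)
        return True
    except InvalidOperation:
        return False

print(code, price, is_decimal("007"), is_decimal("seven"))
```

So "007" is acceptable both as a string and as a decimal, while "seven" is acceptable only as a string.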



