
XML infoset: please don't

From: Nils Klarlund <klarlund@research.att.com>
Date: 18 Jan 2000 11:45:13 -0500
To: www-xml-infoset-comments@w3.org
Cc: klarlund@research.att.com
Message-ID: <yqz2d7qze3c6.fsf@fish-ha.research.att.com>

Dear working group members:

XML Information Set terminology unfortunately seems to be having
adverse effects.  I just started rereading the XML Schema draft and
choked right away on the sentence:

  "An element information item is the component of an infoset which
   corresponds to an element."

No one should be forced to write like that! Another example,

   "XML Schema: an XML element information item which, along with its
   descendants, satisfies all the Constraints on Schemas in this

This should have been: 

   "XML Schema: an element node which satisfies all the Constraints on

These and many more examples are serious roadblocks to the advancement
of XML; personally, they don't make my blood boil, but among the
public, some are enraged (see recent mailings to comp.text.xml).

I then tried to comprehend what an element information item is by
reading the XML Information Set note.  Nothing really deep, it turns
out: it's a node in a tree representation of an XML document.  My
objection is that there are now two (at least) different tree models
of XML: DOM and XML information sets.  They are both justified, but I
believe they should be unified in what is (or should be) an obvious

* DOM, being the finer model, is the starting point; its tree model is
  the most detailed one, and something any programmer can understand.

* DOM-I trees are obtained from DOM trees by a mapping that converts
  CDATA sections to text and concatenates adjacent text nodes (using
  normalize()), plus a couple of other tricks; it shouldn't be more
  complicated than that.
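
To make the second bullet concrete, here is a minimal sketch of the two
steps it names, using Python's xml.dom.minidom as a stand-in DOM
implementation.  The function name to_simplified_tree is hypothetical,
and the "other tricks" are not attempted; this is only an illustration
of the mapping, not a definition of it.

```python
from xml.dom import minidom, Node


def to_simplified_tree(node):
    """Hypothetical helper: turn CDATA sections into plain text nodes,
    then merge adjacent text nodes with the DOM normalize() method."""
    # A Document has no ownerDocument, so fall back to the node itself.
    doc = node.ownerDocument or node

    def replace_cdata(parent):
        # Iterate over a copy, since children are replaced in place.
        for child in list(parent.childNodes):
            if child.nodeType == Node.CDATA_SECTION_NODE:
                parent.replaceChild(doc.createTextNode(child.data), child)
            else:
                replace_cdata(child)

    replace_cdata(node)
    node.normalize()  # concatenates adjacent text nodes
    return node


# The DOM tree keeps text, CDATA, text as three separate children.
document = minidom.parseString("<a>one <![CDATA[<two>]]> three</a>")
root = document.documentElement
children_before = len(root.childNodes)

to_simplified_tree(root)
children_after = len(root.childNodes)
merged_text = root.firstChild.data
```

In this sketch the element starts with three child nodes in the DOM
view and ends with a single text node, which is the coarser view the
bullet describes.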

Canonical XML can now be explained by a simple transformation from

I would encourage the working group to simplify the XML Information
Set substantially.  Please put stakes through verbiage like "XML element
information item."  And the XML Information Set should be explainable in
one paragraph, starting from DOM.  Then, make this paragraph a part of
DOM2 (along with canonical XML, perhaps).


Received on Tuesday, 18 January 2000 11:44:54 UTC
