RE: Latest version of the Infoset from Richard Tobin on 2001-03-28 (www-xml-infoset-comments@w3.org from January to March 2001)

From: Richard Tobin <richard@cogsci.ed.ac.uk>
Date: Thu, 29 Mar 2001 00:17:59 +0100 (BST)
To: www-xml-infoset-comments@w3.org
Cc: JBoyer@PureEdge.com
Message-Id: <200103282317.f2SNHxE06776@spottisvax.nonet>

The I18N problem with CDATA sections is that a character that can
appear in a CDATA section in one encoding may not be able to in another
(because it doesn't exist there and would have to be a character
reference, which is not recognized inside a CDATA section).
The I18N group raised this issue in (member only)

  http://lists.w3.org/Archives/Member/w3c-xml-core-wg/2000OctDec/0298.html
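
To make the encoding problem concrete, here is a minimal sketch in
Python using the standard xml.etree.ElementTree parser.  The sample
text and variable names are my own assumptions for illustration, not
taken from the I18N group's message.  It shows that a character
reference keeps its meaning in ordinary content but becomes literal
text inside a CDATA section.

```python
import xml.etree.ElementTree as ET

# Illustrative sample text (an assumption): U+00E9 exists in UTF-8
# but cannot be written directly in US-ASCII.
text = "caf\u00e9"

# In UTF-8 the character can appear literally inside a CDATA section.
utf8_doc = (
    '<?xml version="1.0" encoding="UTF-8"?>'
    "<p><![CDATA[" + text + "]]></p>"
).encode("utf-8")

# In US-ASCII the only way to write it is a character reference.
ascii_text = text.encode("ascii", "xmlcharrefreplace").decode("ascii")

# Inside a CDATA section the reference is not recognized as markup ...
ascii_cdata_doc = (
    '<?xml version="1.0" encoding="US-ASCII"?>'
    "<p><![CDATA[" + ascii_text + "]]></p>"
).encode("ascii")

# ... so the CDATA markers have to be dropped to keep the same content.
ascii_plain_doc = (
    '<?xml version="1.0" encoding="US-ASCII"?>'
    "<p>" + ascii_text + "</p>"
).encode("ascii")

parsed_utf8 = ET.fromstring(utf8_doc).text
parsed_cdata = ET.fromstring(ascii_cdata_doc).text
parsed_plain = ET.fromstring(ascii_plain_doc).text

print(repr(parsed_utf8))
print(repr(parsed_cdata))
print(repr(parsed_plain))
```

Only the document without the CDATA section round-trips the original
text in US-ASCII, so the CDATA boundaries cannot in general survive a
change of encoding.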

As far as authoring tools are concerned, the Infoset is just not
sufficient for all the things such a tool may need.  We have to draw
the line somewhere (a document might well be ill-formed during its
creation, for example).  The exclusion of CDATA section markers from
the Infoset should not be taken in any way as preventing tools that
need to process CDATA sections from doing so - they just don't get
the necessary terminology from the Infoset.

You are right that entities were expanded between the start and end
markers.

As far as I remember, the XInclude issue concerned the handling of
ranges that cross entity boundaries - should the boundaries be
fixed up?  If so, they no longer reflect the actual entity content;
if not, they are unbalanced.  XInclude could just have removed entity
boundaries, but I believe that this would be a common problem, and
quite likely more specifications would have to take the trouble to
say that they delete the markers than would use them.
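
The dilemma can be sketched with a toy model.  Everything here is
hypothetical: the marker strings, the flat list representation and the
helper names are assumptions for illustration only, and none of it is
how the Infoset or XInclude actually represents entity boundaries.

```python
# Hypothetical markers for the start and end of an entity's content.
START, END = "<entity-start>", "<entity-end>"

# Flattened content: the entity's replacement text is "b", "c".
content = ["a", START, "b", "c", END, "d"]


def is_balanced(items):
    """Return True if every start marker has a matching end marker."""
    depth = 0
    for item in items:
        if item == START:
            depth += 1
        elif item == END:
            depth -= 1
            if depth < 0:
                return False
    return depth == 0


def fix_up(items):
    """Add the missing markers so that the selection is balanced."""
    missing_starts = 0
    depth = 0
    for item in items:
        if item == START:
            depth += 1
        elif item == END:
            if depth == 0:
                missing_starts += 1
            else:
                depth -= 1
    return [START] * missing_starts + list(items) + [END] * depth


# A range that begins inside the entity and ends outside it.
selected = content[3:6]
fixed = fix_up(selected)

print(selected, is_balanced(selected))
print(fixed, is_balanced(fixed))
```

Left alone, the selection has an end marker with no start.  Fixed up,
it is balanced but claims the entity contains only "c", when its real
content is "b", "c".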

The XML Query group also requested the removal of CDATA and entity
boundary markers.  See the disposal of comments at

  http://www.w3.org/2001/03/infoset-disposition

Your phrase "XML InfoSet be prevented from providing information"
is not really consistent with the purpose of the Infoset.  It does
not require or prohibit the provision of information, it just provides
a common vocabulary for specifications to refer to that information
with.  It was essential to our decisions on both CDATA sections and
entities that we did not expect many future standards to refer to
these.  If it turns out that we were wrong, they can always be
added in a future revision.

-- Richard

Received on Wednesday, 28 March 2001 18:18:08 UTC