[DM] 7.7.3 Construction (of text nodes) from an Infoset from David Carlisle on 2004-07-29 (public-qt-comments@w3.org from July 2004)

From: David Carlisle <davidc@nag.co.uk>
Date: Thu, 29 Jul 2004 16:11:41 +0100
To: public-qt-comments@w3.org
Message-Id: <200407291511.QAA26873@penguin.nag.co.uk>

I commented here:

http://lists.w3.org/Archives/Public/public-qt-comments/2003Dec/0085.html

on an earlier draft of DM, noting that the handling of white space was
dangerously (explicitly) undefined.

That part of the document has changed out of all recognition
but the relevant details appear now to be in section 
7.7.3 on constructing a text node from an infoset.

Section 7.7 leaves me very confused.

It starts by saying:

    Text Nodes must satisfy the following constraint:

    If the parent of a text node is not empty, the Text Node must not
    contain the empty string as its content.

which is fine

but

7.7.3 Construction from an Infoset

says

   If the resulting Text Node consists entirely of white space and the
   Text Node occurs in Element content, the content of the Text Node
   is the empty string.

If the Text Node is in Element content then it (presumably) has some
element as its parent, so being the empty string would invalidate the
quoted constraint.

Perhaps what was meant was that 

   If the resulting Text Node would consist entirely of white space and
                              ^^^^^
   occurs in Element content, then no node is constructed.
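
To make that reading concrete, here is a minimal sketch of the rule as
reworded above. It is not taken from the DM draft: the function name
construct_text_node and the in_element_content flag are hypothetical,
and a Text Node is modelled as a plain string.

```python
from typing import Optional

# The four characters XML 1.0 treats as white space.
XML_WHITESPACE = " \t\r\n"


def construct_text_node(chars: str, in_element_content: bool) -> Optional[str]:
    """Return the Text Node content, or None when no node is constructed.

    Hypothetical model: a Text Node is represented by its string content.
    """
    # An empty Text Node would violate the constraint in 7.7, so none is built.
    if chars == "":
        return None
    # Reworded rule: white-space-only text in element content yields no node,
    # rather than a node whose content is the empty string.
    if in_element_content and chars.strip(XML_WHITESPACE) == "":
        return None
    return chars
```

Under this sketch a call such as construct_text_node("\n  ", True) yields
no node at all, so the constraint on empty content is never put at risk.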


However this would still be confusing.

It says "The Text Node occurs in Element content" but the Text Node
is the thing that's being constructed in the instance of this data
model. The XML spec says nothing about that. What is presumably meant is 
that the element in the original XML document on which the infoset is
based has element content, but that can't be the case unless it was
parsed with a DTD validating parser.

If that is what is intended, that DTD-valid documents have white space
in declared element content removed, that is (I suppose) better than the
previous draft as it is at least definite, not left up to the
implementation, but it still introduces large differences in behaviour
between XPath 1 and XPath 2 where by default this white space is always
seen.
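
As an illustration of white space being seen by default, the sketch
below uses Python's xml.dom.minidom, a non-validating DOM builder. It
is only an analogy for the XPath 1 data model, not an XPath
implementation, and the helper name child_summary is hypothetical.

```python
from xml.dom import minidom


def child_summary(xml_text: str):
    """List the children of the document element as (kind, value) pairs."""
    doc = minidom.parseString(xml_text)
    summary = []
    for child in doc.documentElement.childNodes:
        if child.nodeType == child.TEXT_NODE:
            summary.append(("text", child.data))
        elif child.nodeType == child.ELEMENT_NODE:
            summary.append(("element", child.tagName))
    return summary


# No DTD is involved, so the parser cannot know whether <a> has element
# content; the white space around <b/> is kept as ordinary text nodes.
print(child_summary("<a>\n  <b/>\n</a>"))
```

The white-space-only text on either side of the b element shows up as
two text children of a, which is the behaviour existing stylesheets
rely on.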

It would be preferable for the DM never to remove such white space (by
default) and leave it to higher level switches such as <xsl:strip-space>
to remove the nodes (whether or not they are in declared element content).
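
A rough sketch of that layering, again using xml.dom.minidom as a
stand-in for the data model: the tree is built with all white space
intact, and stripping is a separate, optional pass. The function name
strip_whitespace_text is hypothetical; it behaves roughly like
xsl:strip-space with elements="*", ignoring xml:space.

```python
from xml.dom import minidom

XML_WHITESPACE = " \t\r\n"


def strip_whitespace_text(node) -> None:
    """Remove white-space-only text nodes below node, in place."""
    # Iterate over a copy because removeChild mutates childNodes.
    for child in list(node.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if child.data.strip(XML_WHITESPACE) == "":
                node.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_whitespace_text(child)


# Construction keeps every text node; stripping happens only on request.
doc = minidom.parseString("<a>\n  <b> </b>\n  <c>text</c>\n</a>")
strip_whitespace_text(doc.documentElement)
stripped_xml = doc.documentElement.toxml()
```

Because the pass runs after construction, it applies equally to
documents with and without a DTD, and a caller who wants the white
space simply does not run it.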

David (even more confused than usual:-)




Received on Thursday, 29 July 2004 11:12:12 UTC