- From: Michael Rys <mrys@microsoft.com>
- Date: Fri, 5 Dec 2003 11:57:27 -0800
- To: "David Carlisle" <davidc@nag.co.uk>, <public-qt-comments@w3.org>
I think the indented meaning of the remark about insignificant white space is: Boundary whitespace (maximal character information sequence that only consist of whitespace) may be considered insignificant by fn:doc() if xml:space is not set to "preserve". So this is independent of the element content whitespace notion and inline with my interpretation of xml:space="default" handling of XML 1.0 (the data model generation being an application that takes xml:space into account). And I am (obviously) strongly opposed to changing the semantics for XSLT 2.0 in a non-backwards compatible way. Especially if it starts impacting the data model. Also note that the section you comment on is not specific to fn:doc() or XSLT 2.0. If you want to push for a change, please propose it as part of the semantics of fn:doc(). Best regards Michael > -----Original Message----- > From: public-qt-comments-request@w3.org [mailto:public-qt-comments- > request@w3.org] On Behalf Of David Carlisle > Sent: Friday, December 05, 2003 8:30 AM > To: public-qt-comments@w3.org > Subject: [DM] white space > > > > > The doc() function in F&O (and indirectly the document() function in > XSLT) specify that if the representation of a resource returned from > some URI is an XML file then the input tree should be constructed as > specified in DM, modulo some specific implementation dependent features > such as which uri schemes are supported. > > In DM it says: > > 6.7.3 Construction from an Infoset > > Applications may construct text nodes in the data model to represent > insignificant white space. This decision is considered outside the scope > of the data model, consequently the data model makes no attempt to > control or identify if any or all insignificant white space is ignored > > > This appears to be contradictory. Unless the document has been validated > (and so some element is known not to have mixed content) all space is > significant. But this is describing building a datamodel from the > infoset not from the PSVI, so it hasn't been schema validated at least, > and I'm not sure if the DM really takes note of DTD validation as > currently written. > > > The only occurrence of the word "significant" in the infoset document is > > White space within start-tags (other than significant white space in > attribute values) and end-tags. > > which clearly is irrelevant here. > > > In current XSLT1 applications more or less the only significant > incompatibility between implementations (baring bugs) is msxsl's > tendency to drop spaces. (If called from an API a more conforming > behaviour can be specified, but notably _not_ if called via the > xml-stylesheet PI) This means that the (in most ways excellent) msxsl > implementation will render an xml fragment such as > <p><b>Bold</b> <span>words</span> <i>italic</i></p> > as > Boldwordsitalic > if given an "identity transform" to html as it will decide that > inter-word spaces are insignificant. Arguably this is conformant (if > confusing) behaviour as XSLT/XPath 1 said essentially nothing about how > the > tree should be built. I believe that in version 2 of the language it is > clear that the wording should be clarified so that this unfortunate loss > of interoperabiliy (and usability) is clearly not allowed without some > specific user-option that requests it. > > > I fear that the wording in 6.7.3 was intended to authorise the dropping > of the interword spaces in my <p> example. It fails to do that as > it refers to a term "insignificant white space" that is apparently > undefined, however I believe that the comment should be deleted rather > than fixed. It is an unnecessary optional clause to stop > interoperability, systems storing documents in efficient database > storage forms can construct the data model instance in any way they > like, there is no need to allow systems that are parsing explict XML > documents to have the same flexibility. > > > there is some discussion of this on xml-dev > > http://lists.xml.org/archives/xml-dev/200307/msg00148.html > > (and any number of posts on xsl-list where users have fallen into this > trap and asking where their spaces went, or why some node count that > went 1,2,3 on msxsl goes 2,4,6 on every other processor) > > > David > > ________________________________________________________________________ > This e-mail has been scanned for all viruses by Star Internet. The > service is powered by MessageLabs. For more information on a proactive > anti-virus service working around the clock, around the globe, visit: > http://www.star.net.uk > ________________________________________________________________________ >
Received on Friday, 5 December 2003 14:57:45 UTC