- From: Nils Klarlund <klarlund@research.att.com>
- Date: Tue, 8 Feb 2000 15:32:03 -0500
- To: <xsl-editors@w3.org>
- Cc: "klarlund" <klarlund@research.att.com>
I believe that the way CDATA sections are treated in XPath/XSLT is not compatible with the latest Errata to XML 1.0 (http://www.w3.org/XML/xml-19980210-errata).

Moreover, the way CDATA sections are treated makes it impossible to adopt a simple view of XML, namely to remove all whitespace nodes, without a provable loss of expressive power! This radical pruning view is desirable for many applications, especially database applications, but also for document-oriented processing, where the usual semantics, which introduces tons of whitespace nodes, is an aesthetic and practical problem.

The problem is that even a very explicitly marked piece of whitespace such as <![CDATA[ ]]> is eaten up unless it is accompanied by non-whitespace characters. So I can't insert spaces between nodes! In other words, assuming it is unreasonable for a DTD or an application to decide which whitespace nodes are real and which are not, I'm in trouble: I want to prune all whitespace nodes except those that I mark as important.

Clearly, as the section quoted below indicates, XML 1.0 makes a semantic distinction between ' ' and <![CDATA[ ]]>. Because XPath does not preserve that distinction, XSLT cannot be used to determine whether some content is "element content". Is it not an error to water down XPath to that point?

I suggest that the stripping of whitespace nodes explicitly exclude nodes obtained from or involving CDATA sections.

Thanks

/Nils

From the Errata:

Section 3

Change item number 2 of the list of valid cases for the "Element Valid" VC to read: The declaration matches children and the sequence of child elements belongs to the language generated by the regular expression in the content model, with optional white space (characters matching the nonterminal S) between the start tag and the first child element, between child elements or between the last child element and the end tag. Note that a CDATA section containing only white space does not match the nonterminal S, and hence cannot appear in these positions.
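To make the suggestion above concrete, here is a minimal sketch of the pruning rule I have in mind, written in Python only for illustration. It assumes a DOM that keeps CDATA sections as distinct nodes (Python's xml.dom.minidom does; the XPath data model does not), and the function name strip_whitespace_nodes is hypothetical, not part of any specification or library.

```python
from xml.dom import Node
from xml.dom.minidom import parseString

XML_WHITESPACE = " \t\r\n"  # characters matching the nonterminal S in XML 1.0


def strip_whitespace_nodes(node):
    """Remove whitespace-only text nodes below `node`, keeping CDATA sections.

    Hypothetical helper: it prunes ordinary text nodes made up solely of
    S characters, but never touches a node that came from a CDATA section.
    """
    # Copy the child list first, because we remove children while iterating.
    for child in list(node.childNodes):
        if child.nodeType == Node.TEXT_NODE:
            if child.data.strip(XML_WHITESPACE) == "":
                node.removeChild(child)
        elif child.nodeType == Node.ELEMENT_NODE:
            strip_whitespace_nodes(child)
        # CDATA_SECTION_NODE (and every other node type) is left alone.


doc = parseString("<a><b/>\n  <![CDATA[ ]]>\n  <c/></a>")
strip_whitespace_nodes(doc)
print(doc.documentElement.toxml())  # prints <a><b/><![CDATA[ ]]><c/></a>
```

With this rule, the indentation between the elements disappears while the explicitly marked space survives, which is the behavior I would like the stripping of whitespace nodes to allow.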
Received on Tuesday, 8 February 2000 15:37:50 UTC