- From: <bugzilla@wiggum.w3.org>
- Date: Thu, 19 Jan 2006 14:04:25 +0000
- To: public-qt-comments@w3.org
- Cc:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=2729

Summary: Whitespace text nodes
Product: XPath / XQuery / XSLT
Version: Candidate Recommendation
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: Data Model
AssignedTo: Norman.Walsh@Sun.COM
ReportedBy: davidc@nag.co.uk
QAContact: public-qt-comments@w3.org

This is essentially a re-raising of bug #1309, which was explicitly deferred for comment on the CR drafts.

I agree with the requirement to strip white space text nodes in trees built from schema-validated input. This report concerns only the default mapping from a non-schema-validated infoset.

The requirement to strip white space text nodes from elements declared in a DTD introduces a large incompatibility between XPath 1 and XPath 2. This incompatibility is highlighted in the XSLT draft (J.1.1) but not in the XPath draft. If no changes are made to the specification to remove the incompatibility, then wording similar to XSLT J.1.1 should be added to XPath I.1, as otherwise the small list of edge cases in appendix I.1 gives a rather over-optimistic view of the compatibility between the two versions.

However, perhaps even more important than compatibility between XPath 1 and XPath 2 is compatibility between XPath 2 (and XQuery) systems. The current requirement makes such compatibility rather hard to achieve.

Typically a system will document which XML parser it uses, or give the user a choice of which to use, or give a choice of whether to use the parser in non-validating or validating mode. If a validating parser is used, the [element content whitespace] property will be reported, so in this case all XPath 2 (and XQuery) systems will act in the same way. That way is incompatible with XPath 1, but it is something I could "live with" (in W3C working group consensus-speak).
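As a rough illustration of what is at stake (this sketch is not part of the original report), the Python fragment below counts the children of an element before and after whitespace-only text nodes are removed. The sample document and the helper names `strip_whitespace_text_nodes` and `child_count` are made up for illustration. It uses Python's `xml.dom.minidom` without any DTD, and the stripping is done by hand to mimic a data model built from an infoset in which the [element content whitespace] property was reported.

```python
from xml.dom import minidom

# Hypothetical sample: an element whose children are separated by indentation.
SAMPLE = "<list>\n  <item>a</item>\n  <item>b</item>\n</list>"

XML_WHITESPACE = " \t\r\n"


def strip_whitespace_text_nodes(element):
    """Remove whitespace-only text nodes below `element`.

    Assumption: every element here has element-only content. A real
    implementation would strip only where the parser reported the
    [element content whitespace] property.
    """
    for child in list(element.childNodes):
        if child.nodeType == child.TEXT_NODE:
            if child.data.strip(XML_WHITESPACE) == "":
                element.removeChild(child)
        elif child.nodeType == child.ELEMENT_NODE:
            strip_whitespace_text_nodes(child)


def child_count(xml_text, strip):
    """Return the number of child nodes of the document element."""
    root = minidom.parseString(xml_text).documentElement
    if strip:
        strip_whitespace_text_nodes(root)
    return len(root.childNodes)


if __name__ == "__main__":
    print(child_count(SAMPLE, strip=False))  # whitespace text nodes kept
    print(child_count(SAMPLE, strip=True))   # whitespace text nodes stripped
```

In XPath terms, an expression such as count(/list/node()) or /list/node()[1] gives different answers on the two trees, so two systems that disagree on stripping will disagree on such queries.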
However, traditionally the most common type of parser used with XSLT (in particular) has been a non-validating parser that reads a DTD, as the structure of the XSLT language means that this type of parser is more or less required to read the XSLT file, and typically the same parser is used on input documents. For this kind of parser there is, as far as I can tell, no specification at all that says whether it should or should not report the [element content whitespace] property on elements for which it has read a DTD declaration. So typically a user will have no way of knowing whether or not white space will be stripped, and no way of changing the behaviour if it is unwanted.

Incompatibility with XPath 1 is something that will hopefully become less important over time, but incompatibility between different XPath 2/XQuery systems is something that should be avoided if at all possible.

I offer three options:

A. Do not change the specification. In this case, the XPath compatibility appendix should document the incompatibility.

B. Change the requirement to strip white space nodes so that it only applies to infosets constructed by a _validating_ XML parser (DTD validated, so that if you validate with a DTD, the whitespace behaviour matches that of schema validation).

C. Remove the requirement to strip white space when building from an Infoset (keeping it in the case of building from a PSVI).

The status quo (A) has the largest incompatibility with XPath 1 and introduces similarly large incompatibilities between XQuery and XPath 2 systems running on different XML parsers.

Taking either option (B) or (C) would cause all XPath 2 and XQuery systems to work the same way. Option (C) is the most compatible with XPath 1, and the one that I personally prefer, but perhaps option (B) would be a useful compromise position that should be considered.

David
Received on Thursday, 19 January 2006 14:04:27 UTC