- From: <bugzilla@wiggum.w3.org>
- Date: Sat, 07 May 2005 15:44:45 +0000
- To: public-qt-comments@w3.org
- Cc:
http://www.w3.org/Bugs/Public/show_bug.cgi?id=1309
Summary: white space in the DM
Product: XPath / XQuery / XSLT
Version: Last Call drafts
Platform: PC
OS/Version: Windows XP
Status: NEW
Severity: normal
Priority: P2
Component: Data Model
AssignedTo: Norman.Walsh@Sun.COM
ReportedBy: davidc@nag.co.uk
QAContact: public-qt-comments@w3.org
Some of the following issues have been raised on earlier drafts but it
seems safest to raise them again as last call issues in bugzilla.
6.7.3 Construction [of text nodes] from an Infoset
says
If the resulting Text Node consists entirely of white space and the
Text Node occurs in Element content[XML], the content of the Text Node
is the zero-length string.
The reference to the XML element content production is inappropriate, as
the input to this procedure is an infoset rather than a literal XML
document. The [element content whitespace] infoset property is flagged
a few lines up as being optionally used, so this could say:
If the resulting Text Node consists entirely of characters with an
[element content whitespace] property with value true, the content
of the Text Node is the zero-length string.
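A minimal Python sketch of this proposed rule, for illustration only. The representation of a text node's characters as (character, flag) pairs and the function name text_node_content are hypothetical; the flag stands for [element content whitespace], with None meaning unknown.

```python
def text_node_content(chars):
    """Apply the proposed rule to a list of (character, ecw) pairs.

    ecw models the [element content whitespace] property:
    True, False, or None for "unknown" (an assumed encoding).
    """
    # Only when every character is flagged true is the content dropped.
    if all(ecw is True for _, ecw in chars):
        return ""
    return "".join(ch for ch, _ in chars)
```

Under this reading, a parser that reports the property as unknown leaves the white space in place, while one that reports true removes it.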
This would make the document consistent. However, with either wording,
this clause introduces a very large incompatibility with XPath1.
I think it would be better to drop this clause altogether. Systems
requiring white space nodes to be dropped can use the PSVI mapping
or a proprietary mapping to the data model, neither of which has any
XPath1 compatibility implications.
Dropping white space from declared element content from schema
validated (PSVI) input makes sense and is something that could be
tested in a conformance test. Dropping white space from the infoset
mapping if [element content whitespace] is reported isn't really
testable, as non-validating parsers may or may not report this property
and don't need to document whether they do or they don't.
As it stands, this means that given
<!DOCTYPE x [
<!ELEMENT x (x*)>
]>
<x>
<x/>
<x/>
</x>
a simple XPath of /x/node()[2] is completely undefined: it may pick up
the first or the second empty x element.
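The two outcomes can be reproduced with a short Python sketch. It assumes Python's xml.dom.minidom, which keeps white-space-only text nodes, and simulates a processor that drops them with the hypothetical helper children_stripped. Neither is part of the data model spec; this only illustrates the ambiguity.

```python
from xml.dom import minidom

SOURCE = """<!DOCTYPE x [
<!ELEMENT x (x*)>
]>
<x>
<x/>
<x/>
</x>"""

def children_kept(root):
    # White space retained, as in the XPath1 data model.
    return list(root.childNodes)

def children_stripped(root):
    # Hypothetical: simulate a processor that drops white-space-only text nodes.
    return [n for n in root.childNodes
            if not (n.nodeType == n.TEXT_NODE and n.data.strip(" \t\r\n") == "")]

def second_node(nodes):
    # XPath positions are 1-based, so node()[2] is list index 1.
    return nodes[1]

root = minidom.parseString(SOURCE).documentElement
elements = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
kept = second_node(children_kept(root))          # the first <x/> child
stripped = second_node(children_stripped(root))  # the second <x/> child
```

With white space kept, the outer x has five children (text, x, text, x, text), so position 2 is the first empty x. With white space dropped it has two, so position 2 is the second.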
If this clause is kept, it should be highlighted here that it is
incompatible with XPath1's data model, and the XPath (and XSLT)
Compatibility appendices should also mention this.
For the reverse mapping
6.7.5 (and J7) states that all characters get mapped to infoset items
with [element content whitespace] of unknown.
The infoset has a constraint that all non-white-space characters have a
value of false for this property.
http://www.w3.org/TR/xml-infoset/#infoitem.character
says: ... It is always false for characters that are not white space.
So I think the mapping from the DM to the infoset should set this
property to false or to unknown depending on whether the character is
white space.
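A minimal sketch of that suggested mapping, in Python. The function names and the use of the strings "false" and "unknown" as property values are illustrative assumptions; the white space set is the four characters XML defines as white space (space, tab, carriage return, line feed).

```python
# Characters that XML treats as white space: #x20, #x9, #xD, #xA.
XML_WHITE_SPACE = frozenset(" \t\r\n")

def element_content_whitespace(ch):
    # Non-white-space characters must be false per the infoset constraint;
    # for white space the data model cannot tell, so report unknown.
    return "unknown" if ch in XML_WHITE_SPACE else "false"

def character_items(text):
    # Hypothetical helper: one (character, property value) pair per character.
    return [(ch, element_content_whitespace(ch)) for ch in text]
```

For example, character_items("a b") gives false for "a" and "b" and unknown for the space between them.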
David
Received on Saturday, 7 May 2005 15:44:56 UTC