- From: Ray Whitmer <rayw@netscape.com>
- Date: Mon, 01 Apr 2002 13:26:26 -0800
- To: www-xml-query-comments@w3.org
This was just a quick scan, again. I have not looked at XPath 2.0 itself, only at the data model.

* It seems clear that the XPath 2.0 specification has no type comparable to the node set or the other built-in types of XPath 1.0. The concept of a typeless sequence does not seem to work as effectively. In many languages, arrays of objects are typed. Although some people use untyped languages, those who rely on a certain level of typing are likely to complain when they lose it, as is happening in this case. It is distressing to have to worry that your array of matching nodes might have strings interspersed in it, and applications that in XPath 1.0 relied on receiving sets containing only nodes will not be able to deal compatibly with a model that can no longer give that guarantee.

* XPath 1.0 was based on explicitly unordered sets of nodes that could be accessed in order. XPath 2.0 claims that every sequence is ordered, but there is not sufficient discussion of what that means, which has caused significant confusion. One could logically conclude that it is referring to document order, which is the only order it seems to define and was the order of XPath 1.0, but this makes no sense for the non-node items now possible in the result sets. The incompatible treatment of duplicates is also confusing: if the sets are now ordered rather than unordered, it seems pointless not to eliminate the duplicates, but there is probably something lost between the different versions of the specification. Based upon recent discussions, it seems that the XPath 2.0 specification may not be comparable or compatible with the XPath 1.0 specification in its use of these terms, but the specification needs better treatment of the concepts and an explanation of the impact on backwards compatibility. Elimination of duplicates also seems like a significant compatibility problem, since 1.0 implementations went to great lengths to accomplish this.
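To make the compatibility concern in the two points above concrete, here is a minimal Python sketch. It is not XPath and is not taken from either specification: `Node`, `doc_pos`, `as_node_set`, and `as_sequence` are hypothetical names invented for illustration. It only contrasts a node-only, duplicate-free result read in document order with a sequence that keeps the given order, duplicates, and mixed item types, as the message describes them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical stand-in for a document node."""
    name: str
    doc_pos: int  # assumed position of the node in document order


def as_node_set(items):
    """Node-set-style result: nodes only, no duplicates, read in document order."""
    for item in items:
        if not isinstance(item, Node):
            raise TypeError(f"node set cannot contain non-node item: {item!r}")
    return sorted(set(items), key=lambda n: n.doc_pos)


def as_sequence(items):
    """Sequence-style result: keeps the given order, duplicates, and any item type."""
    return list(items)


a, b = Node("a", 1), Node("b", 2)
raw = [b, a, b]

node_set = as_node_set(raw)             # duplicates removed, document order
sequence = as_sequence(raw + ["text"])  # order and duplicates kept, string mixed in
```

A caller written against the first function can assume every item is a node and appears once; a caller handed the second kind of result has to check each item itself, which is the loss of guarantee the first point objects to.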
* The copy semantics of node constructors seem wrong, even if they were the only way to model the Lisp semantics that the authors of XPath 2.0 seem to be using throughout the specification. It would seem that a constructed node should not lose its identity when inserted into a hierarchy, but XPath 2.0 seems to mandate that.

* Section 4.1, collapse-text-node: what is the parent of the text node resulting from the collapse operation? What if the nodes of the operation have parents that are different elements, or that are in different documents? The example given using sequence-map claims to construct a new sequence of children nodes. Children of what? When it "collapses nodes", does this mutate the original node? If not, then a complete parallel hierarchy is required to accommodate this new node, because it cannot become a child of any existing node, nor can its ancestors. In any case, the wording of the specification is internally inconsistent in describing this function.

* "Descendant nodes" is used but not defined. Because XPath's confused use of parent relationships contradicts the infoset and other models such as DOM, this is important, and it can be unclear whether the term includes attributes, namespaces, etc. where it is used.

* That there should be document order between documents seems strange. This makes the ordering of namespace nodes all the more bizarre, because they belong to no document and presumably may be shared between documents, so coming at the start of a document or (I can't say I follow the logic in this one) ordering after every other node in the document both seem impossible and broken. In every other case, there is some relationship between the objects being ordered (excluding, again, namespaces, which seem to be global between documents now).
Requiring document order between documents to be stable requires much better document identification than we have today, because if a document is persisted and brought back into memory, which can happen at any time during processing, you need to be able to go back to something to reestablish the sort in the same way.

* The model claims: "The data model does not support XML documents that are not supported by the XML Information Set, for example, non-well-formed documents and documents that don't conform to XML Namespaces." But the constructors seem perfectly able to construct objects which are not well-formed, for example by putting "--" into the text of a comment node, or other illegal characters generally anywhere.

* The model appears to make it possible to construct text nodes that have empty strings, elements with multiple adjacent text nodes, and other non-normalized result trees. There needs to be a section on what happens in those cases, since XPath is inventing its own programming model here that is different from the infoset and all other models such as DOM.

* The model appears to make it possible to construct hierarchies which are not namespace-well-formed, but makes no mention of how processing will occur in those cases. At the very least, an attribute fragment is not namespace-well-formed if it uses namespaces. And the whole concept of how to construct elements properly with namespace nodes seems quite muddy. It would seem to require complete knowledge of all of an element's ancestors in order to specify a list of namespaces that is consistent with them, since it would seem to be an error ever to pass a child to the constructor of a parent when the child does not already contain all the namespace nodes of the parent: XML has no ability to undefine namespaces, so this would represent an impossible infoset. But the spec seems quite silent on this issue.
It would seem that the ancestor should be created first, not the child as the current API dictates, or that convenience methods are required to construct the hierarchy correctly, because this problem will arise whenever an element is constructed as a child of an element.

* In general, it is not clear what is constructed if the constructors are called in such a way as to produce non-well-formed results or results that cannot be expressed as XML.

* The namespace node issues of ownership, order, identity, and backwards compatibility have not been resolved, nor has a complete solution been proposed.

* The list was longer, but a number of my items duplicated your existing issues. It is hard enough to get a good feeling for what the model looks like without getting lots of these resolved.

I might suggest that you thoroughly study the DOM specification, where you will find many more border cases you have missed. Construction of a hierarchy using an API is the same problem that DOM solves.

Certainly more to come when we can get some of the basics satisfactorily resolved.

Ray Whitmer
rayw@netscape.com
Received on Monday, 1 April 2002 16:26:12 UTC