- From: Michael Kay <mhkay@iclway.co.uk>
- Date: Fri, 15 Jun 2001 16:03:33 +0100
- To: <w3c-xsl-query@w3.org>, <www-xml-query-comments@w3.org>
Another round of comments on this steadily-improving document:

1. The abstract and the first sentence of the introduction state (if you follow the link) that this is the data model of XSLT 1.0; it isn't.

2. Section 3.2 states that in the document ordering, the namespace nodes of an element follow the element but precede its attributes. This is inconsistent with the idea, suggested but not spelled out in 4.4, that a namespace node can be shared by several elements. In fact, the question of namespace node identity is not really tackled. My view is that namespace node identity should be determined by the combination of (document identity, namespace prefix, namespace URI), that the parent of a namespace node should be the document node, and that namespace nodes should be ordered after every other node in the document. (This is easier for implementations than placing them at the start of the document, because the number of namespace nodes is not known until parsing is complete.)

2a. Section 3.2: the second paragraph contains two sentences, the second of which starts "In other words". But the two sentences seem to be making quite separate points, both of them valid.

2b. Section 3.2: does the concept of document order apply to nodes that are not part of a document, i.e. nodes that belong to a tree whose root is not a document node? How can document order be stable in such cases, when the constructor functions allow a node to be added to a tree as a separate operation from creating the node?

3. Section 3.3 states that the data model does not support non-well-formed documents, but section 4.1 states "the data model is more permissive: it permits more than one element node as a child and also permits text nodes as children".

4. In section 4, I think the note that attempts to explain the difference between XPath 2.0 document nodes and XPath 1.0 root nodes is spurious. There may turn out to be differences in usage, but at the level of the data model, they are identical. A more important difference to highlight, and one that justifies the change in terminology, is that XPath 2.0 trees may have a node other than a document node as their root. (Though I question whether this is actually a good idea...)

5. In section 4, it is stated that an attribute contains "a sequence of simple-typed values", whereas an element may contain either a simple-typed value or a sequence of simple-typed values. This appears to make a distinction that doesn't actually exist: in both cases, a singleton is a special case of a sequence.

6. In section 4, it is stated that an expanded QName contains a namespace URI. It may contain no namespace URI. No accessor functions for obtaining the two parts of an expanded QName are provided.

7. In section 4.2 Elements, the notion that the constructor makes a copy of the supplied child nodes seems strange. It's hard to square this with the definition of node identity. Also, I don't see why the provision is needed here, but not for the document node constructor. Wouldn't it be cleaner to define a precondition that all the child nodes supplied to the constructor must be parentless?

8. In section 5.1, the notion that you can get from an ID or anyURI value to an Element node seems to assume that the ID or anyURI primitive value carries information about what document it came from. I'm not sure this is realistic. Does it mean, for example, that the ID "X123" in one document is not equal to the ID "X123" in a different document? Related to this, this section uses the phrase "a document that is not contained in the data model". This seems to imply some kind of closed-world (or "database") assumption, namely that there exists some finite collection of documents associated with the data model (and even that it's a containment relationship, which means a document cannot be associated with two different data models). But hang on, surely there is only one data model, the one defined by this specification?

9. Section 5.2, on "derived simple values", contains statements which seem to apply to all simple values, not only derived ones.

10. It would be useful if section 6 (Sequences) established terminology for describing the members of the sequence. My preference would be "members". There is also a need for a term that is generic over nodes and simple-values; the document uses "unit values", which doesn't seem very nice. I'd suggest: "An item is a node or a simple-value. The items contained in a sequence are referred to as the members of the sequence".

11. In section 6, head would appear to be a partial function: it does not apply to empty sequences. If we follow the same conventions as elsewhere, that means head returns a Sequence(0,1)<item>, which perhaps raises the question of how you extract the first member of this sequence...

12. If the string-value of a sequence is the concatenation of the string-values of its members, then the string-value of an empty sequence is an empty string; which I like, but which violates the general rule that any function applied to an empty sequence returns an empty sequence. The consequences of this depend on how the query semantics make use of the concept of a sequence having a string-value. It might be worth pointing this out.

13. In section 8, the accessor "parent" returns a sequence of zero or one SchemaComponents. But a SchemaComponent is not a node or a simple-value, so it cannot appear in a sequence.

14. Section 9 states "we assume that equality over simple-values is defined". This seems an optimistic assumption.

Keep up the good work!

Mike Kay
Received on Friday, 15 June 2001 11:01:31 UTC