- From: Michael Kay <mike@saxonica.com>
- Date: Sat, 22 Sep 2007 22:58:45 +0100
- To: <public-xml-processing-model-comments@w3.org>
1. Editorial, section 5.1.1. It's not clear when reading the proforma for p:input here that there are other proformas for the same element elsewhere.

2. Technical/Editorial. Section 5.1.2. I don't really know if this is a technical problem or an editorial one. The parameter mechanism seems extremely confusing. Perhaps it just needs to be explained better, perhaps the design needs to be improved. I don't really understand it well enough after several readings to know.

3. Technical. Section 5.7.3. The default namespace bindings for option and parameter values seem horribly ad-hoc. This is trying to make intelligent choices, but I fear it will just be confusing. Also, I don't see how option 2 can work. It seems to suggest that you have to evaluate the select expression to find some nodes, whose namespaces then form the context for that select expression. This can't be right, I must have misunderstood something. Overall, the whole namespace handling here is horribly messy. Perhaps messiness and namespaces are inevitable bedfellows, but I think this area deserves further thought.

4. Clarification, section 5.13, para 5. Need to explain "any other processing". Other than what? Does validation here include both schema and DTD validation? (I'm not sure this is practical. Some parsers always do DTD validation if there is a reference to a DTD.) Why are p:document and p:load distinct?

5. Technical (Requirements). Section 7.1.1 et al. It's not clear to me that it's desirable for XProc to define fine-grained update operations like this. It seems to be crossing the boundary from a pipeline processor to yet-another-transformation-language. I think these operations can be adequately performed by invoking XSLT or XQuery (especially XQuery with updates), and that is the approach that should be taken.

6. Clarification. Section 7.1.5. The spec says that delete deletes, but it needs to explain that this is a deletion of the subtree rooted at a selected node.
If we're going to provide fine-grained updating like this, restricting it to use an XSLT 1.0 match pattern (which can't reference any variables) seems to severely limit its utility. (Also, the phrase "the resulting document with the deletions" seems clumsy. "after the deletions" would seem better.)

7. Typos. Section 7.1.6. Bullet 4, "criteria ... is" [->are]. Last-but-one para, "each ... has ... when they appear" [->when it appears]. Final para, "attributes ... is" [->are].

8. Technical. Section 7.1.12. It is not clear why p:label-elements differs from other similar step types by taking a select attribute rather than a match attribute. There seems no logical reason why insert, delete, etc. shouldn't all take a select attribute.

9. Clarification. Section 7.1.13. What exactly is "namespace-aware DTD validation"? I thought DTDs were never namespace aware.

10. Technical. Section 7.1.18. Rename is under-specified. There have been considerable efforts in the XQuery WG to specify a workable rename operation. The questions are (a) how to deal with the case where the new name of an attribute is the same as that of an existing attribute, (b) whether to add or remove namespaces, (c) what to do about namespace prefixes.

11. Technical. Section 7.1.19. Replace. The functionality seems to be a subset of Viewport. Is a separate step type really needed? (This is also true for Delete.)

12. Terminology. Section 7.1.25. Unescape markup. This seems a rather convoluted name for the operation usually called parsing. Also, the options "encoding" and "charset" seem poorly named, since the value of charset is what one would normally call an encoding.

13. Ambiguity. Section 7.1.26, last para, "may not" => "might not".

14. Technical. Section 7.1.27, Wrap. This only appears to be well-defined in the case of element nodes. Text nodes, PIs and comments will work, except for the group-adjacent provision.

15. Technical/Political. Section 7.1.30, XSLT.
What happens if the stylesheet is an XSLT 2.0 stylesheet but is not a valid XSLT 1.0 stylesheet? It seems very short-sighted for a new W3C specification to mandate an obsolete version of another W3C specification. Within a couple of years there may well be environments that do not support an XSLT 1.0 processor, which will make it difficult/expensive/impossible to implement a conformant XProc processor. This also applies throughout to XPath. I would think that a better solution here is to define a single step type XSLT, with a version parameter 1.0 or 2.0, having an implementation-defined default, and to say that an XProc processor must support either version 1.0 or 2.0 or both.

16. Technical. Section 7.1.30, XSLT. How can the transformation use a non-XML output method such as "text" if its serialization parameters on xsl:output are ignored?

17. Technical. Section 7.2.2 Schematron. It doesn't feel right to me to treat assertion failures as errors. Perhaps there should be an assert-valid attribute as in XML Schema Validate.

18. Nomenclature. Section 7.2.3 XML Schema Validate. I have already remarked that p:validate-xml-schema seems a poorly chosen name for a step that validates an instance.

19. Technical. Section 7.2.3 XML Schema Validate. "Set of schemas" in para 4 should be "Set of schema documents". Processors should be allowed to obtain schema components from sources other than these schema documents if available. It's not clear how the validation is carried out in terms of the various options provided in the XML Schema spec to initiate validation. It's desirable to allow an initial element declaration or type to be nominated, so that you can test not only that the document is valid but that it is valid against a particular element declaration or type. It's desirable to allow an option to indicate whether xsi:schemaLocation attributes within the instance should be used or not.

20. Technical. Section 7.2.3 XML Schema Validate.
Leaving it implementation-defined whether PSVI annotations can be passed down the pipeline seems an interoperability nightmare. Better for users to say whether they expect this or not, and for a dynamic error to occur if it's requested but not supported.

21. Technical. Section 7.2.4 XQuery.
(a) You need to say much more about the static and dynamic context of the query.
(b) Many queries will operate on a single document; supplying this as the default collection seems clumsy.
(c) It seems wrong, when the query returns elements selected from the source document, to wrap these in document nodes, which entails copying the elements and losing their identity and parentage: though that perhaps suggests a different mode of running XQuery in which it is used as an alternative to XPath for selecting nodes rather than transforming them to new nodes.
(d) It seems odd to fail if the query returns things other than elements. Why not apply the sequence normalization rules from section 2 of the serialization spec to the result?
(e) Taking the text node descendants of <c:query> to form the query seems a really bad idea. If there are elements present, as in <c:query><result>for $x in 1 to 10 return <br/></result></c:query>, then you are going to get some very hard-to-understand error messages, and sometimes you will actually construct a syntactically-correct query that's different from the one the user wrote. I think it's better to allow an XML representation of an XQuery, which can be defined as follows: take the subtree rooted at the c:query element; serialize it; then unescape any character or entity references appearing in text nodes that occur either (i) as children of c:query, or (ii) between curly braces (but not within quotes), and treat the result as an XQuery 1.0 query. This allows for example <c:query>if (x &lt; 3) then <a/> else <b/></c:query>.

22. Technical. Section 7.2.5 xslt2. I have already commented on the relationship between 1.0 and 2.0.
(a) A stylesheet creating multiple result documents will allocate each of them a URI. It's not clear how the processor can distinguish the result documents on the basis of their URIs. Nor is it clear how one would apply different serialization to different result documents. Perhaps there should be an option to cause secondary result documents to be serialized and written to the relevant disk location rather than being returned on the secondary output port.
(b) There's no explicit provision to run the stylesheet without a principal input document.
(c) It seems that only strings can be supplied as input parameters. (At the very least, these should be treated as untypedAtomic so they are implicitly converted to the required type. But really, there's a need to supply any value that can be yielded by an XPath 2.0 expression.)
(d) The option "allow-collections" seems poorly named. Setting this to false does not disallow use of collections in the stylesheet. In fact, if the stylesheet is written to use collections, one might want to set this option to false to avoid interfering with this.

23. Technical. Section 7.3. doctype-public is a public identifier (so-called), not a URI.

24. Typo. Section 7.3. "must support be supported"

Michael Kay
Saxonica Limited
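
Illustration of point 21(e): the following is a minimal Python sketch, not taken from the XProc draft or from any XProc implementation, of what "take the text node descendants of c:query" does to the example query quoted in that point. The namespace URI bound to the c: prefix is a placeholder chosen for this sketch, and the helper name text_descendants is hypothetical.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI for the c: prefix (an assumption of this sketch,
# not the URI used by any XProc draft).
C_NS = "http://example.org/ns/c"

# The c:query element from point 21(e), with the prefix declared so it parses.
doc = (
    f'<c:query xmlns:c="{C_NS}">'
    "<result>for $x in 1 to 10 return <br/></result>"
    "</c:query>"
)


def text_descendants(xml_source: str) -> str:
    """Concatenate all text-node descendants of the root element,
    discarding every element node along the way."""
    root = ET.fromstring(xml_source)
    return "".join(root.itertext())


query = text_descendants(doc)
print(repr(query))  # prints 'for $x in 1 to 10 return '
```

The <result> and <br/> constructors that the user wrote are silently dropped, so the query processor would be handed an incomplete FLWOR expression and would report a syntax error about text the user never wrote.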
Received on Saturday, 22 September 2007 21:59:00 UTC