- From: John Boyer <jboyer@PureEdge.com>
- Date: Thu, 16 Mar 2000 11:31:40 -0800
- To: "James Clark" <jjc@jclark.com>, "Martin J. Duerst" <duerst@w3.org>
- Cc: "Jonathan Marsh" <jmarsh@microsoft.com>, "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>, <w3c-xsl-wg@w3.org>
From James Clark: What's twisted and bizarre is using an extension function to get an initial handle on the source document, instead of using the mechanism that is built into XPath namely the context node. This will make XPath expressions used in your application confusingly different from XPath expressions used everywhere. <John> Where the node-set comes from did not and still does not seem confusing to me, but it seems to be such a thorn, so obviously parse will be thrown out (see below). Further, you still didn't answer the question of what I see as an oversight in the XPath specification. You must be able to use empty node-sets by your own specification, so what should the context node, size and position be for an empty node-set? </John> Because you're not using the language in the way it is designed, there are likely to be subtle traps. For example, you will lose the ability to use the id() function (because functions are not allowed as the right hand operand of /). It's also going to be very awkward. Instead of something like e1|e2|e3|e4 you will have to write parse($input,true())/e1|parse($input,true())/e2|parse($input,true())/e3|pars e($input,true())/e4 <John> Parsing 4 times in this case is simply not necessary: parse($input,true())/descendant-or-self::node()[id("e1")|id("e2")|id("e3")|i d("e4")] However, it really doesn't matter to me if the spec were modified slightly to eliminate parse() and instead pass the information in a context node-set as you recommend. But while I have the attention of an authority on the issue, perhaps you could answer a few other questions. Firstly, would it be fair to assume that any XPath implementation will have at least two points of entry: eval() and parse()? If the XPath transform must provide the input to the XPath engine as a context node-set, then presumably the node-set must be something that is understandable by the XPath engine. Hence, the XPath engine must provide the parse() in order to allow it to build a node-set data structure that it will understand when eval() is called. In fact, I would not be surprised to see entry points for construct variable bindings, construct namespace bindings, and construct initial context, since all of these are data structures that the Xpath engine's eval function will need. Secondly, could you suggest whether we should restrict the transform input to be a well-formed XML document? The transform input is defined by the WG (not by me) to be a string of data resulting either from the previous transform or from URI dereference. So, *something* must be used to convert to a node-set. Currently, I would be using your parser, and I would prefer not to have to use pieces of your parser to read something which is not a complete, well-formed XML document. Is this reasonable? Finally, since you only discussed problems with parse, I assume you have no objection to the serialize() function. This is actually the main function that we need, as expressed previously in DSig WG face-to-face meetings. The transform output MUST be a string, which may be given to a digest algorithm or as input to a subsequent transform. The result of an XPath expression may not be a string. If it is a boolean or number, we can simply call string on it implicitly, but if the result is a node-set, then we need to regenerate the actual XML markup for that node-set based on document order (and lex order only for the attribute and namespace axes). Making this into a function that returns a string, and adding that to the function library seems to be the XPath specification's recommended way of extending XPath in a particular application. Is this reasonable? Thanks, John Boyer Software Development Manager PureEdge Solutions, Inc. (formerly UWI.Com) jboyer@PureEdge.com </John>
Received on Thursday, 16 March 2000 14:29:51 UTC