- From: John Boyer <jboyer@PureEdge.com>
- Date: Thu, 16 Mar 2000 10:26:00 -0800
- To: "Martin J. Duerst" <duerst@w3.org>
- Cc: "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>, <w3c-xsl-wg@w3.org>
-----Original Message-----
From: Martin J. Duerst [mailto:duerst@w3.org]
Sent: Wednesday, March 15, 2000 7:22 PM
To: 'John Boyer'
Cc: IETF/W3C XML-DSig WG; w3c-xsl-wg@w3.org
Subject: RE: XSL WG comments on XML Signatures

I'm glad to see that the XSL experts are helping the XML signature group to improve their use of XPath.

<John>Yes, their luminance leaves us in awe. How can we ever achieve such greatness?</John>

I'm adding a few comments below so that when XPath filtering is rewritten, these issues can be cleaned up at the same time.

<John>
XPath filtering will not be substantially rewritten. Based on Clark's feedback, we can remove the parse function and instead simply assert that the transform input is parsed and provided to XPath as a node set. The notions of lex and exact order will be removed (since we cannot directly specify the parse). The serialize function will stay since we must now modify it to say that it will lex order the attributes as part of its natural behavior. I'm sure we can also say that it gets called automatically if the expression results in a node-set.
</John>

The i18n WG/IG was also rather lost on seeing $exprEncoding and $exprBOM. The XPath engine of course has to know in which encoding the XPath expression was in. But that's the XPath engine, not the XPath expression. It would be extremely strange to include some switch in an XPath expression saying something like

    If the XPath expression came in as Shift_JIS, then do A, else do B.

<John>
The problem that this lowly piece of dark matter saw was simply that the transform input could be in a different encoding than the document containing the signature (and hence the XPath expression). Suppose a signature in a UTF-16 document contains a URI to an XML document that is encoded in UTF-8. The result of the URI dereference is a UTF-8 document, whose tag names, attributes, etc. are incomparable to the conditions set forth in the XPath expression.

Unless you convert the XPath expression to the same encoding as the XML document, or convert the XML document to the same encoding as the expression, then you will not be able to evaluate the expression. At a minimum, a particular implementation could throw an exception if it couldn't handle the conversion.

The alternative to this solution is to standardize on a particular encoding, which would imply that every XML document would have to be converted before running XPath, which is not a very efficient solution.
</John>

As for the BOM, the same arguments apply. In addition, the XPath expression is element (or attribute?) content, so there is never a BOM in front of it. To distinguish between BE and LE, just use UTF-16-BE and UTF-16-LE if needed.

<John>
The XML document we receive as input to the transform MUST have a BOM in front of it if it is UTF-16, according to the XML spec. So, obviously exprBOM doesn't refer to that. You are exactly right that there is no BOM in front of the XPath expression string, which is exactly why the context needed to be told what the BOM should be (unless you weren't planning to solve the encoding problem).

The problem is that if one application reads a UTF-8 document and leaves it in UTF-8, then the output will be UTF-8, which implies one digest value. If another tool reads the UTF-8 then converts to UTF-16 because of some limitation on their XPath expression engine, then the output will be UTF-16 (unless they take the special effort of converting back to UTF-8 (???) to overcome the limitation of their toolset). So, a signature created by the first product would not verify in the second product.

John Boyer
Software Development Manager
PureEdge Solutions, Inc. (formerly UWI.Com)
jboyer@PureEdge.com
</John>

Regards, Martin.

#-#-# Martin J. Du"rst, I18N Activity Lead, World Wide Web Consortium
#-#-# mailto:duerst@w3.org http://www.w3.org/People/D%C3%BCrst
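
The digest concern raised in the message above can be illustrated with a minimal sketch. It is not taken from the thread or from any XML-DSig implementation: the sample document string, the `digest_of` helper, and the choice of SHA-1 as the example digest are all assumptions made for illustration. It only shows that the same characters serialized as UTF-8 and as UTF-16 hash to different values, so two tools that emit different encodings would disagree on the digest.

```python
import hashlib


def digest_of(text: str, encoding: str) -> str:
    """Hypothetical helper: hex SHA-1 digest of text serialized in the given encoding."""
    return hashlib.sha1(text.encode(encoding)).hexdigest()


# Illustrative transform output; any XML fragment would do.
doc = '<doc attr="value">text</doc>'

utf8_digest = digest_of(doc, "utf-8")
utf16_digest = digest_of(doc, "utf-16")  # Python's "utf-16" codec prepends a BOM

# Both byte strings decode back to the same characters...
same_characters = (
    doc.encode("utf-8").decode("utf-8") == doc.encode("utf-16").decode("utf-16")
)
# ...but the digests are computed over bytes, so they do not match.
digests_match = utf8_digest == utf16_digest

print(same_characters, digests_match)  # prints: True False
```

Under these assumptions, a verifier that re-encodes the transform output before digesting would reject a signature produced over the original encoding, which is the interoperability failure described above.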
Received on Thursday, 16 March 2000 13:23:56 UTC