- From: <tgindin@us.ibm.com>
- Date: Fri, 10 Mar 2000 11:02:16 -0500
- To: TAMURA Kent <kent@trl.ibm.co.jp>
- cc: "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
One serious question here is whether the sort being used on UTF-16 has the same order as UTF-8 and UCS-4. I am pretty sure that UTF-8 and UCS-4 have the same sort order if one uses unsigned bytes in UTF-8. If all three have the same sort order, which could then be described as "Unicode sort order", the implementation dependency would probably cause relatively little trouble. If not, this might be a serious problem.

The most likely cause of a violation of "Unicode sort order" would be the placement of characters from planes 1-16 before characters whose UCS value is between 0xE000 and 0xFFFF. This happens because UTF-16 encodes planes 1-16 as surrogate pairs, whose 16-bit values (0xD800-0xDFFF) compare lower than 0xE000. A sort of UTF-16 in "Unicode sort order" would have to ignore the BOM and treat values 0xD800-0xDFFF as if their real values were above 0xFFFF.

          Tom Gindin

TAMURA Kent <kent@trl.ibm.co.jp>@w3.org on 03/10/2000 02:31:13 AM

Sent by:  w3c-ietf-xmldsig-request@w3.org

To:   "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
cc:
Subject:  Re: Comments on last call draft

> As for exact order in the XPath transform, are you saying that if we
> eliminated exact order and used only lex order, then we could accomplish the
> XPath transform with some set of existing tools?

Yes, when we use a DOM Level 2 implementation.

We cannot sort attributes before XPath processing because DOM has no reordering function. So we have to sort after XPath processing.

    <root>
      <child b="b-value"/>
      <child a="a-value"/>
    </root>

    "//child/@*"

The result of applying the above XPath expression to the above document is a node-set consisting of two attribute nodes (Attr in DOM). With DOM Level 1 we cannot decide whether we may sort the node-set, because Attr has no method returning its parent element and so we cannot know whether these attribute nodes are in the same element. Attr in DOM Level 2 has the getOwnerElement() method.
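As a rough illustration of the point above, here is a minimal sketch in Python using the standard xml.dom.minidom module, whose Attr nodes expose the DOM Level 2 ownerElement attribute. The helper names attrs_of_children and sort_within_owner are made up for this example, and the node-set for "//child/@*" is collected by hand because minidom has no XPath evaluator. It is not the transform defined in the draft, only a demonstration that knowing the owner element is what makes a safe sort possible.

```python
from xml.dom import minidom

DOC = '<root><child b="b-value"/><child a="a-value"/></root>'


def attrs_of_children(doc):
    """Hand-rolled stand-in for //child/@*: all attributes of all <child>
    elements, in document order (hypothetical helper, no real XPath)."""
    result = []
    for elem in doc.getElementsByTagName("child"):
        attrs = elem.attributes
        for i in range(attrs.length):
            result.append(attrs.item(i))
    return result


def sort_within_owner(node_set):
    """Sort attributes by name, but only among attributes that share the
    same owner element; the order of the elements themselves is kept."""
    groups = []  # list of (owner element, [attrs]) in first-seen order
    for attr in node_set:
        owner = attr.ownerElement  # DOM Level 2: the element holding attr
        if groups and groups[-1][0] is owner:
            groups[-1][1].append(attr)
        else:
            groups.append((owner, [attr]))
    ordered = []
    for _, attrs in groups:
        ordered.extend(sorted(attrs, key=lambda a: a.name))
    return ordered


if __name__ == "__main__":
    node_set = attrs_of_children(minidom.parseString(DOC))
    for attr in sort_within_owner(node_set):
        print(attr.ownerElement.tagName, attr.name, attr.value)
```

In the example document the two attributes belong to different elements, so their relative order is left alone; without ownerElement there would be no way to tell this case apart from two attributes on a single element.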
> As for exprEncoding and exprBOM, you should note that you will be receiving
> an XML document containing a signature element which contains the XPath
> expression, the encoding of that document determines the encoding of the
> XPath expression.  I created exprEncoding and exprBOM as a way of making it
> clear that the application must provide these pieces of information from the
> document to the XPath transform precisely so that the expression could be
> reencoded to a format that is suitable for your XPath expression evaluator

We do not need this information. When an XML processor parses a signature document, all characters, including those of any XPath expression, are represented in an internal encoding, which is UTF-16 in DOM. The XPath evaluators in Java, Xalan/LotusXSL and XT, receive expressions in UTF-16, and they implement XPath `string' instances as UTF-16 sequences.

> As for re-using existing XML processors and XPath libraries, I'm quite
> certain you are incorrect about not being able to use an existing XML
> processor (Clark's parser, for example, can easily be used to create the
> parse tree I've specified).

I do not know of any XML processors that provide the BOM information of a parsed document. Some XML processors provide the encoding information of the document entity, each through its own proprietary method, but it is very difficult or impossible with existing XML processors to detect the encoding of a specific node, because XML documents may contain external entities.

--
TAMURA Kent @ Tokyo Research Laboratory, IBM
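Since the XPath strings discussed above are UTF-16 sequences, the sort-order question raised at the top of this message can be checked on a few sample characters. The following is a minimal sketch in Python; the function names utf16_units and utf16_fixup_key are invented for this example, and the remapping shown is one common way to make 16-bit code units compare in code point order, not something specified by the draft.

```python
def utf16_units(s):
    """UTF-16 code units of s (big-endian, no BOM) as a tuple of ints."""
    raw = s.encode("utf-16-be")
    return tuple(int.from_bytes(raw[i:i + 2], "big")
                 for i in range(0, len(raw), 2))


def utf16_fixup_key(s):
    """Sort key over UTF-16 code units that reproduces code point order:
    surrogates are moved above 0xE000-0xFFFF, which shift down to make room."""
    def fix(unit):
        if 0xD800 <= unit <= 0xDFFF:
            return unit + 0x2000   # surrogates   -> 0xF800..0xFFFF
        if unit >= 0xE000:
            return unit - 0x0800   # 0xE000-0xFFFF -> 0xD800..0xF7FF
        return unit
    return tuple(fix(u) for u in utf16_units(s))


# One ASCII character, two BMP characters above the surrogate range,
# and one character from plane 1 (a surrogate pair in UTF-16).
samples = ["\uFFFF", "\U00010000", "A", "\uE000"]

by_code_point = sorted(samples, key=lambda s: [ord(c) for c in s])
by_utf8_bytes = sorted(samples, key=lambda s: s.encode("utf-8"))
by_utf16_units = sorted(samples, key=utf16_units)
by_utf16_fixed = sorted(samples, key=utf16_fixup_key)

if __name__ == "__main__":
    for name, order in [("code point (UCS-4)", by_code_point),
                        ("UTF-8 unsigned bytes", by_utf8_bytes),
                        ("UTF-16 code units", by_utf16_units),
                        ("UTF-16 with fix-up", by_utf16_fixed)]:
        print(name, [[hex(ord(c)) for c in s] for s in order])
```

On these samples the UTF-8 byte order and the code point order agree, while the plain UTF-16 code unit order places U+10000 ahead of U+E000 and U+FFFF; the remapped key brings the UTF-16 order back in line with the other two.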
Received on Friday, 10 March 2000 11:02:41 UTC