- From: Martin J. Duerst <duerst@w3.org>
- Date: Fri, 01 Dec 2000 02:29:58 +0900
- To: "John Boyer" <jboyer@PureEdge.com>, "Paul Hoffman / IMC" <phoffman@imc.org>, <w3c-ietf-xmldsig@w3.org>
Hello John,

There is no problem with UCS-2 and UCS-4. The UCS is a set (in the math sense) of characters, each with an associated number. There is only one UCS. Just saying 'UCS' makes no assumptions whatsoever about representation (UCS-2 and UCS-4 are both 'charset' labels), and no assumptions about subsetting (UCS-2 can be used, in the right context, to denote a certain subset of the UCS). So I don't see any problem.

Regards, Martin.

At 00/11/30 08:45 -0800, John Boyer wrote:
>Maybe it would be best for you to wait for Martin to clear this up.
>
>--Paul Hoffman, Director
>--Internet Mail Consortium
>
><john>
>Yes, after all, it is Martin's sentence in the first place, so I would be
>uncomfortable with anything that didn't have his buyoff.
>
>As for the info you provided, thanks, it was very helpful. Actually, it is
>the case that we only needed to know the answer you provided because, while
>I don't know a lot about encodings, I do think our question was really
>simple and I have most of the resources available. The only thing that
>remained was: Do we or do we not include UCS-2 in the list of settings for
>the XML declaration's encoding attribute (plus the defaulting mechanism for
>that attribute) under which we decide not to NFC when converting to the UCS
>domain, as required by the XPath data model.
>
>Unfortunately, your answer on the representation power of UCS-2 vs. UCS-4
>points out another problem we didn't know about before. The fact that UCS-2
>can only encode the BMP means that there is an ambiguity in the XPath data
>model when it says that everything is represented in the UCS character
>domain. Which one? This use of UCS without specifying which one led me to
>believe there was no difference. Clearly, your information indicates
>otherwise. Moreover, the closest I come to a 'hint' at which one in the
>XPath spec is the bibliographic citation of ISO 10646, which specifically
>mentions "Part 1: Architecture and Basic Multilingual Plane". This would
>seem to imply a focus on the BMP. In contradiction though, XML should be
>expressible in UTF-8, which can represent all of UCS-4.
>
>The problem is that we now have another conformance criterion for
>canonicalization and hence for signatures.
></john>
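
For readers following the UCS-2 vs. UCS-4 point in this thread, here is a minimal sketch (not part of the original messages) of the representational difference. The helper names `fits_in_ucs2` and `encoded_sizes` are hypothetical, and Python's `utf-16-be` and `utf-32-be` codecs are used only as stand-ins for 16-bit and 32-bit encoding forms, since Python ships no dedicated UCS-2 codec. The check is simplified and ignores the surrogate range.

```python
def fits_in_ucs2(ch: str) -> bool:
    """Return True if the single character lies in the BMP (code point <= U+FFFF).

    Simplification: surrogate code points are not treated specially here.
    """
    return ord(ch) <= 0xFFFF


def encoded_sizes(ch: str) -> dict:
    """Return the byte length of one character in three encoding forms."""
    return {
        "utf-8": len(ch.encode("utf-8")),
        "utf-16-be": len(ch.encode("utf-16-be")),  # 4 bytes means a surrogate pair
        "utf-32-be": len(ch.encode("utf-32-be")),  # always 4 bytes per character
    }


bmp_char = "\u00e9"          # U+00E9, inside the BMP
astral_char = "\U00010348"   # U+10348, outside the BMP

for ch in (bmp_char, astral_char):
    print(f"U+{ord(ch):04X}", fits_in_ucs2(ch), encoded_sizes(ch))
```

Both characters belong to the same UCS; what differs is only whether a single 16-bit unit can carry them, which is the distinction between the character set and its encoding forms drawn in the reply above.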
Received on Thursday, 30 November 2000 13:17:44 UTC