- From: Joseph M. Reagle Jr. <reagle@w3.org>
- Date: Tue, 12 Dec 2000 13:46:43 -0500
- To: "IETF/W3C XML-DSig WG" <w3c-ietf-xmldsig@w3.org>
Older message that wasn't sent to this list, but forwarding it on for issues
list completeness.
-----Original Message-----
From: Martin J. Duerst [mailto:duerst@w3.org]
Sent: Thursday, November 30, 2000 8:23 AM
To: John Boyer; Tom Gindin
Cc: w3c-archive@w3.org
Subject: RE: Character Encoding Question
At 00/11/29 10:12 -0800, John Boyer wrote:
>Hi Tom and Martin,
>
>Actually, it appears that the info on p. 20 of Unicode Standard 3.0 was
>slightly misleading. They talk of using UTF-8 as an encoding format.
>However, while I think of UTF-8 as encoding all of UCS-4, they appear to be
>only using UTF-8 to encode the portion of UCS-4 that Unicode represents,
>which is the 16 x 64k character regions that compose the BMP.
>
>So, the prior sentence was still sufficient. The following would appear to
>do the trick:
>
>"use Normalization Form C [NFC] when converting an XML document to the UCS
>character domain from an encoding other than UCS-4, UTF-8, UTF-16,
UTF-16BE,
>or UTF-16LE."
I think you are close with the above, but I think you should change it to
"use Normalization Form C [NFC] when converting an XML document to the UCS
character domain from any encoding that is not UCS-based (currently,
UCS-based
encodings include UTF-8, UTF-16, UTF-16BE, and UTF-16LE, UCS-2, and UCS-4)."
Why my change:
- There are also others in the IANA registry
(look e.g. for 'unicode' or 'iso10646').
- There are things we know apply but we don't want to mention (UTF-7).
- We don't know what other might come up (hopefully none :-).
- UCS-2 is mentioned because it's not the same as UTF-16.
Please feel free to send this to the involved lists for further
checking.
Regards, Martin.
>Note, that UCS-2 and Unicode seem to be equal, and seem to be encoded in
any
>of the above UTF-16 formats (i.e. UTF-16 *is* Unicode). This is why I did
>not mention them in the list. OK?
>
>Thanks again,
>John Boyer
>Team Leader, Software Development
>Distributed Processing and XML
>PureEdge Solutions Inc.
>Creating Binding E-Commerce
>v: 250-479-8334, ext. 143 f: 250-479-3772
>1-888-517-2675 http://www.PureEdge.com <http://www.pureedge.com/>
__
Joseph Reagle Jr.
W3C Policy Analyst mailto:reagle@w3.org
IETF/W3C XML-Signature Co-Chair http://www.w3.org/People/Reagle/
Received on Tuesday, 12 December 2000 13:46:48 UTC