- From: Paul Hoffman / IMC <phoffman@imc.org>
- Date: Wed, 29 Nov 2000 15:15:21 -0800
- To: w3c-ietf-xmldsig@w3.org
At 12:18 PM -0800 11/29/00, John Boyer wrote:
>Still your question is valid because UCS-4 contains code points
>outside of the BMP, and UTF-8 is capable of encoding them, while
>Unicode/UCS-2/UTF-16x is not.

That is an incorrect statement. UTF-16 is able to encode things outside the BMP just fine; the method for doing so is specified in the Unicode Standard and in RFC 2781.

> While nothing currently exists out there,

This is also not true: there are private use areas allocated in planes 15 and 16.

> I think ISO/IEC 10646-2 is supposed to change that fact, so it
>would be helpful for us to change our sentence about the conditions
>under which we expect the application of Normalization Form C to
>occur.

This all started with a statement: "REQUIRED to use Normalization Form C [NFC] when converting an XML document to the UCS character domain from a non-Unicode encoding". This was a bit of shorthand on the part of whoever wrote it. Simply change "a non-Unicode encoding" to "any non-UCS encoding" or "any local encoding".

>In conclusion, it would be helpful to know whether anyone thinks
>UTF-7
>(http://www.ietf.org/rfc/rfc2152.txt)
>should be included since it does claim to be a format for encoding
>Unicode characters.

Oh God no. UTF-7 was a mistake and has, thankfully, never been widely adopted. The only real use of UTF-7 is in IMAP and everyone there deeply regrets it. Pretend that you never heard of UTF-7.

--Paul Hoffman, Director
--Internet Mail Consortium
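
For readers who want to see how UTF-16 reaches code points outside the BMP, here is a minimal Python sketch of the surrogate-pair arithmetic described in RFC 2781. The function name utf16_code_units is hypothetical and chosen only for this illustration; the sample code points (the first supplementary code point, and private-use code points in planes 15 and 16) are likewise just examples.

```python
def utf16_code_units(cp: int) -> list[int]:
    """Return the UTF-16 code units for one Unicode scalar value."""
    if not 0 <= cp <= 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp < 0x10000:
        return [cp]  # BMP: a single 16-bit code unit
    v = cp - 0x10000  # leaves a 20-bit value
    # High surrogate carries the top 10 bits, low surrogate the bottom 10.
    return [0xD800 | (v >> 10), 0xDC00 | (v & 0x3FF)]


for cp in (0x10000, 0xF0000, 0x10FFFD):
    units = utf16_code_units(cp)
    # Cross-check against Python's built-in UTF-16 codec.
    encoded = b"".join(u.to_bytes(2, "big") for u in units)
    assert encoded == chr(cp).encode("utf-16-be")
    print(f"U+{cp:06X} ->", " ".join(f"{u:04X}" for u in units))
```

Each supplementary code point comes out as two 16-bit code units, which is why UTF-16 can represent planes 1 through 16 even though a single code unit cannot.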
Received on Wednesday, 29 November 2000 18:15:28 UTC