
Re: UTF-8 and BOM

From: Martin J. Duerst <duerst@w3.org>
Date: Wed, 23 Aug 2000 11:54:07 +0900
Message-Id: <4.2.0.58.J.20000823115209.032804c0@sh.w3.mag.keio.ac.jp>
To: tgindin@us.ibm.com, "Joseph M. Reagle Jr." <reagle@w3.org>
Cc: "John Boyer" <jboyer@PureEdge.com>, "XML DSig" <w3c-ietf-xmldsig@w3.org>
At 00/08/22 17:41 -0400, tgindin@us.ibm.com wrote:
>      Why do we warn people about BOM but not about surrogates, anyway?  One
>is no more appropriate than the other in canonicalized UTF-8.

The difference is that surrogate pairs encoded in UTF-8 are explicitly
disallowed by the relevant specs (ISO 10646, Unicode, RFC 2279), whereas
the BOM issue is not mentioned in RFC 2279 and is, as far as I remember,
explicitly allowed in ISO 10646 and Unicode.
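
To illustrate the distinction, here is a minimal Python sketch. It relies only on the behaviour of Python's built-in strict "utf-8" codec, which is an assumption about one implementation and not something the specs above or any canonicalization draft prescribe. The helper names (is_valid_utf8, strip_leading_bom) are hypothetical and exist only for this example.

```python
# Sketch: how a strict UTF-8 decoder treats a leading BOM versus
# UTF-8-encoded surrogates. Helper names are illustrative only.

BOM_BYTES = b"\xef\xbb\xbf"        # U+FEFF encoded as UTF-8
HIGH_SURROGATE = b"\xed\xa0\x80"   # U+D800 in three-byte form
LOW_SURROGATE = b"\xed\xb0\x80"    # U+DC00 in three-byte form


def is_valid_utf8(data: bytes) -> bool:
    """Return True if Python's strict UTF-8 decoder accepts the bytes."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


def strip_leading_bom(data: bytes) -> bytes:
    """Drop one leading UTF-8 BOM, if present; leave other bytes untouched."""
    if data.startswith(BOM_BYTES):
        return data[len(BOM_BYTES):]
    return data


if __name__ == "__main__":
    doc = BOM_BYTES + b"<a/>"
    # A BOM is well-formed UTF-8, so a validator alone will not flag it.
    print(is_valid_utf8(doc))
    # An encoded surrogate pair is rejected by the decoder itself.
    print(is_valid_utf8(HIGH_SURROGATE + LOW_SURROGATE))
    # Removing the BOM therefore has to be an explicit, separate step.
    print(strip_leading_bom(doc))
```

Under these assumptions, a conforming decoder already refuses surrogates, while a BOM passes through as an ordinary U+FEFF character unless something removes it deliberately, which is why it may deserve its own warning.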


Regards,  Martin.
Received on Tuesday, 22 August 2000 22:52:32 GMT
