
Re: UTF-8 and BOM

From: TAMURA Kent <kent@trl.ibm.co.jp>
Date: Wed, 23 Aug 2000 10:13:36 +0900
Message-Id: <200008230113.KAA10086@ns.trl.ibm.com>
To: "XML DSig" <w3c-ietf-xmldsig@w3.org>

In message "UTF-8 and BOM"
    on 00/08/22, "John Boyer" <jboyer@PureEdge.com> writes:
> I'm still unsure why one would ever need a BOM for UTF-8.  I thought the
> point of UTF-8 was to have a format that could provide lots of Unicode/UCS
> characters but not be subject to the endian disease.
> 
> Still, I'm sure there is a reason, so could someone please explain it?

UTF-8 without a BOM is byte-for-byte identical to US-ASCII for
ASCII characters.  So an application might mistake a UTF-8 text
for some other US-ASCII-compatible encoding.  The BOM in UTF-8
is expected to work as a UTF-8 signature that distinguishes the
text from other US-ASCII-compatible encodings.
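
As a rough illustration of the signature idea, here is a minimal
Python sketch.  The function names and the "latin-1" fallback are
only assumptions made for this example; they do not come from any
specification.  The sketch checks for the three-byte UTF-8 BOM
(EF BB BF) and falls back to a guessed encoding when it is absent.

```python
import codecs

# The UTF-8 encoding of U+FEFF: the bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def has_utf8_signature(data: bytes) -> bool:
    """Return True if the byte string starts with the UTF-8 BOM."""
    return data.startswith(UTF8_BOM)


def decode_text(data: bytes, fallback: str = "latin-1") -> str:
    """Decode as UTF-8 when the signature is present.

    Without the signature the encoding is ambiguous, so this sketch
    just uses a hypothetical fallback encoding chosen by the caller.
    """
    if has_utf8_signature(data):
        return data[len(UTF8_BOM):].decode("utf-8")
    return data.decode(fallback)


# ASCII-only text has the same bytes in UTF-8 and US-ASCII.
plain = "abc".encode("utf-8")
# Non-ASCII text with the signature prepended.
signed = UTF8_BOM + "é".encode("utf-8")
# The same text without the signature.
unsigned = "é".encode("utf-8")
```

With the signature, decode_text(signed) recovers the original
character.  Without it, decode_text(unsigned) silently misreads the
two UTF-8 bytes as two characters of the fallback encoding, which is
the kind of confusion the signature is meant to prevent.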

-- 
TAMURA Kent @ Tokyo Research Laboratory, IBM
Received on Tuesday, 22 August 2000 21:14:14 GMT
