Re: UTF-8 and BOM from TAMURA Kent on 2000-08-23 (w3c-ietf-xmldsig@w3.org from July to September 2000)

From: TAMURA Kent <kent@trl.ibm.co.jp>
Date: Wed, 23 Aug 2000 10:13:36 +0900
To: "XML DSig" <w3c-ietf-xmldsig@w3.org>
Message-Id: <200008230113.KAA10086@ns.trl.ibm.com>

In message "UTF-8 and BOM"
    on 00/08/22, "John Boyer" <jboyer@PureEdge.com> writes:
> I'm still unsure why one would ever need a BOM for UTF-8.  I thought the
> point of UTF-8 was to have a format that could provide lots of Unicode/UCS
> characters but not be subject to the endian disease.
> 
> Still, I'm sure there is a reason, so could someone please explain it?

UTF-8 without BOM is compatible with US-ASCII for ASCII
characters.  So, an application might recognize that the
encoding of a UTF-8 text is another US-ASCII compatible
encoding.  The BOM in UTF-8 is expected to work as the UTF-8
signature to distinguish from US-ASCII compatible encodings

-- 
TAMURA Kent @ Tokyo Research Laboratory, IBM

Received on Tuesday, 22 August 2000 21:14:14 UTC