
RE: BOM & Unicode editors

From: Saba Sundaramurthy <ssundaramurthy@verisign.com>
Date: Thu, 11 May 2000 10:09:52 -0700
Message-ID: <C713C1768C55D3119D200090277AEECA0117DD6A@postal.verisign.com>
To: "'Robert A. Rosenberg'" <rarpsl@flashcom.net>
Cc: mozilla-i18n@mozilla.org, www-international@w3.org, i18n-prog@acoin.com
A UTF-8 character may expand to several bytes (up to 6 for UCS-4), but I
don't think byte order is important, since the encoding defines the sequence
as a stream of single bytes that is always written out in the same order.
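
To illustrate the point, here is a minimal Python sketch. The helper name
utf8_encode is hypothetical, and it only covers the 1 to 4 byte forms needed
for code points up to U+10FFFF (not the longer forms mentioned above). It
shows that the bit layout fixes the order of the UTF-8 bytes, whereas UTF-16
has two possible serializations of the same 16-bit unit.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one code point as UTF-8 (hypothetical helper, 1 to 4 bytes).

    Surrogates and out-of-range values are not validated in this sketch.
    """
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


euro = 0x20AC  # EURO SIGN
print(utf8_encode(euro).hex(" "))              # UTF-8: only one byte sequence exists
print(chr(euro).encode("utf-16-be").hex(" "))  # UTF-16, big-endian
print(chr(euro).encode("utf-16-le").hex(" "))  # UTF-16, little-endian
```

For U+20AC the UTF-8 result is always E2 82 AC, while UTF-16 gives 20 AC or
AC 20 depending on the byte order chosen.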

As confirmed by Michka, the BOM is placed in UTF-8 files only as a
'magic cookie' that identifies the encoding, not as a byte-order indicator.
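
As a rough sketch of how such a 'magic cookie' could be used, the Python
snippet below checks the leading bytes of a buffer against the BOM constants
in the standard codecs module. The function name sniff_bom is hypothetical,
and only UTF-8 and UTF-16 are handled here, on the assumption that those are
the encodings of interest in this thread.

```python
import codecs
from typing import Optional


def sniff_bom(data: bytes) -> Optional[str]:
    """Guess an encoding from a leading BOM (hypothetical helper).

    Returns None when no BOM is present. UTF-32 is not handled in this
    sketch, so a UTF-32 LE BOM would be reported as UTF-16 LE.
    """
    if data.startswith(codecs.BOM_UTF8):      # EF BB BF
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_BE):  # FE FF
        return "utf-16-be"
    if data.startswith(codecs.BOM_UTF16_LE):  # FF FE
        return "utf-16-le"
    return None


# The UTF-8 "BOM" is just U+FEFF encoded as UTF-8; it carries no
# byte-order information, it only marks the data as UTF-8.
sample = codecs.BOM_UTF8 + "hello".encode("utf-8")
print(sniff_bom(sample))
print(sample.decode("utf-8-sig"))  # the utf-8-sig codec drops the signature
```

In UTF-16 the same check tells a reader which byte order to use; in UTF-8 it
only acts as a signature.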

Saba


> -----Original Message-----
> From: Robert A. Rosenberg [mailto:rarpsl@flashcom.net]
> At 10:43 AM 05/10/2000 +0200, Chris Lilley wrote:
> >This is all fine and well for UTF-16, but what about UTF-8 ? why does the
> >byte order matter?
>
> The byte-order is still important since it controls what UTF-8 codes get
> emitted for the same input codepoint. Just as you need to know which order
> to save the two bytes of a UTF-16 character, you need to know what order to
> assemble the two bytes that get created by expanding a UTF-8 sequence.
> 
Received on Thursday, 11 May 2000 13:10:43 GMT
