Re: BOM & Unicode editors

Saba Sundaramurthy wrote:

>         UTF-8 characters may expand to any number of bytes (up to 6 for
> UCS-4), I don't think byte order is important since the sequence will be
> written out one byte at a time in the correct order.
>
>     As confirmed by Michka, the BOM is placed in UTF-8 files only as a
> 'magic cookie'.

That means 0xEF 0xBB 0xBF as the first 3 bytes of a text file marks it as a
UTF-8 file on Win2K, right?
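
For illustration, a minimal sketch of that check in Python. The function name
starts_with_utf8_bom is only a placeholder, and this assumes the BOM is being
used purely as a signature: a file without it may still be UTF-8.

```python
import codecs

def starts_with_utf8_bom(data: bytes) -> bool:
    """Return True if the data begins with the UTF-8 signature EF BB BF."""
    # codecs.BOM_UTF8 is the three-byte sequence b'\xef\xbb\xbf',
    # i.e. U+FEFF encoded as UTF-8.
    return data.startswith(codecs.BOM_UTF8)

# Hypothetical usage: read the first three bytes of a file and test them.
# with open("some_file.txt", "rb") as f:
#     print(starts_with_utf8_bom(f.read(3)))
```

The three bytes are simply what U+FEFF becomes when it is encoded as UTF-8,
so they say nothing about byte order.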

>
>
> Saba
>
> > -----Original Message-----
> > From: Robert A. Rosenberg [mailto:rarpsl@flashcom.net]
> > At 10:43 AM 05/10/2000 +0200, Chris Lilley wrote:
> > >This is all fine and well for UTF-16, but what about UTF-8 ? why does
> > >the byte order matter?
> >
> > The byte-order is still important since it controls what UTF-8 codes get
> > emitted for the same input codepoint. Just as you need to know which order
> > to save the two bytes of a UTF-16 character, you need to know what order to
> > assemble the two bytes that get created by expanding a UTF-8 sequence.
> >
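On the byte-order point, a small sketch in Python may help. It assumes
nothing beyond the standard str.encode codecs; the helper name encodings is
made up for this example. UTF-16 yields two different byte sequences
depending on endianness, while UTF-8 defines exactly one sequence per code
point, so there is nothing for the writer to reorder.

```python
def encodings(ch: str) -> dict:
    """Show how one character is serialized in UTF-8 and both UTF-16 orders."""
    return {
        "utf-8": ch.encode("utf-8").hex(" "),
        "utf-16-le": ch.encode("utf-16-le").hex(" "),
        "utf-16-be": ch.encode("utf-16-be").hex(" "),
    }

# U+20AC EURO SIGN: the UTF-16 forms are byte-swapped versions of each
# other, but the UTF-8 form is a single fixed byte sequence.
for name, hex_bytes in encodings("\u20ac").items():
    print(f"{name:10} {hex_bytes}")
```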

Received on Saturday, 13 May 2000 16:36:48 UTC