RE: XHTML and charset's [was: Re: XHTML questions]

From: Ian Graham <igraham@smaug.java.utoronto.ca>
Date: Fri, 30 Jun 2000 09:52:51 -0400
To: Jim Correia <correia@barebones.com>
cc: www-html@w3.org, Ian Graham <ian.graham@utoronto.ca>
Message-ID: <Pine.SGI.4.05.10006300949170.120695-100000@smaug.java.utoronto.ca>

On Fri, 30 Jun 2000, Jim Correia wrote:

> On 9:29 AM 6/30/00 Ian Graham <igraham@smaug.java.utoronto.ca> wrote:
> 
> > I think you mean UTF-16 (the two-byte encoding). UTF-8 doesn't use /
> > require a byte order mark, as all characters are encoded as a
> > stream of one, two, or more bytes, and the encoding rules uniquely 
> > define the ordering of the bytes (a byte stream). 
> 
> Had he meant UTF-16, he probably would have said so.
> 
> You cannot byte swap a UTF-8 file due to the nature of the encoding, but
> it is still desirable at times to include the UTF-8 BOM, EF BB BF, to
> indicate that the following stream of characters is indeed encoded as
> UTF-8 and not something else.
> 
> <http://www.unicode.org/unicode/faq/#BOM>

I did not know that, but it makes sense. Thanks for the correction.
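
For the archive, the point can be illustrated with a rough sketch in
Python. The helper names has_utf8_bom and strip_utf8_bom are my own,
made up for illustration; only the codecs constant and the codec names
come from the standard library.

```python
import codecs

# The UTF-8 signature discussed above: the three bytes EF BB BF.
UTF8_BOM = codecs.BOM_UTF8


def has_utf8_bom(data: bytes) -> bool:
    """Return True if the byte stream starts with EF BB BF."""
    return data.startswith(UTF8_BOM)


def strip_utf8_bom(data: bytes) -> bytes:
    """Drop a leading UTF-8 signature if one is present."""
    return data[len(UTF8_BOM):] if has_utf8_bom(data) else data


text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

# UTF-8 has a single byte sequence per character, so there is nothing to swap.
utf8 = text.encode("utf-8")

# UTF-16 has two possible byte orders, which is what its BOM disambiguates.
utf16_be = text.encode("utf-16-be")
utf16_le = text.encode("utf-16-le")

# The "utf-8-sig" codec writes the signature in front of the encoded text.
signed = text.encode("utf-8-sig")
```

Here signed begins with the signature and strip_utf8_bom(signed) gives
back the same bytes as utf8, while utf16_be and utf16_le hold the same
two bytes in opposite order. So in UTF-8 the signature only identifies
the encoding; it says nothing about byte order.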

Ian
Received on Friday, 30 June 2000 09:52:53 GMT