RE: XHTML and charset's [was: Re: XHTML questions]

On Thu, 29 Jun 2000, Christian Smith wrote:

> On Thursday, June 29, 2000 at 16:35, igraham@ic-unix.ic.utoronto.ca (Ian Graham) wrote:
> 
> > Bertilo is correct -- things are fine if your document only
> > contains ASCII characters, as they map onto the same byte
> > sequence in UTF-8.
> > 
> > However, things go wrong if you have non-ASCII characters
> > in the document. They also fail (on Navigator 4 and earlier)
> > if you have character references in the document that
> > reference non-Latin-1 characters. For example, character
> > references like
> > 
> > &#3124;
> > 
> > (this is a made-up number, I'm afraid), which references the
> > 3124th character in Unicode, will only work if you explicitly
> > set UTF-8 using a META element.
> 
> And if you save a file as UTF-8 and include the UTF-8 byte order mark, IE
> for the Macintosh at least doesn't deal with this very well (it renders
> the byte order mark as a garbage character). I don't know how well other
> browsers handle this.
> 

I think you mean UTF-16 (the encoding that uses two-byte units). UTF-8
doesn't use / require a byte order mark, as all characters are encoded
as a sequence of one, two, or more bytes, and the encoding rules uniquely
define the ordering of those bytes (it is a plain byte stream).
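
To make the byte-ordering point concrete, here is a minimal sketch. Python
is my own choice for illustration (nothing in this thread uses it), and the
character is simply the one behind the made-up reference number 3124 quoted
above. It shows that UTF-8 yields a single fixed byte sequence, while the
UTF-16 bytes depend on which byte order is chosen.

```python
# Illustration only: the character is the one with decimal code point 3124
# (U+0C34), taken from the made-up reference quoted earlier in the thread.
ch = chr(3124)

# UTF-8: the encoding rules fix the byte order, so there is one result.
utf8 = ch.encode("utf-8")

# UTF-16: the same 16-bit unit can be serialized in either byte order.
utf16_be = ch.encode("utf-16-be")  # big-endian, no byte order mark
utf16_le = ch.encode("utf-16-le")  # little-endian, no byte order mark

print("UTF-8    :", utf8.hex())      # e0b0b4
print("UTF-16 BE:", utf16_be.hex())  # 0c34
print("UTF-16 LE:", utf16_le.hex())  # 340c

# Plain ASCII text produces identical bytes in ASCII and in UTF-8.
same_for_ascii = "XHTML".encode("ascii") == "XHTML".encode("utf-8")
print("ASCII == UTF-8 for ASCII text:", same_for_ascii)
```

The two UTF-16 results are byte-swapped versions of each other, which is
the ambiguity a byte order mark resolves; the UTF-8 result has no such
alternative.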

Ian

Received on Friday, 30 June 2000 09:29:11 UTC