
Re: utf-8 validation error

From: David Thielen <dave@windward.net>
Date: Mon, 7 Apr 2003 20:20:56 -0600
Message-ID: <023601c2fd75$7cd24070$69f72dc7@BAMBI>
To: "Lloyd Wood" <L.Wood@eim.surrey.ac.uk>, "Brant Langer Gurganus" <brantgurganus2001@cherokeescouting.org>
Cc: <www-validator@w3.org>

I thought UTF-8 worked as follows:

Table 2   UTF-8 Bit Encoding of a Unicode Code Point

   Character Range        Bit Encoding
   U+0000  - U+007F       0xxxxxxx
   U+0080  - U+07FF       110xxxxx 10xxxxxx
   U+0800  - U+FFFF       1110xxxx 10xxxxxx 10xxxxxx
   U+10000 - U+10FFFF     11110xxx 10xxxxxx 10xxxxxx 10xxxxxx

(from:
http://www.sun.com/developers/gadc/technicalpublications/articles/utf8.html)

Which would require byte values > 127 for every character above U+007F.
Isn't this how we are supposed to display Unicode in a web page? This does
work in IE - I can mix Russian and Chinese.
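
To make the table concrete, here is a minimal sketch in Python of how I read
it. The function name utf8_encode is my own (hypothetical), and rejecting the
surrogate range U+D800 - U+DFFF is an assumption on my part, since the table
does not mention it. It is only an illustration of the bit layout, not a
replacement for a real encoder.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point using the bit layout from Table 2.

    Hypothetical helper for illustration only.
    """
    # Assumption: surrogates and out-of-range values are rejected,
    # which the table itself does not state.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not an encodable code point: {cp!r}")

    if cp <= 0x7F:
        # 0xxxxxxx
        return bytes([cp])
    if cp <= 0x7FF:
        # 110xxxxx 10xxxxxx
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp <= 0xFFFF:
        # 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


if __name__ == "__main__":
    # One Latin, one Russian and one Chinese character.
    for ch in ("A", "Я", "中"):
        encoded = utf8_encode(ord(ch))
        print(f"U+{ord(ch):04X}", " ".join(f"{b:02X}" for b in encoded))
```

With this, "A" stays a single byte (41), while the Russian and Chinese
characters become D0 AF and E4 B8 AD, so every byte of a non-ASCII character
has the high bit set.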

thanks - dave


----- Original Message -----
From: "Lloyd Wood" <l.wood@eim.surrey.ac.uk>
To: "Brant Langer Gurganus" <brantgurganus2001@cherokeescouting.org>
Cc: "David Thielen" <dave@windward.net>; <www-validator@w3.org>
Sent: Monday, April 07, 2003 6:35 PM
Subject: Re: utf-8 validation error


> On Mon, 7 Apr 2003, Brant Langer Gurganus wrote:
>
> > David Thielen wrote:
> >
> > > The validator gives me an error on characters > 127
> >
> > Characters between 128 and 256 are PC-specific characters.
>
> unless you're using a Mac, and its equally weird high-bit charset...
> either way, interpreted differently across platforms.
>
> L.
>
> <http://www.ee.surrey.ac.uk/Personal/L.Wood/><L.Wood@ee.surrey.ac.uk>
>
Received on Monday, 7 April 2003 22:21:08 GMT
