Re: XHTML validator doesn't completely support Unicode

From: Liam Quinn <liam@htmlhelp.com>
Date: Sun, 29 Apr 2001 21:33:59 -0400 (EDT)
To: Bertilo Wennergren <bertilow@bertilo.se.fm>
cc: <psheerin@cmp.com>, <www-validator@w3.org>
Message-ID: <Pine.LNX.4.30.0104292129360.8646-100000@localhost.localdomain>

On Sun, 29 Apr 2001, Bertilo Wennergren wrote:

> Peter Sheerin:
>
> > Take a look at http://www.petesguide.com/style/index.html, and then
> > follow the icon link to the validator, and watch what it reports. The
> > text file is encoded in UTF-8, and uses the DOS end of line
> > conventions, but has the Unicode string "U+FEFF" as the first character.
>
> Are you sure it's the end of line characters that give the problem?
>
> I'd guess it's the BOM ("U+FEFF") that's the culprit. It's not very
> common to use a BOM in UTF-8 files. Some even say it's not allowed
> in UTF-8.

According to <http://www.unicode.org/unicode/faq/utf_bom.html#25>, the BOM
is allowed in UTF-8.  It is strange, though, that the UTF-8 RFC (RFC 2279)
makes no mention of it.
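
For illustration only, here is a minimal Python sketch of how a tool could
tolerate the signature.  It is not what the validator actually does; the
function name strip_utf8_bom and the sample bytes are made up for this
example.  U+FEFF encodes in UTF-8 as the three bytes EF BB BF, which the
standard codecs module exposes as BOM_UTF8.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM (EF BB BF), if one is present.

    Hypothetical helper for illustration; not taken from the validator.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# Made-up sample: a BOM followed by a line with a DOS (CRLF) line ending.
sample = codecs.BOM_UTF8 + b'<?xml version="1.0"?>\r\n'

# Only the three BOM bytes are removed; the CRLF is left untouched.
print(strip_utf8_bom(sample))
```

Python's "utf-8-sig" codec does the same thing at decode time, so
sample.decode("utf-8-sig") also drops the leading U+FEFF.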

-- 
Liam Quinn
Received on Sunday, 29 April 2001 21:34:04 GMT