Re: XHTML validator doesn't completely support Unicode

From: Liam Quinn <liam@htmlhelp.com>
Date: Sun, 29 Apr 2001 16:54:17 -0400 (EDT)
To: Peter Sheerin <psheerin@cmp.com>
cc: <www-validator@w3.org>
Message-ID: <Pine.LNX.4.30.0104291647410.8646-100000@localhost.localdomain>

On Thu, 26 Apr 2001, Peter Sheerin wrote:

> Is it a known issue that the w3c validator doesn't properly handle
> Unicode documents? I've got a page that validates to XHTML 1.0
> Strict--until I put the Unicode byte-order mark character string at the
> beginning of the file.
> Take a look at http://www.petesguide.com/style/index.html, and then
> follow the icon link to the validator, and watch what it reports. The
> text file is encoded in UTF-8, and uses the DOS end of line conventions,
> but has the Unicode string "U+FEFF" as the first character.

I've fixed this in lq-nsgmls, which is now in action at
<http://www.htmlhelp.com/tools/validator/>.  Your page validates there.

The new lq-nsgmls can be downloaded from

I believe there was only a problem with the BOM in UTF-8.  (In any case, I
only changed the UTF-8 code.)
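
For anyone handling the same case in their own tools, here is a minimal sketch of the general idea: drop a leading UTF-8 byte-order mark (the bytes EF BB BF, which encode U+FEFF) before the document reaches the parser. This is not the actual lq-nsgmls change; the function names strip_utf8_bom and decode_utf8_document are hypothetical and used only for illustration.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte-order mark, if one is present."""
    # codecs.BOM_UTF8 is the three-byte sequence b"\xef\xbb\xbf" (U+FEFF in UTF-8).
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def decode_utf8_document(data: bytes) -> str:
    """Decode a UTF-8 document, ignoring an optional leading BOM."""
    return strip_utf8_bom(data).decode("utf-8")
```

Python's built-in "utf-8-sig" codec gives the same result when decoding, so data.decode("utf-8-sig") is a shorter alternative when only the decoded text is needed.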

Liam Quinn
Received on Sunday, 29 April 2001 16:54:35 UTC
