Re: XHTML validator doesn't completely support Unicode from Liam Quinn on 2001-04-29 (www-validator@w3.org from April 2001)

From: Liam Quinn <liam@htmlhelp.com>
Date: Sun, 29 Apr 2001 16:54:17 -0400 (EDT)
To: Peter Sheerin <psheerin@cmp.com>
cc: <www-validator@w3.org>
Message-ID: <Pine.LNX.4.30.0104291647410.8646-100000@localhost.localdomain>

On Thu, 26 Apr 2001, Peter Sheerin wrote:

> Is it a known issue that the w3c validator doesn't properly handle
> Unicode documents? I've got a page that validates as XHTML 1.0
> Strict--until I put the Unicode byte-order mark (BOM) at the
> beginning of the file.
>
> Take a look at http://www.petesguide.com/style/index.html, and then
> follow the icon link to the validator, and watch what it reports. The
> text file is encoded in UTF-8 and uses DOS end-of-line conventions,
> but has the Unicode character U+FEFF (the BOM) as its first character.

I've fixed this in lq-nsgmls 1.3.4.5, which is now in action at
<http://www.htmlhelp.com/tools/validator/>.  Your page validates there.

The new lq-nsgmls can be downloaded from
<http://www.htmlhelp.com/tools/validator/src/lq-sp-1.3.4.5.tar.gz>.

I believe the problem only affected the BOM in UTF-8.  (In any case, I
only changed the UTF-8 code.)
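
For illustration only, here is a minimal Python sketch of the kind of
check involved.  It is not the actual lq-nsgmls change (that lives in the
SP sources), and the function name strip_utf8_bom is hypothetical.  It
relies on the fact that U+FEFF is encoded in UTF-8 as the three bytes
EF BB BF, and drops them only when they are the very first bytes of the
input:

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte-order mark, if present.

    Hypothetical helper for illustration; not taken from lq-nsgmls.
    """
    # codecs.BOM_UTF8 is b'\xef\xbb\xbf', the UTF-8 encoding of U+FEFF.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


# A UTF-8 document with DOS line endings and a BOM as its first character.
raw = codecs.BOM_UTF8 + b'<?xml version="1.0" encoding="UTF-8"?>\r\n<html/>\r\n'
text = strip_utf8_bom(raw).decode("utf-8")
```

After this step the decoded text begins with the XML declaration rather
than with U+FEFF.  A U+FEFF anywhere other than the start of the input is
left alone, since only a leading one acts as a byte-order mark.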

-- 
Liam Quinn

Received on Sunday, 29 April 2001 16:54:35 UTC