- From: Liam Quinn <liam@htmlhelp.com>
- Date: Sun, 29 Apr 2001 16:54:17 -0400 (EDT)
- To: Peter Sheerin <psheerin@cmp.com>
- cc: <www-validator@w3.org>
On Thu, 26 Apr 2001, Peter Sheerin wrote:

> Is it a known issue that the w3c validator doesn't properly handle
> Unicode documents? I've got a page that validates to XHTML 1.0
> Strict--until I put the Unicode byte-order mark character string at the
> beginning of the file.
>
> Take a look at http://www.petesguide.com/style/index.html, and then
> follow the icon link to the validator, and watch what it reports. The
> text file is encoded in UTF-8, and uses the DOS end of line conventions,
> but has the Unicode string "U+FEFF" as the first character.

I've fixed this in lq-nsgmls 1.3.4.5, which is now in action at
<http://www.htmlhelp.com/tools/validator/>. Your page validates there.

The new lq-nsgmls can be downloaded from
<http://www.htmlhelp.com/tools/validator/src/lq-sp-1.3.4.5.tar.gz>.

I believe there was only a problem with the BOM in UTF-8. (In any case, I
only changed the UTF-8 code.)

--
Liam Quinn
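
For readers unfamiliar with the issue: in UTF-8 the byte-order mark U+FEFF is encoded as the three bytes EF BB BF, and a parser that does not expect them at the start of a file may treat them as character data before the DOCTYPE. The following is only a minimal sketch of skipping that signature before parsing, assuming the document is available as raw bytes. It is not the actual change made in lq-nsgmls, and the function names are hypothetical.

```python
import codecs


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 byte-order mark, if one is present."""
    # codecs.BOM_UTF8 is b'\xef\xbb\xbf', the UTF-8 encoding of U+FEFF.
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def decode_utf8_document(data: bytes) -> str:
    """Decode a UTF-8 document, ignoring an optional leading BOM.

    Line endings (for example DOS CR LF) are left untouched.
    """
    return strip_utf8_bom(data).decode("utf-8")
```

Python's built-in "utf-8-sig" codec does the same thing in one step: data.decode("utf-8-sig") drops a leading BOM if there is one and otherwise behaves like plain UTF-8 decoding.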
Received on Sunday, 29 April 2001 16:54:35 UTC