
Re: W3C validator and charset

From: Alan J. Flavell <flavell@mail.cern.ch>
Date: Mon, 9 Aug 1999 05:21:27 -0400 (EDT)
To: Dan Connolly <connolly@w3.org>
cc: www-validator@w3.org
Message-ID: <Pine.HPP.3.95a.990809111307.7559A-100000@hpplus03.cern.ch>

On Sun, 8 Aug 1999, Dan Connolly wrote:

> Found this in my mailbox... I think it's been fixed.

I don't believe so, nor does the changes list for the W3C validator show
any mention of such a fix. 

  Character encoding: utf-8 
     Level of HTML: HTML 4.0 Transitional. 

Below are the results of attempting to parse this document with an SGML
parser.

Error at line 32:
   <P LANG="ru">’Ы ’Ы•—–•Т•
                 non SGML character number 146

The WDG's validator does not have this problem, and has no difficulty
processing the same page. 
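
To illustrate why octet 146 turns up in a perfectly legitimate utf-8
document, here is a minimal Python sketch. The helper name
high_control_octets and the sample letter are my own illustrative choices,
not anything taken from either validator.

```python
def high_control_octets(text, charset):
    """Return the octets in the range 128-159 that encoding `text` produces."""
    return [b for b in text.encode(charset) if 128 <= b <= 159]


# CYRILLIC CAPITAL LETTER VE (U+0412), chosen purely as an illustration.
sample = "\u0412"

# In utf-8 this letter is the two octets 208 and 146; the second one falls
# in the 128-159 range that iso-8859-1 reserves for control codes.
print(list(sample.encode("utf-8")))
print(high_control_octets(sample, "utf-8"))

# The same octet value is assigned to a printable character in some 8-bit
# codes, for example koi8-r and the Windows code page cp1252.
print(repr(bytes([146]).decode("koi8-r")))
print(repr(bytes([146]).decode("cp1252")))
```

A check that works on raw octets, without regard to the declared charset,
would flag the second octet of this letter even though the document is
correctly encoded.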

> If not, please report the problem to www-validator@w3.org

Here we are.

all the best

> 20 Jul 1998 "Alan J. Flavell" wrote:
> > 
> > Greetings, I wonder whether I could interest you in this one.
> > 
> > It appears that the W3C validator doesn't respond appropriately to
> > documents that are sent to it with a non-iso-8859-1 charset, for example
> > utf-8.  It rejects octets in the range 128-159 decimal as if they were
> > illegal, when in this charset they are perfectly legal and indeed
> > necessary.  Apparently the same would be true even with 8-bit character
> > codes in which this range of octet values is assigned to printable
> > characters, such as koi8-r, Mac, or Windows codes.
> > 
> > So it seems that there may be some functionality missing from the
> > validator in this area.
> > 
> > I know that A.Prilop has made several attempts to bring this to the
> > attention of Gerald O, but apparently without receiving any kind of
> > answer.  As I've had discussions with you in the past on this topic
> > area, I wondered whether I could interest you in this issue.  It's a
> > pity for this otherwise excellent service to fail us just when there's a
> > growing interest in using utf-8 document coding (although, of course, a
> > more general solution that also permitted codings such as koi8-r to be
> > validated, not only the Unicode encodings, would be ideal).
> > 
> > I have to admit I'm not aware of whether the underlying software has
> > this functionality already; at the very least I would have thought the
> > validator pages might mention this shortcoming, if it cannot be fixed
> > quickly.
> > 
> > all the best
> -- 
> Dan Connolly, W3C
> http://www.w3.org/People/Connolly/
> tel:+1-512-310-2971 (office, mobile)
> mailto:connolly.pager@w3.org (put your tel# in the Subject:)
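
The more general approach described in the quoted message can be sketched
as follows: decode the octets according to the declared charset first, and
only then look for characters that are not SGML characters. This is a
rough sketch under my own assumptions, not a description of how either
validator is implemented. The function names are hypothetical, and the
ranges reflect my reading of the HTML 4 SGML declaration (control codes
other than tab, LF and CR, plus 127-159).

```python
def is_non_sgml(code_point):
    """True for code points assumed to be UNUSED in the HTML 4 SGML declaration."""
    if code_point < 32:
        return code_point not in (9, 10, 13)
    return 127 <= code_point <= 159


def non_sgml_chars(raw, charset):
    """Decode `raw` with the declared charset, then report offending characters.

    Returns (character offset, code point) pairs.
    """
    text = raw.decode(charset)
    return [(i, ord(ch)) for i, ch in enumerate(text) if is_non_sgml(ord(ch))]


# A short Russian fragment, encoded as utf-8 (illustrative data only).
raw = '<P LANG="ru">\u0412\u044b</P>'.encode("utf-8")

# Decoded with the declared charset, nothing is flagged.
print(non_sgml_chars(raw, "utf-8"))

# Treated as iso-8859-1, the utf-8 continuation octets are reported as
# characters 146 and 139.
print(non_sgml_chars(raw, "iso-8859-1"))
```

The same function would handle koi8-r or any other charset for which a
decoder is available, since the check is made on characters rather than on
octets.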


  "I have no problem with cute and clever.  In fact I actually _like_ cute
  and clever.  I don't think it's clever to be cute in such a way as to
  make the pages less useful.        But then I'm not a graphic designer."
                        -     Calum I Mac Leod on c.i.w.a.site-design
Received on Monday, 9 August 1999 12:03:59 UTC
