Re: Fallback to UTF-8

From: olivier Thereaux <ot@w3.org>
Date: Thu, 29 Nov 2007 10:52:47 +0900
Cc: www-validator@w3.org
Message-Id: <734E38AE-E66E-4B26-A583-2F81C333C3DC@w3.org>
To: Andreas Prilop <aprilop2007@trashmail.net>


On 29 nov. 07, at 02:15, Andreas Prilop wrote:
> I still believe that the following behaviour is illogical and
> not really helpful. (It has been discussed before.)
>
>
> Given a webpage that does not specify any encoding (charset).
>
> Then validator.w3.org reports:
>
> (1) No Character Encoding Found! Falling back to UTF-8.
>
> (2) Sorry, I am unable to validate this document because on line ...
>    it contained one or more bytes that I cannot interpret as utf-8
>
> This makes no sense; and it doesn't help the user.

You're not suggesting a better procedure, either. As far as I can
tell, the alternative (as done by other tools) is to simply throw a
fatal error whenever no charset is given. Trying to fall back to UTF-8
at least helps in some cases, namely when the document happens to be
valid UTF-8 (which includes plain ASCII). Better than nothing IMHO.

Maybe what you would like is a different error message? In the case
of a charset fallback, instead of "sorry, I am not able to validate
because it is not utf-8", the validator could say something like
"sorry, I am not able to read this document because it does not
declare any encoding and an attempt to fall back failed. Please do
this and that..."
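
To make the idea concrete, here is a rough sketch in Python of what such a
fallback with a dedicated error message could look like. It is not the
validator's actual code: the function name decode_document, the constant
FALLBACK_ENCODING and the exact wording of the messages (including the advice
at the end of the fallback error) are made up for illustration.

```python
from typing import Optional, Tuple

# Assumption: UTF-8 is the fallback, as in the messages quoted above.
FALLBACK_ENCODING = "utf-8"


def decode_document(raw: bytes, declared: Optional[str] = None) -> Tuple[Optional[str], str]:
    """Decode raw bytes; return (text, message). text is None on failure.

    Hypothetical helper: the message wording is illustrative only.
    """
    used_fallback = not declared
    encoding = declared or FALLBACK_ENCODING
    try:
        text = raw.decode(encoding)
    except UnicodeDecodeError as err:
        # Line number of the first undecodable byte (1-based).
        line = raw.count(b"\n", 0, err.start) + 1
        if used_fallback:
            # Distinct message for the "no charset declared" case.
            return None, (
                "Sorry, I am not able to read this document because it does "
                "not declare any encoding and an attempt to fall back to "
                f"{FALLBACK_ENCODING} failed on line {line}. "
                "Please declare the character encoding of the document."
            )
        return None, (
            f"Sorry, I am unable to validate this document because on line "
            f"{line} it contained one or more bytes that I cannot interpret "
            f"as {encoding}."
        )
    if used_fallback:
        return text, f"No Character Encoding Found! Falling back to {FALLBACK_ENCODING}."
    return text, f"Using declared encoding {encoding}."
```

With this split, a document without a charset that is plain ASCII or valid
UTF-8 still validates (with a warning), and one that is not gets an error
that points at the missing declaration rather than at UTF-8.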

-- 
olivier 
Received on Thursday, 29 November 2007 01:52:55 GMT
