
Re: Document without charset

From: olivier Thereaux <ot@w3.org>
Date: Fri, 9 Dec 2005 07:02:08 +0900
Message-Id: <52A5643B-E4CA-40F9-BFD6-605FF65FEA9A@w3.org>
Cc: www-validator@w3.org
To: Andreas Prilop <nhtcapri@rrzn-user.uni-hannover.de>


On 9 Dec 2005, at 00:53, Andreas Prilop wrote:

>
> Reference:
> http://validator.w3.org/check?uri=www.unics.uni-hannover.de/nhtcapri/test.htm
>
> The validator says:
>
> | Encoding:   utf-8
> | Sorry, I am unable to validate this document because [...]
> | it contained one or more bytes that I cannot interpret as utf-8
>
> This is not helpful!
> Why does the validator assume UTF-8 in the first place?

This is a bug.

The validator has a routine that tries to detect the character
encoding using every method the specs describe. If it finds nothing,
it is supposed to output a warning that no character encoding was
found, and point to this documentation:
http://validator.w3.org/docs/help.html#faq-charset

Instead, it defaults to utf-8 and does not output the warning. That
is indeed a bug, apparently introduced a few months ago. We'll look
into fixing it as soon as possible.
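
To illustrate the intended behaviour, here is a rough sketch in Python.
It is not the validator's actual code: the function names, the order of
the checks and the regular expressions are all assumptions made for the
example. The point it shows is only that when no declaration is found,
the result is "nothing found" plus a warning, not a silent utf-8
default.

```python
import re

# Hypothetical sketch, not the validator's real implementation.
BOMS = [
    (b"\xef\xbb\xbf", "utf-8"),
    (b"\xff\xfe", "utf-16-le"),
    (b"\xfe\xff", "utf-16-be"),
]

CHARSET_PARAM = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.I)
XML_DECL = re.compile(rb"^<\?xml[^>]*encoding\s*=\s*[\"']([A-Za-z0-9._-]+)[\"']")
META = re.compile(rb"<meta[^>]+charset\s*=\s*[\"']?([A-Za-z0-9._-]+)", re.I)

FAQ_URL = "http://validator.w3.org/docs/help.html#faq-charset"


def detect_charset(content_type, body):
    """Return (charset, source), or (None, None) if nothing is declared."""
    # 1. charset parameter of the HTTP Content-Type header
    if content_type:
        m = CHARSET_PARAM.search(content_type)
        if m:
            return m.group(1).lower(), "HTTP Content-Type header"
    # 2. byte order mark at the start of the document
    for bom, name in BOMS:
        if body.startswith(bom):
            return name, "byte order mark"
    # 3. encoding pseudo-attribute of an XML declaration
    m = XML_DECL.search(body)
    if m:
        return m.group(1).decode("ascii").lower(), "XML declaration"
    # 4. <meta> element near the top of the document
    m = META.search(body[:1024])
    if m:
        return m.group(1).decode("ascii").lower(), "meta element"
    # Nothing declared: report that, rather than guessing utf-8.
    return None, None


def check(content_type, body):
    """Return the status line a checker might print for this document."""
    charset, source = detect_charset(content_type, body)
    if charset is None:
        return "warning: no character encoding found; see " + FAQ_URL
    return f"encoding: {charset} (from {source})"
```

For a document like the one reported, served as plain text/html with no
charset anywhere, check("text/html", body) would give the warning line
instead of an attempt to decode the bytes as utf-8.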

Thanks,
olivier
-- 
olivier Thereaux - W3C - http://www.w3.org/People/olivier/
W3C Open Source Software: http://www.w3.org/Status
Received on Thursday, 8 December 2005 22:02:18 GMT
