Re: [VE][html5] charset encoding warnings

13.10.2011 13:09, Rune Schjellerup Philosof wrote:

> Your validator tells me that it is using windows-1252 instead of the
> declared encoding.

The W3C validator does so when in HTML5 mode, reflecting what 
validator.nu does.

> However, it doesn’t tell me why!

Neither does validator.nu. I’ve been told that in such situations, the 
problems should be reported in the bug reporting system for 
validator.nu, as the “HTML5 mode” of the W3C Markup Validator is 
effectively just a clone of the validator.nu code.

> I am assuming that I have some characters in my response that is not
> valid in iso-8859-1, so I would like a warning about those instead (or
> in addition).
>
> Validating http://jul2011.tv2.dk/

Every byte on the page represents a character in the iso-8859-1 
encoding; the only bytes outside the ASCII range there are 
those for “æ”, “ö”, and “å”.
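
A check like the one described above can be sketched in a few lines of 
Python. This is only an illustration: the function name 
non_ascii_bytes and the sample string are my own assumptions, not 
anything the validator runs, and in practice you would feed it the raw 
bytes of the HTTP response body.

```python
from collections import Counter

def non_ascii_bytes(data: bytes) -> dict:
    """Map each non-ASCII byte in data, decoded as iso-8859-1, to its count."""
    counts = Counter(b for b in data if b >= 0x80)
    # iso-8859-1 maps every byte value to exactly one code point,
    # so decoding a single byte can never fail here.
    return {bytes([b]).decode("iso-8859-1"): n for b, n in counts.items()}

# Hypothetical sample text, not taken from the page under discussion.
sample = "blåbær og smör".encode("iso-8859-1")
print(non_ascii_bytes(sample))
```

If the result contains only letters such as “æ”, “ö”, and “å”, the data 
is consistent with the declared encoding.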

The warning is a generic one, _always_ issued (in HTML5 mode) 
when the declared encoding is iso-8859-1. The reason is that, according 
to the HTML5 drafts, browsers are required to treat iso-8859-1 as 
windows-1252. This is in fact what browsers generally do, so declaring 
iso-8859-1 is in a sense pointless.
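
To see what the substitution means in practice, here is a small Python 
sketch. The byte string is a made-up sample. Note that Python's own 
windows-1252 codec rejects the five byte values that Windows leaves 
undefined (0x81, 0x8D, 0x8F, 0x90, 0x9D), so it is not an exact model of 
browser behavior; the sample avoids those bytes.

```python
# Hypothetical sample: the byte for "æ" wrapped in bytes 0x93 and 0x94.
raw = bytes([0x93, 0xE6, 0x94])

as_latin1 = raw.decode("iso-8859-1")    # 0x93 and 0x94 become C1 control characters
as_cp1252 = raw.decode("windows-1252")  # 0x93 and 0x94 become curly quotation marks

print(repr(as_latin1))
print(repr(as_cp1252))

# In the range 0xA0 to 0xFF the two encodings agree byte for byte,
# so text using only those bytes decodes the same way under both.
upper = bytes(range(0xA0, 0x100))
print(upper.decode("iso-8859-1") == upper.decode("windows-1252"))
```

The two encodings differ only in how they interpret bytes 0x80 to 0x9F, 
which is why a page like the one above renders identically either way.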

From a different perspective, this is just madness: kludgy error 
recovery turned into a standard, and an international, widely deployed 
standard flushed down the toilet in favor of a vendor-specific encoding.

If you ask me, anyone declaring his HTML document to be iso-8859-1 
encoded, whether consciously or unconsciously (by accepting server 
defaults, for example), deserves to get a warning when the data is not 
in fact in that encoding. I’ve added a suggestion about this to the 
validator.nu bug reporting database at
http://bugzilla.validator.nu/show_bug.cgi?id=95
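
One way such a targeted warning could work, as a rough sketch only: 
flag bytes in the range 0x80 to 0x9F. In iso-8859-1 these are C1 control 
characters, which are almost never intended in an HTML document, so their 
presence suggests the data is really windows-1252. The function name 
c1_range_offsets is hypothetical, and this heuristic is my own 
illustration, not the wording of the bug report or anything validator.nu 
implements.

```python
def c1_range_offsets(data: bytes) -> list:
    """Return the offsets of bytes that fall in the range 0x80 to 0x9F."""
    return [i for i, b in enumerate(data) if 0x80 <= b <= 0x9F]

def check_declared_latin1(data: bytes) -> str:
    """Produce a warning only when the data looks unlike iso-8859-1."""
    offsets = c1_range_offsets(data)
    if not offsets:
        return "ok: no bytes in the range 0x80-0x9F"
    return "warning: %d byte(s) in the range 0x80-0x9F, first at offset %d" % (
        len(offsets), offsets[0])

# Hypothetical inputs: genuine iso-8859-1 text, then text with
# windows-1252 curly quotes (bytes 0x93 and 0x94).
print(check_declared_latin1("æöå".encode("iso-8859-1")))
print(check_declared_latin1(b"a\x93b\x94"))
```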

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/

Received on Saturday, 15 October 2011 10:55:32 UTC