- From: Jukka K. Korpela <jkorpela@cs.tut.fi>
- Date: Sat, 15 Oct 2011 13:54:53 +0300
- To: Rune Schjellerup Philosof <rusp@tv2.dk>
- CC: "www-validator@w3.org" <www-validator@w3.org>
13.10.2011 13:09, Rune Schjellerup Philosof wrote:

> Your validator tells me that it is using windows-1252 instead of the
> declared encoding.

The W3C validator does so when in HTML5 mode, reflecting what validator.nu does.

> However, it doesn’t tell me why!

Neither does validator.nu. I’ve been told that in such situations, the problems should be reported in the bug reporting system for validator.nu, as the “HTML5 mode” of the W3C Markup Validator is effectively just a clone of the validator.nu code.

> I am assuming that I have some characters in my response that is not
> valid in iso-8859-1, so I would like a warning about those instead (or
> in addition).
>
> Validating http://jul2011.tv2.dk/

Every byte on the page represents a character in the iso-8859-1 encoding; the only bytes outside the ASCII range there are those for “æ”, “ö”, and “å”.

The warning issued is a generic one, issued _always_ (in HTML5 mode) when the declared encoding is iso-8859-1. The reason is that by the HTML5 drafts, browsers are required to treat iso-8859-1 as windows-1252.

This is in fact what browsers generally do, so declaring iso-8859-1 is in a sense pointless. From a different perspective, this is just madness: kludgy error recovery turned into a standard, and an international, widely deployed standard flushed down the toilet in favor of a vendor-specific encoding.

If you ask me, anyone declaring his HTML document to be iso-8859-1 encoded, whether consciously or unconsciously (by accepting server defaults for example), deserves to get a warning when the data is not in fact in that encoding.

I’ve added a suggestion on this into the validator.nu bug reporting database at
http://bugzilla.validator.nu/show_bug.cgi?id=95

-- 
Yucca, http://www.cs.tut.fi/~jkorpela/
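P.S. To illustrate the point about the bytes, here is a minimal Python sketch. It is not code from either validator, and the function name c1_range_offsets is made up for this example. It relies on the fact that iso-8859-1 and windows-1252 can disagree only for bytes 0x80 to 0x9F, and it uses Python's "cp1252" codec as a stand-in for windows-1252 (that codec rejects a few bytes in that range, so this is only an approximation of what browsers do).

```python
def c1_range_offsets(data):
    """Return offsets of bytes in 0x80-0x9F (hypothetical helper).

    Outside this range, iso-8859-1 and windows-1252 assign the same
    character to every byte, so only these offsets can be read differently.
    """
    return [i for i, b in enumerate(data) if 0x80 <= b <= 0x9F]


# The bytes for "æ", "ö", "å" in iso-8859-1: none fall in 0x80-0x9F.
page_bytes = b"\xe6\xf6\xe5"
print(c1_range_offsets(page_bytes))  # []
print(page_bytes.decode("iso-8859-1") == page_bytes.decode("cp1252"))  # True

# Bytes 0x93 and 0x94 are curly quotes in windows-1252
# but C1 control characters in iso-8859-1.
quoted = b"\x93hi\x94"
print(c1_range_offsets(quoted))             # [0, 3]
print(ascii(quoted.decode("cp1252")))       # '\u201chi\u201d'
print(ascii(quoted.decode("iso-8859-1")))   # '\x93hi\x94'
```

A check along these lines would flag only documents whose interpretation actually changes, rather than every document declared as iso-8859-1.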
Received on Saturday, 15 October 2011 10:55:32 UTC