Re: Validator charset

From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Tue, 11 Mar 2008 10:41:26 +0100
To: www-validator@w3.org
Message-ID: <fr5jvg$8gb$1@ger.gmane.org>

olivier Thereaux wrote:

> http://www.w3.org/QA/2008/03/html-charset.html

Nice, comment sent.  I'm always interested when folks claim
to know the "rough consensus".  In the IETF they have area
directors and WG Chairs entitled to use these magic words,
while the community has appeals and a "recall procedure" to
challenge them when any "rough consensus" decree by the PTB
is too far out of line... ;-)

> AFAIK the w3c markup validator is not following the
> recommendation of trying a iso-8859-1 fallback (as is the 
> rule, kinda, for text/*) because... it's just a bad one.

Indeed, go and tell the 2616bis folks.  At the moment the 
state is apparently "if there's no consensus to fix it, we
keep it as is" <sigh />

>> You can stop reading now if you don't want to be bored.
> Sorry, I'm not bored yet :).

Nor me, I'm just too lazy to look into the sources.  Please
correct me if I have it wrong: I think the validator never
really looks at the <?xml encoding="..." ?> declaration to
figure out what the encoding is; it only verifies that the
declaration matches what it has "divined" elsewhere, e.g.,
based on lying HTTP servers.
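
To make that guess concrete, here is a minimal Python sketch of
a "check only" approach.  It is not taken from the validator
sources: the names http_charset, declared_encoding and
check_only are hypothetical, and the regex only copes with
ASCII-compatible input.

```python
import re
from email.message import Message

# Matches the encoding pseudo-attribute of an XML declaration
# at the very start of an ASCII-compatible byte stream.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def http_charset(content_type):
    """Return the lowercased charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_param("charset")
    return charset.lower() if isinstance(charset, str) else None


def declared_encoding(body):
    """Return the lowercased encoding named in the XML declaration, or None."""
    match = _XML_DECL.match(body)
    return match.group(1).decode("ascii").lower() if match else None


def check_only(content_type, body):
    """Assumed behaviour: HTTP decides, the declaration is only compared.

    Returns (encoding taken from HTTP, whether the declaration disagrees).
    """
    transport = http_charset(content_type)
    declared = declared_encoding(body)
    mismatch = (
        transport is not None
        and declared is not None
        and transport != declared
    )
    return transport, mismatch
```

With a server sending "text/xml; charset=iso-8859-1" for a
document that declares utf-8, this reports a mismatch but never
uses the declaration to pick the encoding.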

If that is the case, several categories of MAMA vs. validator
differences are a waste of time: they rely on info that is
not an input to the validator's "divination" at the moment.

I recall the case where the validator insisted on US-ASCII
for text/xml even for <?xml encoding="utf-8" ... ?>
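
That behaviour is consistent with the RFC 3023 rule that a
text/xml entity without a charset parameter defaults to
us-ascii, whatever the XML declaration says.  A rough Python
sketch of the rule follows; the function name is hypothetical,
and BOM detection for application/xml is left out.

```python
def effective_encoding(media_type, charset_param, declared):
    """Pick an encoding along the lines of RFC 3023 (simplified sketch).

    media_type    -- e.g. "text/xml" or "application/xml"
    charset_param -- charset from the Content-Type header, or None
    declared      -- encoding from the XML declaration, or None
    """
    if charset_param:
        # An explicit charset parameter is authoritative.
        return charset_param.lower()
    if media_type.lower() == "text/xml":
        # text/xml without charset: us-ascii, declaration ignored.
        return "us-ascii"
    # application/xml without charset: fall back to the XML
    # declaration, then to UTF-8 (BOM sniffing omitted here).
    return (declared or "utf-8").lower()
```

Under this rule, text/xml with no charset parameter and a
utf-8 declaration still comes out as us-ascii, while the same
document served as application/xml comes out as utf-8.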

 Frank
Received on Tuesday, 11 March 2008 09:39:36 GMT