Re: Validator charset

From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Wed, 12 Mar 2008 01:33:36 +0100
To: www-validator@w3.org
Message-ID: <fr788c$qpu$1@ger.gmane.org>

olivier Thereaux wrote:
 
> I wrote "rough consensus" because "the consensus as it seems
> to have happened, based on the stories I have heard and the
> eventual result" would be too long :)

Yeah, it is a mental trick of server admins to think that
folks who do not run their own server, or at least arrange
write access to dot-files, are by definition too silly to
talk about... ;-)  It reminds me of Fidonet.  They have a
point, but there are so many users, and only a few server
admins (in comparison).

> Do you have a pointer to discussions on the http-wg list?

http://thread.gmane.org/gmane.org.w3c.miscellaneous/1223/focus=880

> I would agree that the current recommendation of the HTTP
> spec is broken, and I suspect everyone agrees with that. I
> also suspect that finding a solution that works while 
> following certain constraints (backward compatibility?)
> must be a nasty headache.

Yes, a headache.  One hard constraint is no new MUST: 2616bis
is supposed to stay at "draft standard".  That permits fixing
errors, removing unused features, clarifying, and arguably
twisting MAY into SHOULD NOT or similar, but no "good" 2616
implementation can end up as "broken" as far as 2616bis is
concerned.  And of course no HTTP/1.2 or similar stunts; that
would violate the WG Charter (apart from being a rathole and
a bad idea).

> I think the validator does look at the xml declaration as
> a source. See e.g. the following test case:
> http://qa-dev.w3.org/wmvs/HEAD/dev/tests/charset-xmldecl.xhtml

Valid and UTF-8.  Do you have a similar test not using UTF-8?
With UTF-8 as the default it is not obvious what triggered
UTF-8.
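
A test that would tell the two cases apart could look roughly
like the sketch below.  It is only an illustration: the function
names and the sample element are made up here, and it is not one
of the validator's own test cases.  The point is that the
document declares ISO-8859-1 and contains a byte that is not
valid UTF-8, so a "valid" result can only come from honouring
the XML declaration.

```python
def make_latin1_test_doc() -> bytes:
    """Return a small XML document declared and encoded as ISO-8859-1."""
    text = (
        '<?xml version="1.0" encoding="ISO-8859-1"?>\n'
        '<p>caf\u00e9</p>\n'  # U+00E9 becomes the single byte 0xE9
    )
    return text.encode("iso-8859-1")


def is_valid_utf8(data: bytes) -> bool:
    """Report whether the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    doc = make_latin1_test_doc()
    # A consumer that silently falls back to UTF-8 must fail on this input.
    print("decodes as UTF-8:", is_valid_utf8(doc))
```

Served without a charset parameter, such a file cannot be
mistaken for a document that merely hit the UTF-8 default.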

My example was <http://xyzzy.webhop.info/home/ltru/4645bisU.xml>,
sent as text/xml without charset, resulting in US-ASCII and a
fatal validation error for the UTF-8 XML.  Another server
sends application/xml without charset; there I get valid and
UTF-8.  For authors the only safe bet is using US-ASCII for
XML; sometimes "I18N" is odd... :-)  Warning: huge test file,
and its content is now obsolete.
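
The difference between the two servers can be sketched as
follows.  This is a simplified illustration assuming the RFC 3023
defaults (text/xml without charset means US-ASCII, while
application/xml without charset falls back to the BOM or the XML
declaration, then UTF-8).  It is not the validator's actual code,
and the function names are hypothetical.

```python
import re

# Encoding pseudo-attribute in an ASCII-compatible XML declaration.
_XML_DECL = re.compile(
    rb'^<\?xml[^>]*?\bencoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def split_content_type(value: str):
    """Split a Content-Type value into (media type, parameter dict)."""
    parts = [p.strip() for p in value.split(";")]
    params = {}
    for part in parts[1:]:
        if "=" in part:
            key, val = part.split("=", 1)
            params[key.strip().lower()] = val.strip().strip('"')
    return parts[0].lower(), params


def sniff_xml_encoding(body: bytes) -> str:
    """Guess the encoding from the BOM or the XML declaration."""
    if body.startswith(b"\xef\xbb\xbf"):
        return "utf-8"
    if body.startswith((b"\xff\xfe", b"\xfe\xff")):
        return "utf-16"
    match = _XML_DECL.match(body)
    if match:
        return match.group(1).decode("ascii").lower()
    return "utf-8"  # XML default when nothing else is specified


def effective_charset(content_type: str, body: bytes) -> str:
    """Pick the charset under the assumed RFC 3023 rules."""
    media_type, params = split_content_type(content_type)
    if "charset" in params:
        return params["charset"].lower()  # explicit parameter wins
    if media_type in ("text/xml", "text/xml-external-parsed-entity"):
        return "us-ascii"  # the XML declaration is ignored here
    # Simplification: every other type is treated like application/xml.
    return sniff_xml_encoding(body)


if __name__ == "__main__":
    body = '<?xml version="1.0" encoding="UTF-8"?><p>caf\u00e9</p>'.encode("utf-8")
    for ctype in ("text/xml", "application/xml"):
        print(ctype, "->", effective_charset(ctype, body))
```

With the same UTF-8 body, the text/xml case ends up as US-ASCII,
where any non-ASCII byte is an error, while the application/xml
case picks up UTF-8 from the declaration.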

 Frank
Received on Wednesday, 12 March 2008 00:31:37 GMT