Re: CfC: close ISSUE-125 charset-vs-quotes by amicable resolution from Julian Reschke on 2011-01-23 (public-html@w3.org from January 2011)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 23 Jan 2011 13:58:21 +0100
To: Ian Hickson <ian@hixie.ch>
CC: Anne van Kesteren <annevk@opera.com>, HTML WG <public-html@w3.org>, Sam Ruby <rubys@intertwingly.net>
Message-ID: <4D3C25ED.9060907@gmx.de>

On 18.01.2011 20:42, Ian Hickson wrote:
> On Tue, 18 Jan 2011, Anne van Kesteren wrote:
>> On Tue, 18 Jan 2011 17:25:51 +0100, Sam Ruby <rubys@intertwingly.net> wrote:
>>> As we have received no counter-proposals or alternate proposals, the
>>> chairs are issuing a call for consensus on the proposal that we do
>>> have.  If no objections are raised to this call by 26 January 2011, we
>>> will direct the editor to make the proposed change.  If anybody would
>>> like to raise an objection during this time, we strongly encourage
>>> them to accompany their objection with a concrete and complete change
>>> proposal.
>>
>> If we accept that the rules should be the same as in HTTP we should just
>> reference HTTP instead so it is more clear the same code path can be
>> used.
>
> We can't, because HTTP doesn't define how you parse invalid headers.

You could define just the error handling, or have a complete definition 
that doesn't violate the base spec.

That being said, a single quote *is* a valid token character, so handling

   charset='UTF-8'

isn't about error handling at the parsing level at all. It *might* be 
about error handling in extracting charset names from parameters, as 
registered charset names do not contain a single quote.
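
To illustrate the point about tokens, here is a rough Python sketch of
the RFC 2616 "token" rule. It is only an approximation for discussion,
not taken from any spec text or browser implementation; the names
SEPARATORS, is_token and param_value are made up for this example, and
the quoted-string branch deliberately ignores quoted-pair escapes.

```python
# RFC 2616, section 2.2:
#   token = 1*<any CHAR except CTLs or separators>
# The separators include the double quote, but not the single quote.
SEPARATORS = '()<>@,;:\\"/[]?={} \t'

# Printable US-ASCII (33..126) minus separators; CTLs and SP are excluded.
TOKEN_CHARS = {chr(c) for c in range(33, 127)} - set(SEPARATORS)


def is_token(value):
    """True if value is a non-empty string made only of token characters."""
    return bool(value) and all(ch in TOKEN_CHARS for ch in value)


def param_value(raw):
    """Return the value of a parameter given its raw right-hand side.

    Hypothetical helper: a token is returned unchanged, a double-quoted
    string has its surrounding quotes removed (simplified: quoted-pair
    escapes are not processed), anything else yields None.
    """
    if is_token(raw):
        return raw  # single quotes, if present, are part of the value
    if len(raw) >= 2 and raw.startswith('"') and raw.endswith('"'):
        return raw[1:-1]
    return None


if __name__ == "__main__":
    for raw in ("UTF-8", '"UTF-8"', "'UTF-8'"):
        print(repr(raw), "->", repr(param_value(raw)))
```

With this reading, 'UTF-8' parses fine as a token, and the parameter
value simply includes the two single quotes; any special treatment
would have to happen later, when mapping the value to a charset.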

> Also, it's clear that what HTTP does define is definitely not compatible
> with what browsers implement. It's less clear how much Julian's proposal
> for ISSUE-125 differs from what browsers implement (from what I can tell,
> it matches a different set of browsers than what the spec says, but is not
> a substantial improvement).

I think we should keep in mind what the CP is about. Reminder:

> In <http://dev.w3.org/html5/spec/Overview.html#content-type-sniffing>, the spec claims that treating single quotes like double quotes in Content-Type (a violation of the syntax defined in RFC 2616) is "motivated by the need for backwards compatibility with legacy content".

If the claim is that this is needed for compat with existing clients, 
you'd need to prove it's actually needed. The fact that IE disagrees 
shows that the argument is flawed.

You could rephrase the explanation, admitting it's only because you want 
to standardize on something that "more" UAs do.

You could state that parsing http-equiv is only *similar* to parsing the 
matching HTTP header field, and thus HTML is free to modify the parsing.

> I would encourage browser vendors to consider -125 and -126 in terms of
> what they are willing to implement. It is my intent to not write CCPs for
> these issues and to just update the spec in a few years to match whatever
> browsers have converged on.

I would encourage browser vendors to get rid of funky special cases 
whenever they can; the IE behavior shows that this one doesn't seem to 
be needed in practice (note that I'm not saying that IE does it right; 
it just doesn't do what the spec says, and apparently doesn't have to 
deal with compat issues despite that).

Best regards, Julian

Received on Sunday, 23 January 2011 12:59:24 UTC