Re: ISSUE-126: charset-vs-backslashes - Straw Poll for Objections

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 06 Mar 2011 18:43:21 +0100
Message-ID: <4D73C7B9.6020300@gmx.de>
To: Philip Jägenstedt <philipj@opera.com>
CC: public-html@w3.org
On 06.03.2011 17:19, Philip Jägenstedt wrote:
> ...
>> My goals would be:
>>
>> - either align parsing with HTTP; *or* be clear that this is specific
>> to META, and consumers will need different parsing rules for the two
>> protocol elements.
>>
>> - in the latter case, rephrase and possibly move the text we're
>> discussing so it becomes crystal clear that this is error handling,
>> and *only* applies to <meta>.
>>
>> - make sure that field values that are syntactically valid in HTTP and
>> conforming in HTML have the same interpretation.
>>
>> - clarify how the two sets described above differ (for instance, if
>> backslash doesn't do the same thing as in quoted-string it should be
>> profiled out in HTML, this may already be the case).
>
> All of this seems reasonable, if done with restraint. For example, I
> don't think there's any point in handling backslash escaping, as no
> encoding names include characters that need escaping, right?
> ...

It's correct that it's not needed for any valid encoding name. So
claiming that it MUST NOT be done is simply silly, right? In practice it
will never be an issue for valid encoding names, which is why I was
surprised by the claim that it is.
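
To illustrate the point about backslashes, here is a minimal sketch that compares two ways of reading a quoted charset value: stripping the quotes only, and stripping the quotes plus resolving HTTP quoted-pairs (backslash followed by a character). The function names `unquote_literal` and `unquote_http` are hypothetical, and neither is meant to reproduce the HTML spec's actual algorithm. The two only disagree when the value contains a backslash, which no valid encoding name does.

```python
import re


def unquote_literal(value: str) -> str:
    """Strip surrounding double quotes; backslashes are kept as-is."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def unquote_http(value: str) -> str:
    """Strip surrounding double quotes and resolve quoted-pairs,
    i.e. replace a backslash plus the next character with that character."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return re.sub(r"\\(.)", r"\1", value[1:-1])
    return value


# A valid encoding name reads the same either way.
print(unquote_literal('"UTF-8"'), unquote_http('"UTF-8"'))

# Only a value with a backslash makes the two readings differ.
print(unquote_literal('"UTF\\-8"'), unquote_http('"UTF\\-8"'))
```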

>> - get rid of claims that things are done for backwards compatibility
>> when we have proof this is not the case.
>
> Have you done testing of the sum of the changes necessary to make
> processing comply exactly with HTTP? It's plausible that the impact of
> backslash escaping and quote style is limited, but I find it very hard
> to believe that changing the way the charset parameter is located to
> follow HTTP would not have legacy compat issues.

No, I haven't tested that.

>> BTW:
>>
>> content='text/html; charset = UTF-8' (whitespace between attribute and
>> value)
>>
>> is syntactically legal per RFC 2616 (although we may have broken it in
>> HTTPbis, just opened a ticket).
>
> Perhaps I'm misreading <http://tools.ietf.org/html/rfc2616#section-3.7>?
> The ABNF does not allow for it and the prose says "Linear white space
> (LWS) MUST NOT be used between the type and subtype, nor between an
> attribute and its value."

Oops. Indeed. Seems I have to update a few tests in 
<http://greenbytes.de/tech/tc2231/>.
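
As a rough illustration of the RFC 2616 section 3.7 rule quoted above, the following sketch checks a single parameter against the `attribute "=" value` production with no whitespace allowed around the equals sign. The helper name `is_strict_parameter` is hypothetical, and the quoted-string pattern is a simplification of the RFC grammar (it does not model folded lines or restrict control characters).

```python
import re

# token: one or more US-ASCII characters excluding CTLs and separators.
TOKEN = r"[!#$%&'*+\-.0-9A-Z^_`a-z|~]+"
# Simplified quoted-string: plain characters or backslash-escaped pairs.
QUOTED = r'"(?:[^"\\]|\\.)*"'
# parameter = attribute "=" value, with no whitespace around "=".
PARAM = re.compile(rf"({TOKEN})=({TOKEN}|{QUOTED})")


def is_strict_parameter(text: str) -> bool:
    """Return True if text is a single parameter in the strict syntax."""
    return PARAM.fullmatch(text) is not None


for candidate in ("charset=UTF-8", 'charset="UTF-8"', "charset = UTF-8"):
    print(repr(candidate), is_strict_parameter(candidate))
```

Under this reading, `charset = UTF-8` is rejected because of the whitespace on either side of the equals sign, while the unquoted and quoted forms without whitespace are accepted.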

Best regards, Julian
Received on Sunday, 6 March 2011 17:44:00 UTC
