- From: Julian Reschke <julian.reschke@gmx.de>
- Date: Sat, 13 Nov 2010 18:26:29 +0100
- To: "public-html@w3.org" <public-html@w3.org>
SUMMARY
The specification requires recipients to parse Content-Type headers in
<meta> elements in a way that breaks HTTP's parsing rules.
The justification given is:
"Note: This requirement is a willful violation of the HTTP
specification (for example, HTTP doesn't allow the use of single quotes
and requires supporting a backslash-escape mechanism that is not
supported by this algorithm), motivated by the need for backwards
compatibility with legacy content."
...however, tests show that Internet Explorer ([1]) does indeed obey the
HTTP parsing rules, so it's highly doubtful that the violation is
actually needed for "backwards compatibility".
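To make the difference concrete, here is a minimal sketch of how an HTTP
quoted-string (RFC 2616, Section 2.2) is unquoted: only double quotes
delimit the value, and a backslash escapes the character that follows.
The function name is hypothetical, and the sketch ignores the finer
character-set restrictions of the grammar.

```python
def parse_http_quoted_string(value):
    """Unquote an HTTP quoted-string; return None if value is not one.

    Illustrative only: the name is hypothetical and the qdtext/CHAR
    character restrictions of the HTTP grammar are not enforced.
    """
    if len(value) < 2 or value[0] != '"':
        return None  # single quotes are not string delimiters in HTTP
    out = []
    i = 1
    while i < len(value):
        ch = value[i]
        if ch == "\\":
            # quoted-pair: the backslash escapes the next character
            if i + 1 >= len(value):
                return None
            out.append(value[i + 1])
            i += 2
        elif ch == '"':
            # closing quote; anything after it is ignored here
            return "".join(out)
        else:
            out.append(ch)
            i += 1
    return None  # no closing quote


print(parse_http_quoted_string('"utf-8"'))
print(parse_http_quoted_string("'utf-8'"))
```

Under these rules a value such as 'utf-8' (in single quotes) is simply
not a quoted-string.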
RATIONALE
"Willful violations" should be restricted to cases where they are
actually needed in practice. Evidence shows this is not the case here.
DETAILS
Change Step 6 in the last part of
<http://dev.w3.org/html5/spec/Overview.html#content-type-sniffing> from:
-- cut --
6. Process the next character as follows:

   If it is a U+0022 QUOTATION MARK ('"') and there is a later
   U+0022 QUOTATION MARK ('"') in s
   If it is a U+0027 APOSTROPHE ("'") and there is a later U+0027
   APOSTROPHE ("'") in s
      Return the encoding corresponding to the string between this
      character and the next earliest occurrence of this character.

   If it is an unmatched U+0022 QUOTATION MARK ('"')
   If it is an unmatched U+0027 APOSTROPHE ("'")
   If there is no next character
      Return nothing.

   Otherwise
      Return the encoding corresponding to the string from this
      character to the first U+0009, U+000A, U+000C, U+000D, U+0020,
      or U+003B character or the end of s, whichever comes first.
-- cut --
to
-- cut --
6. Process the next character as follows:

   If it is a U+0022 QUOTATION MARK ('"') and there is a later
   U+0022 QUOTATION MARK ('"') in s
      Return the encoding corresponding to the string between this
      character and the next earliest occurrence of this character.

   If it is an unmatched U+0022 QUOTATION MARK ('"')
   If it is an unmatched U+0027 APOSTROPHE ("'")
   If there is no next character
      Return nothing.

   Otherwise
      Return the encoding corresponding to the string from this
      character to the first U+0009, U+000A, U+000C, U+000D, U+0020,
      or U+003B character or the end of s, whichever comes first.
-- cut --
...and change the following note accordingly (the exact text for the
note depends on the decision for ISSUE-126).
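As a rough illustration of what the change to Step 6 means in practice,
the sketch below implements both variants of the step as quoted above.
The names step6, pos and allow_single_quotes are hypothetical, and the
function returns the raw label rather than "the encoding corresponding
to the string" (the lookup from label to encoding is left out).

```python
# Illustrative sketch only; all names here are hypothetical.
TERMINATORS = "\t\n\f\r ;"  # U+0009, U+000A, U+000C, U+000D, U+0020, U+003B


def step6(s, pos, allow_single_quotes):
    """Return the charset label starting at s[pos], or None for "nothing".

    allow_single_quotes=True follows the text currently in the spec;
    allow_single_quotes=False follows the text proposed here.
    """
    if pos >= len(s):
        return None  # there is no next character
    ch = s[pos]
    if ch in ('"', "'"):
        end = s.find(ch, pos + 1)
        if end == -1:
            return None  # unmatched quote: "nothing" in both variants
        if ch == '"' or allow_single_quotes:
            return s[pos + 1:end]  # string between the matching quotes
        # Proposed text: a matched apostrophe gets no special handling
        # and is processed by the "Otherwise" branch below.
    end = pos
    while end < len(s) and s[end] not in TERMINATORS:
        end += 1
    return s[pos:end]


for value in ('"utf-8"', "'utf-8'", "utf-8;x"):
    print(value, step6(value, 0, True), step6(value, 0, False))
```

With the proposed text, a value in single quotes yields the label
including the apostrophes, which does not name any encoding; values in
double quotes and unquoted values are handled the same way by both
variants.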
IMPACT
1. Positive Effects
Removal of a "willful violation" that is not required at all.
No need to change IE's behavior; the legacy IE versions, which are
notoriously hard to get rid of, remain compliant.
2. Negative Effects
Non-IE UAs may have to change if they want to be compliant in handling
essentially invalid header field instances (a single quote is never part
of a charset name).
3. Conformance Classes Changes
Certain instances of meta/@http-equiv change their semantics.
4. Risks
The risk appears to be small, given the fact that IE already behaves the
way this Change Proposal describes.
REFERENCES
[1] <http://www.w3.org/Bugs/Public/show_bug.cgi?id=10805#c0>
Received on Saturday, 13 November 2010 17:27:15 UTC