Re: Null change proposal for ISSUE-88 (mark II)

Jonas Sicking, Sun, 4 Apr 2010 21:03:04 -0700:
> On Sun, Apr 4, 2010 at 7:55 PM, Leif Halvard Silli
> <> wrote:
>> Leif Halvard Silli, Sun, 4 Apr 2010 04:37:55 +0200:
>>> Ian Hickson, Sat, 3 Apr 2010 22:38:12 +0000 (UTC):
>>>> On Sat, 3 Apr 2010, Julian Reschke wrote:
>>>>> On 04.04.2010 00:34, Anne van Kesteren wrote:
>>>>>> On Sat, 03 Apr 2010 02:00:32 -0700, Julian Reschke wrote:
>>>>>>> The attribute is an HTML attribute, but it's value space is defined by
>>>>>>> the HTTP header registry.
>>   [...]
>>>> http-equiv isn't anything to do with HTTP in practice. HTML5 just makes
>>>> that clear. Ideally we'd drop the whole attribute, but unfortunately there
>>>> are some pragmas that are needed for backwards-compatibility. I agree that
>>>> some people will object (indeed, you have already objected). What matters
>>>> isn't whether anyone agrees, what matters is that we make the right
>>>> technical decisions that are compatible with reality.
>>> I am arguing that to continue to allow white-space as well as continue
>>> to allow a comma separated list is more compatible with reality, than
>>> forbidding one or both. Bug 9264. Your reaction to Bug 9264 was that I
>>> should file bugs against user agents! (To "save" the spec.) Why should
>>> I file bugs against vendors if your spec matches user agent reality?
>> I have reopened bug 9264, under a new title,
>> "There should be a link/border between [the] META content-language
>>  algorithm and HTTP content-language headers"
>> because Mozilla browsers (which were the background for bug 9264)
>> actually behave according to the HTML5 draft.

> For what it's worth, I think we at mozilla would be quite happy to
> change our behavior, as always. However, as always, it's under the
> condition that it
> 1. Improves behavior over what we are currently doing.
> 2. Doesn't break too many pages.
> Obviously both these points are quite subjective. I can't give an
> answer to "how many is too many?", nor is it always easy to say what
> is an "improvement".

Today I have filed a bunch of bugs related to META content-language.
The picture I see is this:

Bug 9420: The legal syntax of <META http-equiv="content-language" 
content="*"> - should permit a comma separated list. And the semantics 
of the empty string should be the same as it is for lang="<empty>"  and 
xml:lang="<empty>". The semantics of the empty string is important: 
Currently Mozilla and the spec require the empty string to trigger 
that the user agent go looking in the content-language header from the 
server.

Bug 9422: Mozilla and <META http-equiv="content-language" 
content="<emptystring>" >. Mozilla needs to make a small change so that 
it treats the empty string correctly. 

Bug 9411: Say that it is the last META content-language declaration 
which takes precedence. (Same pattern as found for 
http-equiv="default-style". See bug 9409 and bug 9410.) Currently, 
HTML5 requires that the empty string as well as a comma separated list 
trigger the UA to visit the HTTP header from the server. Which would 
lead to many more visits to the server header than today. Instead, the 
focus should be on making the first detected META element the *end 
station*. See bug 9417 also, below.

Bug 9424: Conformance checking of the syntax of <META 
http-equiv="content-language" content="*">: Incorrect syntax in the 
*last* META content-language should trigger an error in conformance 
checkers. But if the syntax error occurs in an earlier META 
content-language element, only a warning should be shown. Thus: if e.g. 
the second last META c-l element contains whitespace only (in order to 
have full control over legacy Mozilla browsers - bug 9422 above), this 
should only cause a warning. (Most users/authors would not care - but 
at any rate, it would be possible - by accepting to get a warning! - to 
solve the Mozilla issue for those that do care.)
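
To make the error/warning split in bug 9424 concrete, here is a minimal 
sketch in Python. It is only an illustration, not any checker's actual 
code. The names check, is_valid and TAG are made up for the example, and 
the tag pattern is a simplified assumption rather than the full language 
tag grammar. The sketch also assumes that a comma separated list is 
legal syntax (bug 9420) and that a whitespace-only value is a syntax 
error.

```python
import re

# Simplified language-tag shape. This is an assumption for the sketch,
# not the complete language tag grammar.
TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def is_valid(content):
    """True if content is one tag or a comma separated list of tags."""
    parts = [p.strip() for p in content.split(",")]
    return all(TAG.fullmatch(p) for p in parts)


def check(contents):
    """Report syntax problems for META content-language values.

    contents holds the content values in document order. A problem in
    the last element is an error, a problem in an earlier one is only
    a warning.
    """
    messages = []
    last = len(contents) - 1
    for index, content in enumerate(contents):
        if not is_valid(content):
            severity = "error" if index == last else "warning"
            messages.append((severity, index))
    return messages
```

With this, check(["  ", "en"]) reports only a warning for the 
whitespace-only element, while check(["en", "en+fr"]) reports an error 
because the bad value sits in the last element.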

Bug 9417: Make the algorithm for META content-language and lang="*" as 
equal as possible. (Actually: make them 100% equal.)
Currently, if the content of META c-l e.g. is the string "en,fr", then 
HTML5 requires user agents to ignore the META c-l element. (Whereas if 
it contains other errors, such as "en+fr" or "en*fr", then there is no 
such requirement to ignore it.) This is illogical. User agents should 
treat them the same way. *Especially* as long as the attitude is to 
align content-language with lang="*" (by requiring that only a single 
language tag is permitted).  E.g. if we have <element lang="en,fr">, 
then user agents are required to think that the language is a single 
language known as "en,fr" - and content-language should be treated the 
same way. (As long as we do not change META c-l to permit a list - bug 
9426 below.)
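
The inconsistency described in bug 9417 can be pictured with a small 
Python sketch. It follows only the description given above and is not 
the spec's algorithm text. The function names are hypothetical, and 
whitespace handling is left out on purpose.

```python
def draft_meta_language(content):
    """Current draft behaviour as described above.

    A comma makes the user agent ignore the element (None here).
    Other malformed values are passed through unchanged.
    """
    if "," in content:
        return None
    return content


def lang_attribute_language(value):
    """lang="*" behaviour as described above.

    The whole value is taken as one language, even if it is not a
    well-formed tag.
    """
    return value


def proposed_meta_language(content):
    """Bug 9417: treat META content-language exactly like lang."""
    return lang_attribute_language(content)
```

So draft_meta_language("en,fr") gives None while 
draft_meta_language("en+fr") gives "en+fr", whereas the proposed 
function returns the string unchanged in both cases.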

Finally, I have filed a proposal (not bug) - as bug 9426: Define an 
algorithm for how to extract a language/languages from a comma 
separated content-language list. 

When it comes to Mozilla, then if Bug 9426 is accepted, Mozilla would 
be guaranteed to not need to change (very much). But if bug 9426 is not 
accepted, and instead bug 9417 in its entirety is accepted, then 
Mozilla would have to stop treating "en,fr" as two different language 
tags, and instead treat it as a single, illegal, language tag (which is 
how other user agents already behave.)  

Maciej has already expressed willingness to change WebKit so that it 
behaves like Mozilla. Whereas the I18N wg has expressed a wish for a 
more specific solution where only the first language tag is 
significant. I don't know what the stakeholders in Opera, Internet 
Explorer, Chrome, Konqueror etc. think - if they would be willing to 
change to the Mozilla behaviour. Clearly the Mozilla behaviour is more 
fault tolerant. 
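
As a rough illustration of what an extraction algorithm for bug 9426 
could look like, here is a Python sketch of the two behaviours 
mentioned above: taking every tag in the list, and taking only the 
first one. Neither function is copied from any user agent or from the 
spec. The names, and the choice to trim whitespace and drop empty 
entries, are assumptions made for the example.

```python
def split_language_list(content):
    """Split a content-language value on commas.

    Whitespace around each entry is trimmed and empty entries are
    dropped (both are assumptions of this sketch).
    """
    return [tag for tag in (part.strip() for part in content.split(",")) if tag]


def first_language(content):
    """Variant where only the first language tag is significant.

    Returns the empty string when the list holds no tag at all.
    """
    tags = split_language_list(content)
    return tags[0] if tags else ""
```

For "en, fr" the first function yields the two tags "en" and "fr", 
and the second yields just "en". For the empty string the first gives 
an empty list and the second gives the empty string.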

I do not think bug 9426 is the most important to solve - it is far 
more important to solve the other bugs.

Comments are welcome.
leif halvard silli

Received on Monday, 5 April 2010 23:17:07 UTC