- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Fri, 9 Apr 2010 10:23:48 +0200
- To: HTMLwg <public-html@w3.org>
(I am still waiting for the chairs’ acknowledgement.)
ISSUE 88
========
HTML5 Change Proposal for Content-Language
http://www.w3.org/html/wg/wiki/ChangeProposals/lang_versus_contentLanguage
Date: 9th of April.
Summary
-------
* Only the last occurring meta content-language counts w.r.t.
authoring conformance.
* The value of the content attribute of the last occurring meta
content-language element must be the empty string.
* The value of the content attribute in possible preceding meta
content-language elements should conform to RFC 2616 – and validators
may validate the possible preceding elements for RFC 2616 conformance.
However, only the value of the last occurring meta content-language
element has any bearing on the document’s HTML5 validity.
* Ian’s language determination algorithm is changed on one point: If
the last occurring meta content-language declaration is empty, then it
must be interpreted by user agents as having the same semantics as an
empty lang or xml:lang attribute – meaning that they must not ask if
the HTTP header has any other language information to provide. (Thus,
only when the last occurring meta declaration contains multiple
language tags, would conforming user agents be required to pay
attention to whether the HTTP header contains a language tag or not.)
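The user agent rule above can be sketched as follows. This is only a minimal illustration of how I read the proposal, not spec text: the function name, the argument names and the UNKNOWN constant are made up for the example, and the handling of a single language tag and of the HTTP header value is an assumption based on the current draft's fallback behaviour. It covers only the case where no lang or xml:lang attribute applies.

```python
from typing import List, Optional

# Hypothetical marker: same meaning as an empty lang attribute (language unknown).
UNKNOWN = ""


def fallback_language(meta_contents: List[str], http_header: Optional[str]) -> str:
    """Return the fallback language when no lang/xml:lang attribute applies.

    meta_contents: content values of all meta content-language elements,
                   in document order (only the last one is consulted).
    http_header:   value of the HTTP Content-Language header, or None.
    """
    if meta_contents:
        last = meta_contents[-1].strip()
        if last == "":
            # Empty last declaration: treated like an empty lang attribute,
            # so the HTTP header must not be consulted.
            return UNKNOWN
        if "," not in last:
            # Assumption: a single language tag sets the default language.
            return last
        # Multiple language tags: fall through to the HTTP header.
    if http_header is not None and http_header.strip():
        # Assumption: the header value is used as given.
        return http_header.strip()
    return UNKNOWN


# An empty last declaration cancels the HTTP header.
print(repr(fallback_language(["en, fr", ""], "de")))
# Multiple tags in the last declaration: the HTTP header is consulted.
print(repr(fallback_language(["en, fr"], "de")))
```

The first call models the authoring pattern this proposal requires, the second models the only case in which a conforming user agent would still look at the HTTP header.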
Rationale
---------
* The last occurring meta content-language element always wins in
current user agents – let’s spec this.
* At the same time, as Ian explains in his change proposal variants,
interpretation of content-language differs across browsers.
* The safest value is the empty string, as this value doesn’t
interfere with how user agents interpret lang and xml:lang. Most
user agents already interpret this value in accordance with this change
proposal. (Only Gecko treats it in accordance with Ian’s zero change
proposal.) Therefore, only the empty string should be considered
conforming (in the last occurring meta declaration). Through this,
authors see for themselves that they must apply the lang attribute
whenever they want to declare the language of the document.
* By not counting the value of possible preceding meta
content-language elements when HTML5 conformance is evaluated, we
satisfy two communities: the I18N community (who want to be able to use
multiple values) and authors wanting to create HTML5 documents that
work in Mozilla browsers (they want to be able to cancel the effect of
HTTP headers in Gecko).
* By treating the empty string in the content attribute as equal to
an empty lang attribute, we simplify the algorithm for user agents –
this is already how all – except Gecko – work. At the same time, we also
keep things more predictable for authors.
Details
-------
1. The authoring requirements for meta content-language must change,
as described above.
2. The language determination algorithm must change as described
above.
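As an illustration of the authoring requirement in point 1, here is a rough sketch of what a validator could do. The function name and the return shape are hypothetical, the RFC 2616 check is a simplified reading of its language-tag grammar (1 to 8 letters per subtag, comma separated list), and treating a document without any meta content-language element as conforming is my assumption.

```python
import re
from typing import List, Tuple

# Simplified RFC 2616 language-tag: 1*8ALPHA *( "-" 1*8ALPHA )
_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z]{1,8})*")


def check_meta_content_language(contents: List[str]) -> Tuple[bool, List[str]]:
    """Check the content values of the meta content-language elements.

    contents: content attribute values in document order.
    Returns (conforming, warnings). Only the last value decides conformance;
    preceding values can at most produce warnings.
    """
    warnings: List[str] = []
    if not contents:
        # Assumption: nothing to check when no such element is present.
        return True, warnings
    for index, value in enumerate(contents[:-1], start=1):
        tags = [tag.strip() for tag in value.split(",")]
        if not all(_TAG.fullmatch(tag) for tag in tags):
            warnings.append(
                f"preceding meta #{index}: not an RFC 2616 language tag list: {value!r}"
            )
    # The last occurring declaration must be the empty string.
    conforming = contents[-1] == ""
    return conforming, warnings


print(check_meta_content_language(["en, fr", ""]))
print(check_meta_content_language(["en"]))
```

Note that a malformed preceding value only adds a warning, while a non-empty last value makes the document non-conforming regardless of what precedes it.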
Impact
------
* Predictability: Authors have experience with how things work
today, and this proposal is the best match with current reality. The
empty string is the meta content-language value with the best cross
browser compatibility.
* We allow those in the know to follow RFC 2616 and/or fix the issues
with Gecko by reserving preceding meta content-language elements for
this.
* We send a strong signal – a requirement to eventually use an empty
meta content-language element! – about the need to use lang for setting
the language of the document.
* We allow authors to make use of HTML5’s semantics of the empty
lang attribute in many current browsers, and put weight on
authors and vendors to implement this new semantic feature of lang.
Risks
-----
* None.
References
----------
How meta content-language affects different browsers.
IE8 edge mode
-------------
1. IE8 in edge mode understands the CSS :lang(*) selector.
2. It interprets both the meta declaration and the HTTP header.
3. It doesn’t let the interpretation of an empty lang be affected by
the content-language meta declaration and/or the HTTP header.
Gecko
-----
1. Gecko does respect the semantics of the empty lang. Thus, in a
page where all the language information arrives only from lang or
xml:lang (that is: no meta content-language which Gecko is able to read
is present), the CSS rule
div[lang=""]:lang(en){background:red}
does not apply, which is the correct behavior. [1]
2. But Gecko (Firefox version 2 and onwards) is immediately affected
if a meta content-language declaration with a language tag is inserted.
[2]
3. At the same time, Gecko doesn’t treat an empty meta
content-language declaration the same way that it treats an empty lang.
In this case, instead of accepting that the language is unknown (like
IE8, KHTML, Webkit, Chrome and Opera), it looks at the preceding meta
content-language element, if any. [3]
4. When there is no preceding meta, it instead looks at the HTTP
header, if any. [4]
5. These issues can be corrected by inserting a cancelling code in
the preceding (the second last occurring) meta content-language
declaration. [5]
6. With these authoring guides, one can also use multiple values,
without any negative effect. [6]
KHTML, Webkit, Chrome
---------------------
1. These browsers do not look at the HTTP header. They also treat
the empty meta content-language like they treat an empty lang. But
these browsers have a bug in that they do not respect the semantics of
the empty lang. [7]
2. They treat the meta content-language element the same way. (And
then the Mozilla bug also kicks in.) [8]
3. Thus, from these browsers’ point of view, the requirement that the
last occurring meta content-language must be empty, is often
irrelevant, as long as the author has used a non-empty lang on the root
element.
4. But when authors do not use a non-empty lang on the root element,
then the requirement that the last occurring meta content-language
element must be empty, can still be useful when creating cross browser
solutions which try to be compatible with KHTML, Webkit and Chrome as
well.
Opera
-----
* Opera also has issues with how it reacts to the meta
content-language values. Thus this change proposal is also useful for
current versions of Opera.
Other browsers
--------------
* I have so far not been able to test other browsers with CSS
*:lang(*) support.
[1]
http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit
[2]
http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl
[3]
http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl-empty
[4]
http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl-empty-http
[5]
http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl-empty-http-cancel
[6]
http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl-empty-http-cancel-multiple
[7]
http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/kwc-lang
[8]
http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/kwc-cl
--
leif halvard silli
Received on Friday, 9 April 2010 08:24:25 UTC