- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Fri, 9 Apr 2010 10:23:48 +0200
- To: HTMLwg <public-html@w3.org>
(I am still waiting for the chairs’ acknowledgement.)

ISSUE 88
========

HTML5 Change Proposal for Content-Language
http://www.w3.org/html/wg/wiki/ChangeProposals/lang_versus_contentLanguage

Date: 9th of April.

Summary
-------

* Only the last occurring meta content-language counts w.r.t.
  authoring conformance.
* The value of the content attribute of the last occurring meta
  content-language element must be the empty string.
* The value of the content attribute in possible preceding meta
  content-language elements should conform to RFC 2616 – and
  validators may validate the possible preceding elements for
  RFC 2616 conformance. However, only the value of the last occurring
  meta content-language element has any bearing on the document’s
  HTML5 validity.
* Ian’s language determination algorithm is changed in one point: If
  the last occurring meta content-language declaration is empty, then
  it must be interpreted by user agents as having the same semantics
  as an empty lang or xml:lang attribute – meaning that they must not
  ask if the HTTP header has any other language information to
  provide. (Thus, only when the last occurring meta declaration
  contains multiple language tags would conforming user agents be
  required to pay attention to whether the HTTP header contains a
  language tag or not.)

Rationale
---------

* The last occurring meta content-language element always wins in
  current user agents – let’s spec this.
* At the same time, as Ian explains in his change proposal variants,
  interpretation of content-language differs across browsers.
* The safest value is the empty string, as this value doesn’t
  interfere with how user agents interpret lang and xml:lang. Most
  user agents already interpret this value in accordance with this
  change proposal. (Only Gecko treats it in accordance with Ian’s zero
  change proposal.) Therefore, only the empty string should be
  considered conforming (in the last occurring meta declaration).
  Through this, authors see for themselves that they must apply the
  lang attribute whenever they want to declare the language of the
  document.
* By not counting the value of possible preceding meta
  content-language elements when HTML5 conformance is evaluated, we
  satisfy two communities: the I18N community (who want to be able to
  use multiple values) and authors wanting to create HTML5 documents
  that work in Mozilla browsers (they want to be able to cancel the
  effect of HTTP headers in Gecko).
* By treating the empty string in the content attribute as equal to an
  empty lang attribute, we simplify the algorithm for user agents –
  this is already how all of them, except Gecko, work. At the same
  time, we keep things more predictable for authors.

Details
-------

1. The authoring requirements for meta content-language must change,
   as described above.
2. The language determination algorithm must change as described
   above.

Impact
------

* Predictability: Authors have experience with how things work today,
  and this proposal is the best match with current reality. The empty
  string is the meta content-language value with the best cross
  browser compatibility.
* We allow those in the know to follow RFC 2616 and/or fix the issues
  with Gecko by reserving preceding meta content-language elements for
  this.
* We send a strong signal – a requirement to eventually use an empty
  meta content-language element! – about the need to use lang for
  setting the language of the document.
* We allow authors to make use of HTML5’s semantics of the empty lang
  attribute in many current browsers, and put weight on authors and
  vendors to implement this new semantic feature of lang.

Risks
-----

* None.

References
----------

How meta content-language affects different browsers.

IE8 edge mode
-------------

1. IE8 in edge mode understands the CSS :lang(*) selector.
2. It interprets both the meta declaration and the HTTP header.
3. It doesn’t let the interpretation of an empty lang be affected by
   the content-language meta declaration and/or the HTTP header.

Gecko
-----

1. Gecko does respect the semantics of the empty lang. Thus, in a page
   where all the language information arrives only from lang or
   xml:lang (that is: no meta content-language which Gecko is able to
   read is present), the CSS rule
   div[lang=""]:lang(en){background:red} does not match, which is the
   correct behavior. [1]
2. But Gecko (Firefox version 2 and onwards) is immediately affected
   if a meta content-language declaration with a language tag is
   inserted. [2]
3. At the same time, Gecko doesn’t treat an empty meta
   content-language declaration the same way that it treats an empty
   lang. In this case, instead of accepting that the language is
   unknown (like IE8, KHTML, Webkit, Chrome and Opera), it looks at
   the preceding meta, if any. [3]
4. Or, when there is no meta, it looks at the HTTP header, if any. [4]
5. These issues can be corrected by inserting a cancelling code in the
   preceding (the second last occurring) meta content-language
   declaration. [5]
6. With these authoring guides, one can also use multiple values,
   without any negative effect. [6]

KHTML, Webkit, Chrome
---------------------

1. These browsers do not look at the HTTP header. They also treat the
   empty meta content-language like they treat an empty lang. But
   these browsers have a bug in that they do not respect the semantics
   of the empty lang. [7]
2. They treat the meta content-language element the same way. (And
   then the Mozilla bug also kicks in.) [8]
3. Thus, from these browsers’ point of view, the requirement that the
   last occurring meta content-language must be empty is often
   irrelevant, as long as the author has used a non-empty lang on the
   root element.
4. But when authors do not use a non-empty lang on the root element,
   then the requirement that the last occurring meta content-language
   element must be empty can still be useful when creating cross
   browser solutions which try to be compatible with KHTML, Webkit and
   Chrome as well.

Opera
-----

* Opera also has issues with how it reacts to the meta
  content-language values. Thus this change proposal is also useful
  for current versions of Opera.

Other browsers
--------------

* I have so far not been able to test other browsers with CSS
  *:lang(*) support.

[1] http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit
[2] http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl
[3] http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl-empty
[4] http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl-empty-http
[5] http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl-empty-http-cancel
[6] http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/lang-inherit-cl-empty-http-cancel-multiple
[7] http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/kwc-lang
[8] http://malform.no/testing/html5/attr-lang/mozilla-lang-lottery/kwc-cl

-- 
leif halvard silli
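
Illustration (not part of the proposal text): the two changes listed under Summary can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not spec text. The function names, the representation of the meta content-language declarations as a list of content values in document order, and the whitespace trimming are all hypothetical. It also assumes that a document without any meta content-language passes the authoring check, and that a single language tag in the last declaration is used as the fallback; handling of a multi-valued HTTP header is left out.

```python
from typing import Optional, Sequence

UNKNOWN = ""  # same meaning as an empty lang attribute: language unknown


def last_meta_is_conforming(meta_contents: Sequence[str]) -> bool:
    """Authoring check: only the last meta content-language counts,
    and its content attribute must be the empty string."""
    if not meta_contents:
        # Assumption: no meta content-language at all is fine.
        return True
    return meta_contents[-1] == ""


def fallback_language(meta_contents: Sequence[str],
                      http_header: Optional[str]) -> str:
    """Fallback language for an element to which no lang or xml:lang
    applies. Only the last occurring declaration is looked at."""
    if meta_contents:
        last = meta_contents[-1].strip()  # trimming is an assumption
        if last == "":
            # Empty declaration: treat like an empty lang attribute
            # and do not consult the HTTP header.
            return UNKNOWN
        if "," not in last:
            # Assumption: a single tag sets the fallback language.
            return last
    # No declaration, or multiple tags in the last one: only now is
    # the HTTP header consulted (returned as-is, not parsed further).
    return http_header.strip() if http_header else UNKNOWN
```

For example, fallback_language([""], "en") gives the unknown language even though an HTTP header is present, while fallback_language(["en, fr"], "nb") falls through to the header value.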
Received on Friday, 9 April 2010 08:24:25 UTC