- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Wed, 5 May 2010 20:22:05 +0200
- To: Maciej Stachowiak <mjs@apple.com>
- Cc: "Phillips, Addison" <addison@lab126.com>, "public-html@w3.org" <public-html@w3.org>, "public-i18n-core@w3.org" <public-i18n-core@w3.org>, www-international@w3.org
[I'm resending my message from 30 Apr 2010, with a properly formatted keyword, ISSUE-88 (earlier I forgot the hyphen), so that the proposal gets listed on this issue's tracker page: http://www.w3.org/html/wg/tracker/issues/88. Also corrected a typo.]

Updated change proposal: Let multiple language tags continue to be legal.
(http://www.w3.org/html/wg/wiki/ChangeProposals/ContentLanguages)

== Summary ==

* Multiple language tags (a comma separated list) in @http-equiv Content-Language continue to be legal.
* Conformance checkers will emit a warning whenever – and only if – the fallback language algorithm kicks in.
* The fallback warning will kick in regardless of whether the fallback comes from HTTP or from @http-equiv Content-Language.

== Rationale ==

The problems with the current specification are

1. That it prevents authors from legally using multiple values to replicate the language fallback effect of doing the same thing in an HTTP header.
   * That no language gets set, as HTML5 requires for multiple tags whether they occur in HTTP or in @http-equiv, is still an effect. The spec is therefore incorrect in claiming about the latter that “[for instance it only supports one language]”.
2. That it prevents @http-equiv from being used as a reference to what the HTTP Content-Language is/was meant to be.
   * Consider Firefox’ Page Info panel. Consider some CMSes. Consider simply authors themselves.
3. That it underlines the confusion that may exist today about the nature of @lang versus Content-Language, by requiring:
   * different syntax rules for features that are expected to be identical (HTTP and @http-equiv)
   * similar syntax rules for features that are different (@http-equiv and @lang)
   * a warning message which asks authors to “use @lang instead”, as if the two were interchangeable alternatives.

Conformance checking and warnings are in place, but should be about the correct things.

1. The current warning about using @lang instead of Content-Language should be changed into a warning which informs the author that a fallback language measure has kicked in, and which recommends that authors create a language declaration (via @lang) rather than relying on the fallback feature. This warning should be shown regardless of whether the fallback comes from @http-equiv or from the higher level (HTTP).
   Justification: Since Content-Language is a fallback feature, and has other semantics, there is no guarantee that the author has used it for the language effect.
2. To hold the syntax rules of HTTP (which permit multiple language tags) as the conforming ones, rather than those of @lang (which forbid multiple languages), will have the effect of underlining that @lang and Content-Language have different purposes. For instance, since the fallback algorithm doesn’t kick in whenever multiple languages are used in the pragma or on the server, there would not be any warning in these cases.

== Details ==

Proposed spec changes, to section [4.2.5.3 Pragma directives]:

Replace the following text
]]
Conformance checkers will include a warning if this pragma is used. Authors are encouraged to use the @lang attribute instead.[HTTP]
[[
with the following
]]
The semantics of this pragma, as well as of the HTTP Content-Language header, are different from the semantics of the @lang attribute. [HTTP] Thus, there is no guarantee that the author consciously used either of them for setting the language. Therefore, conformance checkers will include a warning whenever HTML5’s fallback language algorithm is activated, whether it is the higher protocol or this pragma that kicks in. Authors are informed about which language the document falls back to, and are encouraged not to rely on the fallback feature but instead to explicitly use the @lang attribute on the root element.
[[
After the following text,
]]
the content attribute must have a value consisting of a valid BCP 47 language tag
[[
then add the following:
]]
, or a comma separated list of two or more BCP 47 language tags
[[
Delete the following text:
]]
This pragma is not exactly equivalent to the HTTP Content-Language header, for instance it only supports one language.
[[

== Impact ==

=== Positive Effects ===

1. More stable: the same syntax as before continues to be permitted.
2. More permissive: authors, CMSes and browsers can continue to take advantage of @http-equiv’s ability to reference what the HTTP header is/was supposed to be, including replicating its fallback effect.
3. More correct: the difference between @lang and Content-Language is pointed out, while the link between @http-equiv and HTTP is emphasized.
4. More useful: a warning that a fallback feature has kicked in is more useful than a warning which focuses on one of the places where the fallback language could potentially kick in from. Why tell authors to “use @lang instead” if the author has already made sure that the @lang attribute is in place?

=== Negative Effects ===

None.

=== Conformance Classes Changes ===

* For UAs: none, compared with the change that HTML5 already requires.
* For validators: They must validate a comma separated list as conforming. They must check when the fallback language algorithm is activated.
* For the HTML5 spec: see the Details section above.

=== Risks ===

In legacy UAs, there is a risk that multiple language tags cause them to report that the document is in a meaningless language. However, this is a low risk, and authors can avoid it by using the @lang and xml:lang attributes. This change proposal ensures that authors will continue to be encouraged to use @lang, and not Content-Language, for setting the language.

== References ==

* Section [14.12 Content-Language] of [RFC 2616]
* HTML4’s general [HTTP-EQUIV explanation]
* HTML4, section [8.1.2 Inheritance of language codes]
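
== Illustration ==

To make the validator requirements above more concrete (accept a comma separated list as conforming; warn only when the fallback language algorithm is activated), here is a minimal Python sketch. It is not taken from any existing conformance checker. The function names, the wording of the warning and the deliberately loose tag pattern (a rough stand-in for real BCP 47 validation) are all assumptions made for the illustration. It also assumes that the pragma is consulted before the HTTP header, and that a source carrying more than one tag sets no language and is therefore skipped.

```python
import re
from typing import List, Optional

# Loose approximation of a language tag; real BCP 47 validation is stricter.
_TAG = re.compile(r"[A-Za-z]{2,8}(-[A-Za-z0-9]{1,8})*")


def parse_content_language(value: Optional[str]) -> List[str]:
    """Split a Content-Language value into tags; [] if absent or malformed."""
    if value is None:
        return []
    tags = [t.strip() for t in value.split(",")]
    if any(_TAG.fullmatch(t) is None for t in tags):
        return []
    return tags


def is_conforming_pragma(value: str) -> bool:
    """Under the proposal, one tag or a comma separated list both conform."""
    return len(parse_content_language(value)) >= 1


def fallback_warning(root_lang: Optional[str],
                     pragma: Optional[str],
                     http_header: Optional[str]) -> Optional[str]:
    """Return a warning only when a fallback language would be used."""
    if root_lang is not None:
        return None  # explicit lang on the root element: no fallback
    # Assumption: pragma first, then HTTP; multiple tags set no language.
    for source, value in (("pragma", pragma), ("HTTP header", http_header)):
        tags = parse_content_language(value)
        if len(tags) == 1:
            return (f"Language falls back to '{tags[0]}' from the "
                    f"Content-Language {source}; declare it with the lang "
                    f"attribute on the root element instead.")
    return None
```

With this sketch, `fallback_warning(None, "en, fr", None)` returns no warning, because a list of several tags sets no fallback language, while `fallback_warning(None, "fr", None)` reports the fallback to 'fr'. A document with @lang on the root element never triggers the warning, whatever the pragma or the HTTP header say.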
Received on Wednesday, 5 May 2010 18:22:46 UTC