Re: ISSUE-88 - Change proposal (new update)

[I'm resending my message from 30 Apr 2010, with a properly formatted 
keyword, ISSUE-88 (earlier I forgot the hyphen), so that the proposal 
gets listed on this issue's tracker page: 
http://www.w3.org/html/wg/tracker/issues/88.  Also corrected a typo.]

Updated change proposal:

Let multiple language tags continue to be legal.
(http://www.w3.org/html/wg/wiki/ChangeProposals/ContentLanguages)
  
== Summary ==
* Multiple language tags (a comma-separated list) in @http-equiv 
  Content-Language continue to be legal.
* Conformance checkers will emit a warning whenever, and only when, 
  the fallback language algorithm kicks in.
* The fallback warning will be emitted regardless of whether the 
  fallback comes from HTTP or from the Content-Language pragma.

== Rationale ==
The problems with the current specification are:

1. It prevents authors from legally using multiple values to 
   replicate the language fallback effect of doing the same thing 
   in an HTTP header. 
  * HTML5 requires that multiple tags set no language, whether they 
    occur in HTTP or in @http-equiv. Setting no language is still an 
    effect. The spec is therefore incorrect in claiming about the 
    latter that “[for instance it only supports one language]”.
2. It prevents @http-equiv from being used as a reference to what 
   the HTTP Content-Language is/was meant to be. 
  * Consider Firefox’s Page Info panel, some CMSes, or simply authors 
    themselves.
3. It reinforces the confusion that may exist today about the 
   nature of @lang versus Content-Language, by requiring:
  * different syntax rules for features that are expected to be 
    identical (HTTP and @http-equiv)
  * similar syntax rules for features that are different (@http-equiv 
    and @lang)
  * a warning message which asks authors to “use @lang instead”, as if 
    the two were interchangeable alternatives.

Conformance checking and warnings are appropriate, but they should be 
about the right things.

1. The current warning about using @lang instead of Content-Language 
should be changed into a warning which informs authors that a fallback 
language measure has kicked in, and which recommends that they create a 
language declaration (via @lang) rather than relying on the fallback 
feature. This warning should be shown regardless of whether the fallback 
comes from @http-equiv or from the higher level (HTTP). Justification: 
since Content-Language is only a fallback feature, and one with 
different semantics, there is no guarantee that the author has used it 
for the language effect.

2. Treating the syntax rules of HTTP (which permit multiple language 
tags) as the conforming ones, rather than those of @lang (which forbid 
multiple languages), will underline that @lang and Content-Language 
have different purposes. For instance, since the fallback algorithm 
doesn’t kick in when multiple languages are used in the pragma or on 
the server, there would not be any warning in these cases.
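
The behaviour that item 2 relies on (a list of languages yields no 
fallback language, and hence no warning) can be sketched roughly as 
below. This is only an illustration, not spec text: the function name 
fallback_language is hypothetical, and the sketch assumes that any 
value containing a comma sets no language at all.

```python
from typing import Optional


def fallback_language(content_language: Optional[str]) -> Optional[str]:
    """Return the single fallback language, or None if no fallback applies.

    Hypothetical helper: the same rule is applied whether the value came
    from the HTTP header or from the @http-equiv pragma.
    """
    if content_language is None:
        return None
    # Assumption: a comma means a list of languages, and a list sets
    # no fallback language.
    if "," in content_language:
        return None
    tag = content_language.strip()
    # An empty or whitespace-only value sets no language either.
    return tag or None


print(fallback_language("en"))      # a single tag: fallback applies
print(fallback_language("en, nb"))  # a list: no fallback, so no warning
```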

== Details ==
Proposed spec changes, to section [4.2.5.3 Pragma directives]:

Replace the following text
  ]]  Conformance checkers will include a warning if this pragma is 
used. Authors are encouraged to use the @lang attribute instead.[HTTP]  
[[

with the following
  ]]  The semantics of this pragma, as well as of the HTTP 
Content-Language header, are different from the semantics of the @lang 
attribute. [HTTP] Thus, there is no guarantee that the author 
consciously used either of them for setting the language. Therefore, 
conformance checkers will include a warning, whenever HTML5’s fallback 
language algorithm is activated, whether it is the higher protocol or 
this pragma that kicks in. Authors are informed about which language 
the document falls back to, and are encouraged to not rely on the 
fallback feature but to instead explicitly use the @lang attribute on 
the root element.  [[
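
A conformance checker could decide on the proposed warning roughly as 
sketched below. The function name fallback_warning and the message 
wording are hypothetical. The sketch assumes that pragma_language and 
http_language have already been resolved to a single language tag (or 
None when that source sets no fallback, for example because it holds a 
list), and that the pragma is consulted before HTTP.

```python
from typing import Optional


def fallback_warning(root_lang: Optional[str],
                     pragma_language: Optional[str],
                     http_language: Optional[str]) -> Optional[str]:
    """Return a warning text if the fallback language is used, else None."""
    # An explicit lang attribute on the root element (even an empty one)
    # means the fallback algorithm is not activated.
    if root_lang is not None:
        return None
    # Assumption: the pragma takes precedence over the HTTP header.
    if pragma_language:
        source, language = "Content-Language pragma", pragma_language
    elif http_language:
        source, language = "HTTP Content-Language header", http_language
    else:
        return None  # nothing to fall back to, so nothing to warn about
    return (f"Warning: the document language falls back to '{language}' "
            f"(from the {source}). Consider setting the lang attribute "
            f"on the root element instead.")


print(fallback_warning(None, None, "en"))
```

Note that the warning names the language and its source, and that it 
is the same check for both sources.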

After the following text,
  ]]  the content attribute must have a value consisting of a valid BCP 
47 language tag  [[

then add the following:
  ]]  , or a comma-separated list of two or more BCP 47 language tags  
[[
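
As a rough illustration of the syntax a validator would then accept, 
consider the sketch below. The function name 
is_conforming_content_language is hypothetical, and the regular 
expression is only a loose well-formedness approximation (alphabetic 
primary subtag, alphanumeric subtags of 1 to 8 characters), not a full 
BCP 47 validity check. Optional whitespace around the commas is assumed 
to be allowed, as in HTTP lists.

```python
import re

# Loose approximation of a language tag; NOT a complete BCP 47 check.
_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def is_conforming_content_language(value: str) -> bool:
    """Accept one language tag, or a comma-separated list of tags."""
    tags = [part.strip() for part in value.split(",")]
    # Every list member must look like a tag; empty members are rejected.
    return all(_TAG.fullmatch(tag) is not None for tag in tags)


for sample in ("en", "en, nb", "en,,nb", "en_US"):
    print(sample, is_conforming_content_language(sample))
```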

Delete the following text:
  ]]  This pragma is not exactly equivalent to the HTTP 
Content-Language header, for instance it only supports one language.  [[


== Impact ==
=== Positive Effects ===
1. More stable: same syntax as before continues to be permitted. 
2. More permissive: authors, CMSes and browsers can continue to take 
advantage of @http-equiv’s ability to reference what the HTTP header 
is/was supposed to be, including replicating its fallback effect.
3. More correct: the difference between @lang and Content-Language is 
pointed out, while the link between @http-equiv and HTTP is emphasized.
4. More useful: a warning that a fallback feature has kicked in is 
more useful than a warning which focuses on one of the places where the 
fallback language could potentially come from. Why tell authors to 
“use @lang instead” if the author has already made sure that the @lang 
attribute is in place?

=== Negative Effects ===
none

=== Conformance Classes Changes ===
* For UAs: none, compared with the change that HTML5 already requires.
* For validators: They must accept a comma-separated list as 
conforming. They must detect when the fallback language algorithm is 
activated. 
* For the HTML5 spec: see the Details section above. 

=== Risks ===
In legacy UAs, there is a risk that multiple language tags cause them 
to report that the document is in a meaningless language. However, this 
risk is low, and authors can avoid it by using the @lang and xml:lang 
attributes. This change proposal ensures that authors will continue to 
be encouraged to use @lang, and not Content-Language, for setting the 
language.

== References ==
Section [14.12 Content-Language] of [RFC 2616]
HTML4’s general [HTTP-EQUIV explanation]
HTML4, section [8.1.2 Inheritance of language codes]

Received on Wednesday, 5 May 2010 18:22:43 UTC