- From: Ian Hickson <ian@hixie.ch>
- Date: Sun, 4 Apr 2010 01:01:53 +0000 (UTC)
- To: public-html@w3.org

ISSUE-88
========

SUMMARY

There is no problem and the proposed remedy is to change nothing.

RATIONALE

There is no problem.

Another change proposal suggests adding a note on the basis that we should clarify why the HTTP and pragma declarations are different when it comes to values, and how they should be used, suggesting that this is a constant source of confusion. However, no evidence has been provided to suggest that this really is a source of confusion.

Furthermore, the suggested note is wrong in practice. The pragma doesn't give metadata about the document. The original intent of the <meta http-equiv> feature was to provide a way for _servers_ to include data in their HTTP headers on a per-file basis; this isn't document-wide metadata for user agents, it's for servers. This original intent also doesn't match reality; reality is that this pragma sets the default language for lang="", which also isn't document-wide metadata for user agents.

Finally, the proffered note does not actually match the associated rationale: it doesn't explain why the HTTP and pragma declaration syntaxes are different; instead it talks about a "language" attribute.

If there is a "constant source of confusion", then what we need is pointers to this confusion, so that text intended specifically to address that confusion is included in the spec. It is quite possible that we could add lots of explanatory text and explain the situation in detail, but to do so we need to know what the confusion is about. As far as I am aware, no bug pointing to confusion on this subject and asking for clarification has been rejected, which makes using the change proposal process inappropriate.

The same change proposal also suggests a second change, namely to change the syntax to allow multiple comma-separated language codes, even though all but the first would be ignored.

User agents vary in their handling of the Content-Language pragma.
Some user agents support a comma-separated list as meaning (contrary to the intent of the Content-Language HTTP header) that the root element and its descendants, in the absence of any lang="" attribute, are in multiple languages. This seems to contradict the model expected by the :lang selector and by the lang="" attribute, which assume that each element has a single language. Other user agents treat the comma as part of the language tag, for example treating <meta http-equiv="Content-Language" content="en,fr"> as setting a pragma-set default language of "en,fr", which can be matched by a selector such as ":lang(en\,fr)", and specifically _not_ by ":lang(en)". (The specification's UA conformance criteria propose a compromise model wherein the user agents aren't required to support multiple languages per element, but still interpret the comma correctly, rather than treating it as part of the language code.)

Because of the way some legacy UAs handle this pragma, and because the behaviour of conforming UAs drops all but the first language, it would be ill advised for us to make multiple values conforming.

The way to mark that a document _uses_ multiple languages in such a way that user agents can actually parse and find this information is to use the lang="" attribute in the document. Putting multiple values in the pragma would fail to handle this according to the proposal.

Another possible use case would be to have a standard way to say who the target audience of the document is, but in practice few people use that information on the Web, so it doesn't seem like having a pragma that exposes this information would be useful, even if we ignore that the user agents are currently required to ignore that information.
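
The comma handling described above can be sketched roughly as follows. This is an illustrative Python sketch only, not an excerpt from any user agent or from the specification: the function names are hypothetical, the "first tag wins" rule is taken from the description of conforming UAs in this message, and :lang matching is simplified to an exact or hyphen-prefix comparison.

```python
from typing import List


def pragma_language_first(content: str) -> str:
    """Compromise model (assumed from the text): the comma is a separator,
    and only the first language tag is kept."""
    return content.split(",")[0].strip()


def pragma_language_literal(content: str) -> str:
    """Legacy model: the whole value, comma included, is one language tag."""
    return content.strip()


def pragma_languages_all(content: str) -> List[str]:
    """Multi-language model: every comma-separated tag applies at once."""
    return [tag.strip() for tag in content.split(",") if tag.strip()]


def lang_matches(element_lang: str, selector_arg: str) -> bool:
    """Simplified :lang() check: equal, or selector_arg followed by '-'."""
    lang = element_lang.lower()
    arg = selector_arg.lower()
    return lang == arg or lang.startswith(arg + "-")


content = "en,fr"  # value of <meta http-equiv="Content-Language" content="en,fr">

# Compromise model: the default language is "en", so :lang(en) matches.
print(lang_matches(pragma_language_first(content), "en"))

# Legacy model: the default language is "en,fr", so :lang(en) does not
# match, but :lang(en\,fr) does.
print(lang_matches(pragma_language_literal(content), "en"))
print(lang_matches(pragma_language_literal(content), "en,fr"))
```

With a single value such as "en" all three models agree, which is why restricting the pragma to one language tag is the authoring behaviour that works in every case.
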
Even if there were a need to declare a target audience, this feature would be a bad way to provide that information, since it is used in an incompatible way by user agents (they use this information to determine processing behaviour -- none of the languages are treated as a target audience language hint).

For controlled environments, there are a multitude of options available to authors, such as the HTTP header of the same name, <meta name> with custom names, microdata, RDFa, out-of-band data, <script> blocks, etc. We don't need to use this mechanism for that purpose. Doing so would just confuse authors further.

No rationale is given for this second change, so it is hard to evaluate what the benefit of making this change would be.

Finally, it should be noted that the aforementioned other change proposal is self-contradictory. Making the second change (thus making the syntax of the pragma the same as its HTTP namesake) would make the rationale for the first change (that we should explain the differences between the syntax of the pragma and the HTTP header) incorrect.

DETAILS

Change nothing.

IMPACT

POSITIVE EFFECTS

 * Encourages authoring behaviour compatible with both legacy user agents and with conforming user agents.

 * Flags uses of the pragma in existing documents that are not being reliably processed in existing UAs.

NEGATIVE EFFECTS

 * Flags uses of the pragma in existing documents that are harmless, such as "en,en-US". However, evidence suggests that use of the comma is pretty rare anyway:
   http://lists.w3.org/Archives/Public/public-html/2010Apr/0088.html

CONFORMANCE CLASS CHANGES

None.

RISKS

It's possible that there is confusion. However, it is easy to handle this at a future date when clear evidence of such confusion is found.

REFERENCES

Tests: http://www.hixie.ch/tests/adhoc/html/meta/content-language/

--
Ian Hickson
http://ln.hixie.ch/
Things that are impossible just take longer.

Received on Sunday, 4 April 2010 01:02:22 UTC