- From: Ian Hickson <ian@hixie.ch>
- Date: Fri, 9 Apr 2010 00:00:39 +0000 (UTC)
- To: public-html@w3.org
ISSUE-88
========

SUMMARY

There is no problem and the proposed remedy is to change nothing.

RATIONALE

There is no problem.

Another change proposal suggests adding a note on the basis that we should clarify why the HTTP and pragma declarations are different to the lang="" attribute when it comes to values, and how they should be used, suggesting that this is a constant source of confusion.

However, the HTML5 specification already goes to some lengths to alleviate this confusion, for example by strongly discouraging the use of the pragma, encouraging lang="" use instead, and explicitly requiring that conformance checkers warn of this issue where relevant.

It isn't clear that the suggested note would actually do anything further to reduce the confusion. If anything, it might make matters worse, by offering an (incorrect) rationale for using the pragma.

The pragma doesn't give metadata about the document. The original intent of the <meta http-equiv> feature was to provide a way for _servers_ to include data in their HTTP headers on a per-file basis; this isn't document-wide metadata for user agents, it's for servers. This original intent also doesn't match reality; reality is that this pragma sets the default language for lang="", which also isn't document-wide metadata for user agents.

The same change proposal also suggests a second change, namely to change the syntax to allow multiple comma-separated language codes, even though providing multiple language codes like this would cause the entire pragma to be ignored.

User agents vary in their handling of the Content-Language pragma. Some user agents support a comma-separated list as meaning (contrary to the intent of the Content-Language HTTP header) that the root element and its descendants, in the absence of any lang="" attribute, are in multiple languages. This seems to contradict the model expected by the :lang selector and by the lang="" attribute, which assume that each element has a single language.
Other user agents treat the comma as part of the language tag, for example treating <meta http-equiv="Content-Language" content="en,fr"> as setting a pragma-set default language of "en,fr", which can be matched by a selector such as ":lang(en\,fr)", and specifically _not_ by ":lang(en)".

(The specification's UA conformance criteria propose a compromise model wherein user agents ignore pragmas that specify multiple languages, acknowledging that they are multiple languages, but not making any one language have a higher priority than the others and not requiring that the user agents support the multi-language model, which would require significant effort for what is just a legacy feature at this point.)

Because of the way some legacy UAs handle this pragma, and because conforming UAs drop pragmas with multiple languages, it would be ill-advised for us to make multiple values conforming.

The way to mark that a document _uses_ multiple languages in such a way that user agents can actually parse and find this information is to use the lang="" attribute in the document. Putting multiple values in the pragma would fail to handle this according to the proposal.

Another possible use case would be to have a standard way to say who the target audience of the document is, but in practice few people use that information on the Web, so it doesn't seem like having a pragma that exposes this information would be useful, even if we ignore that the user agents are currently required to ignore that information. Even if there were such a need, this feature would be a bad way to provide that information, since it is used in an incompatible way by user agents (they use this information to determine processing behaviour -- none of the languages are treated as a target audience language hint).
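
To make the three comma-handling behaviours discussed above concrete, here is a minimal Python sketch. It is illustrative only: the function names are hypothetical, and the third function is a simplified reading of the compromise model described in this message, not the specification's normative algorithm or any user agent's actual code.

```python
from typing import List, Optional


def multi_language_model(content: str) -> List[str]:
    """Some legacy UAs: commas separate several default languages."""
    return [tag.strip() for tag in content.split(",") if tag.strip()]


def literal_model(content: str) -> Optional[str]:
    """Other legacy UAs: the comma is simply part of a single language tag."""
    value = content.strip()
    return value or None


def compromise_model(content: str) -> Optional[str]:
    """Compromise model: a pragma that names several languages is ignored."""
    if "," in content:
        return None  # multiple languages: no pragma-set default language
    value = content.strip()
    return value or None


if __name__ == "__main__":
    # Compare how each model treats the content="en,fr" example above.
    for model in (multi_language_model, literal_model, compromise_model):
        print(model.__name__, repr(model("en,fr")))
```

Under this sketch the three models disagree on content="en,fr" (a list of two languages, the single tag "en,fr", and no default language at all), which is the interoperability problem that makes multiple values a poor candidate for conformance. Note that the compromise model also drops a harmless value such as "en,en-US".
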
For controlled environments, there are a multitude of options available to authors, such as the HTTP header of the same name, <meta name> with custom names, microdata, RDFa, out-of-band data, <script> blocks, etc. We don't need to use this mechanism for that purpose. Doing so would just confuse authors further.

No rationale is given for this second change, so it is hard to evaluate what the benefit of making this change would be.

DETAILS

Change nothing.

IMPACT

POSITIVE EFFECTS

 * Encourages authoring behaviour compatible with both legacy user agents and with conforming user agents.

 * Flags uses of the pragma in existing documents that are not being reliably processed in existing UAs.

NEGATIVE EFFECTS

 * Flags uses of the pragma in existing documents that are harmless, such as "en,en-US". However, evidence suggests that use of the comma is pretty rare anyway:
   http://lists.w3.org/Archives/Public/public-html/2010Apr/0088.html

CONFORMANCE CLASS CHANGES

None.

RISKS

Maybe allowing the pragma at all is not going far enough.

REFERENCES

Tests: http://www.hixie.ch/tests/adhoc/html/meta/content-language/

-- 
Ian Hickson               U+1047E                )\._.,--....,'``.    fL
http://ln.hixie.ch/       U+263A                /,   _.. \   _\  ;`._ ,.
Things that are impossible just take longer.   `._.-(,_..'--(,_..'`-.;.'
Received on Friday, 9 April 2010 00:01:14 UTC