- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Sat, 13 Mar 2010 06:01:52 +0100
- To: "Phillips, Addison" <addison@amazon.com>
- Cc: CE Whitehead <cewcathar@hotmail.com>, "www-international@w3.org" <www-international@w3.org>, "public-html@w3.org" <public-html@w3.org>, "ishida@w3.org" <ishida@w3.org>, "ian@hixie.ch" <ian@hixie.ch>
Leif Halvard Silli, Fri, 12 Mar 2010 22:43:47 +0100, replying Addison:

[...]
>> Ideally *all* documents will populate the root element
>> with an appropriate @lang attribute (although, please note, that
>> there exist cases in which an empty attribute *is* the appropriate
>> value).
>
> There is a difference between an empty attribute and no attribute. But
> only in XML: [1]
[...]
> But not in HTML.
[...]
> Then at least Firefox and Safari will treat the element as if nothing
> has been declared, and thus apply the language inside <meta> C-L as a
> fallback solution. I have Live DOM Viewer test that you can test for
> this. [2] Whereas Internet Explorer 8, will use the XML behaviour.

I'm very glad you mentioned the issue of empty lang="" ... It is a very significant and bug-related use case for this issue!

Let's look at Firefox: The only way to make sure that Firefox interprets an empty @lang in the root element in a manner similar to how an empty lang is interpreted in XML is to *also* provide a single, *white-space filled* <meta> content-language element: The presence of that <meta> content-language element causes the user agent to not listen to the content-language header coming from the server.

A white-space filled <meta> content-language element validates as XHTML in the W3C Validator, but currently does not validate as HTML5.

Also note that an empty lang="" *is* interpreted in the XML way even in Firefox, *provided* that there aren't any content-language headers (whether from the server or in the document) with one (or more) actual language tags inside. If you manage to remove the content-language header, or if you manage to silence it the way I described above (with a white-space filled <meta> content-language element), then an empty lang="" attribute *will* have the effect that the element doesn't inherit the language from its parent element.
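
The Firefox behaviour described above can be sketched as a small model. This is only an illustration of what the message reports, not browser code: the function names are hypothetical, only the root element is modelled, and since the message does not say which tag Firefox picks when several are present, the sketch simply assumes the first one.

```python
from typing import List, Optional


def parse_language_tags(value: Optional[str]) -> List[str]:
    """Split a Content-Language value into its non-empty language tags."""
    if value is None:
        return []
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def root_language(lang_attr: Optional[str],
                  meta_content: Optional[str],
                  http_header: Optional[str]) -> str:
    """Return the language the root element ends up with; "" means unknown.

    lang_attr    -- value of the root element's lang attribute (None if absent)
    meta_content -- content of a <meta> content-language element (None if absent)
    http_header  -- Content-Language header sent by the server (None if absent)
    """
    # A non-empty lang attribute always wins.
    if lang_attr:
        return lang_attr
    # As reported in the message: a present <meta> element, even a
    # white-space filled one, makes the user agent ignore the HTTP header.
    source = meta_content if meta_content is not None else http_header
    tags = parse_language_tags(source)
    # Assumption: the first tag is used when there are several.
    return tags[0] if tags else ""


# Empty lang plus a white-space filled <meta>: the header is silenced.
print(repr(root_language("", " ", "en, fr")))   # ''
# Empty lang without a <meta>: the header leaks through.
print(repr(root_language("", None, "en, fr")))  # 'en'
```

In this model an absent lang attribute and an empty one behave the same, which is the HTML (non-XML) treatment reported for Firefox.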
(But if you do not silence the content-language header like this, then the element will associate itself with *one* of the languages of the content-language header.)

So, I would like to extend my not yet written change proposal to say that a white-space filled <meta> content-language element should be valid, like it is in XHTML. For Firefox, a single such element is enough: it will silence the effect of the content-language header.

(This was totally new to me ... On the surface, Safari has the same issue, but a more thorough look shows that Safari fails to respect the semantics of an empty lang="" regardless of whether there is a relevant content-language header or not.)

(And clearly we should file bugs for WebKit and Mozilla w.r.t. how they behave w.r.t. empty lang="".)

Finally: I think *both* Ian and the I18N group make the mistake of treating the <meta> element differently from the HTTP header.

W.r.t. the I18N group: Take the suggestion that the order of the language tags inside the <meta> content-language element should be significant. So what if there is no <meta> content-language element, but instead there are several content-languages specified on the server? Do you want to regulate the order of the language tags coming from the server as well? (The tests that Richard has made show that Firefox and IE listen to both, which is in line with HTML4.) I think that *only* if you really and truly edit the HTTP specification to say that the order matters can we implement something like that for the <meta> content-language element.

W.r.t. Ian: What I explained about the Firefox behaviour above shows that restricting the <meta> content-language element to only one language tag fails to solve all the interoperability problems that this element has.

And also: Currently Validator.nu asks authors to remove the <meta> content-language element and use @lang instead (even when it contains only one language tag).
However, this advice only fools authors into thinking that they can get rid of the effect of the content-language header by removing the <meta> content-language element. Which they can't! Unless, in the same go, they also remove the content-language headers that the server sends out.
-- 
leif halvard silli
Received on Saturday, 13 March 2010 05:02:27 UTC