- From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
- Date: Sat, 20 Mar 2010 06:11:28 +0100
- To: CE Whitehead <cewcathar@hotmail.com>
- Cc: xn--mlform-iua@xn--mlform-iua.no, ian@hixie.ch, addison@amazon.com, www-international@w3.org, public-html@w3.org, ishida@w3.org
CE Whitehead, Fri, 19 Mar 2010 14:17:59 -0400:
> Hi, Leif, all.
>
> I agree with Leif that, for handling of multiple meta elements,
> the w3c can retroactively align its standards to practice,
> by making the last meta element the one that is valid,
> in the case that multiple are specified
> (I believe that this is more in line with the w3c's standards for
> style code anyway; the last style declaration that applies to an element is the
> one that is processed; all others are ignored, right?).

CSS style rules are a good parallel. But so is language inheritance itself: it starts with the inner element and moves to the outer element. And from the perspective of, let's say, a <p> element, an HTTP-EQUIV <meta> content-language element comes after the <html> element.

I think that Ian, even though he claims not to treat the <meta> content-language element as one which imitates the HTTP header anymore, still treats it as such when he insists that user agents should look at the *first* <meta> content-language element.

> * * * Below is some disagreement sorry for it maybe I am wrong * * *
>
> My two cents on having two meta-elements: if one of these is
> omitted, which processes will the remaining element be used for?
> (This has to be specified.)

I will see if I can make it clearer. However, if there is only one <meta> content-language element, then this element is both the first and the last, at once. ;-) Thus user agents will use it for setting the language. But web servers will also use the same element. If there are two, then web servers should use the first, while user agents use the last.

> Also, we still have the html lang= and xml lang= elements/attributes
> in any case!

Yes.

> So you suggest the html and xml lang attributes plus two meta
> elements plus an http header?

Always use (xml:)lang="". But whether, and how, you should use the <meta> content-language element depends. First: decide if you need to use the HTTP content-language header at all.
If we assume that you should, then the question is what means to use. The first reason not to use the <meta> c-l element for this purpose is that most of us do not have access to web servers/CMS-es that actually make any use of the content-language element. Hence there is no technical benefit in it for most of us. So why use it? It is better to use a method that actually works. Apache can easily be configured to send out the content-language header, and Apache doesn't make use of <meta> c-l for this functionality.

So, in the usual scenario, <meta> content-language is not necessary. In these cases users should either not use it at all, or they should use a *single* white-space filled <meta> c-l element for the purpose of cancelling the unwanted language fallback effect in Mozilla and Webkit/Konqueror/Chrome. The latter option (a single white-space filled <meta> c-l element) is in my view the best use in most situations. (Of course, cancelling the language fallback effect is only meaningful if you also use @lang.)

If authors decide to use it as originally intended (with one or several language tags inside), and if they also want to use only a single element, then they are better off if they can find a reason to validly fill it with more than one language tag, because then the language fallback effect already cancels itself in all browsers except Mozilla. I will not, in these cases, *require* authors to also use a second white-space filled <meta> c-l element. But authors should be aware that this is the only way to cancel the effect in Mozilla browsers as well.

> Also, as the meta is only a fallback for when those are not specified
> I am not sure we need two anyway

We cannot use "we" about this. What one needs to do depends on the level of control that one needs to have. If your web server sends out a content-language header (which is not unlikely), then both Mozilla browsers and IE8 will use that header as fallback language.
Of these two, only Mozilla has the language inheritance problem. And if that problem is important to you to solve, then the only way to get rid of it is to use a last (which could also be a single) <meta> content-language element with white-space inside.

> (I need convincing; to me this is a case where aligning w3c
> specifications to the current practice --
> using the first language specified by http or meta content-language
> to populate the lang= attribute in the html or xml tag, as has been
> discussed previously -- makes sense).

My main rationale is this: Given how messy this whole issue now is, it seems very complicated to get user agents to actually move in the direction of making one of the languages the default one. I simply think it is too much to ask for. It seems better to focus on getting the processing language right first (by ensuring that it is possible for authors to legally cancel the effect of the problematic language fallback story of the <meta> content-language element), rather than complicating the issue with requests about making a <meta> content-language element containing several languages do things that it was not meant to do.

> Finally, will someone who ignores the html or xml lang =
> successfully use the two meta elements? Keeping data in the right
> order?

Firstly: There are no common user agents that ignore the lang= attribute, I think. But there are a few (Mozilla, Webkit/Konqueror/Chrome) which fail to treat an *empty* lang (<html lang="">) according to how HTML5 wants it to be.

Secondly: Since I discuss a problem which is related to the situation *when* the author uses the lang="" attribute, your question is not really related to the issue at hand.
But I will answer you anyhow:

1) If the document doesn't contain a single lang attribute, but still contains two <meta> content-language elements, where the last one contains white-space, then user agents would not receive language information from anywhere - they would not have any clue about the language. The same "problem" would also arise if <meta> content-language contains more than one language. (However, since <meta> c-l is not meant to define the processing language, this can't really be seen as a problem.)

2) But, if the last <meta> content-language element of this hypothetical document *does* contain a single language code, then all browsers that actually make use of the <meta> content-language element would pick it up and use it as the language of the document. (I have tested IE8, Firefox, Webkit/Konqueror/Chrome. None of the Opera versions I tested made any use of <meta> content-language.)

> My personal opinion is that they (he, she, whoever) can just as well
> learn to use the html / xml lang attributes as they (he, she,
> whoever) can learn to insert an additional meta element.

There is no "just as well". The fact of the matter is that we have four web browsers that fail to respect the semantics of an empty lang="" attribute as soon as a <meta> c-l element comes into the picture. In particular Mozilla. If you say <p lang="">, then Mozilla will respect this and treat it as an element for which the language is unknown. *Except* when there is a Content-Language coming from the last <meta> c-l element *or* (when there is no <meta> element) coming from the server. Thus, as you can see, if we want to solve the Mozilla problem, then we must make sure that the last (or the single) <meta> c-l element is white-space filled. (The <meta> is thus both the problem and the solution ...)

And again: one or two <meta> c-l elements? This depends. See above.
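To make cases 1) and 2) concrete, here is a minimal Python sketch of the behaviour argued for in this mail. It is an illustration only, not any browser's actual code: the function names `meta_fallback_language` and `element_language` are hypothetical, the HTTP header fallback is left out, and the handling of a multi-tag value follows the proposal rather than what Mozilla currently does.

```python
from typing import Optional, Sequence


def meta_fallback_language(meta_contents: Sequence[str]) -> Optional[str]:
    """Fallback language from the content values of all
    <meta http-equiv="Content-Language"> elements, in document order.

    Assumption (the proposal in this mail): the *last* element counts.
    """
    if not meta_contents:
        return None
    # Split the last value on commas and drop empty/white-space entries.
    tags = [t.strip() for t in meta_contents[-1].split(",") if t.strip()]
    # White-space only -> no tags; several tags -> no single language.
    # In both cases there is no fallback language.
    return tags[0] if len(tags) == 1 else None


def element_language(lang_chain: Sequence[Optional[str]],
                     meta_contents: Sequence[str]) -> Optional[str]:
    """Language of an element.

    lang_chain holds the lang attribute values from the element itself
    outward to <html>: None means "attribute absent", "" means
    "explicitly unknown".
    """
    for lang in lang_chain:
        if lang is not None:
            # The nearest lang wins; lang="" yields None (unknown) and
            # must NOT fall through to the <meta> fallback.
            return lang or None
    return meta_fallback_language(meta_contents)


# Case 1: no lang anywhere, last <meta> is white-space filled.
print(element_language([None, None], ["en", " "]))
# Case 2: no lang anywhere, last <meta> holds a single language code.
print(element_language([None, None], ["en", "fr"]))
# <p lang=""> stays unknown even though a <meta> names a language.
print(element_language(["", "en"], ["no"]))
```

The point of the second function is the early return: an empty lang="" ends the search, which is exactly what the browsers named above fail to do once a <meta> c-l element is present.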
> (I am not always for aligning w3c specifications to current practice:
> I still want a way to specify two document or audience languages

Sorry, I forgot: did you by "document or audience languages" mean "text processing language and audience language"?

> where content is truly mixed, but not two meta elements.

Truly mixed? In a hierarchic tree structure like HTML, there is no way to say that the content is "truly mixed". (Not until we get language tags that are able to express this, at least.) If you write <html lang="en"> then you say that the <html> element contains English. It is impossible to say that it contains English and French, for instance. And the <meta> content-language also doesn't say that your document contains English and French just because you write content="en, fr" inside it. My proposal doesn't take away or add any functionality in this regard.

> The http headers and the meta elements have been the designated
> places for this.

If you can pinpoint a place in my change proposal where I change the semantics of @lang or the <meta> content-language, then I would immediately correct it.

> I need more explanation, I guess; but I don't think I would support
> two meta content-language elements.

Then you should go back to the I18N group and ask them to change their proposal. Their proposal does not forbid two <meta> content-language elements. HTML4 also doesn't forbid it. Nor does any other HTML specification (including HTML5) that I am aware of.

The only new thing in my proposal is that I suggest that we specify that the last meta element is the one that counts with regard to language inheritance. This is the opposite of what HTML5 currently says, but in line with how all browsers behave. This solution, together with the permission to use white-space inside it (as was permitted in HTML4/XHTML1), will BOTH solve the default language problem AND solve the language inheritance problem.
[Of course, it cannot solve both problems at the same time - but it can solve the one of these two that the author in question is most concerned about solving. A full solution to the problem requires that Mozilla browsers, Chrome, Webkit and Konqueror fix some bugs.]

[...]
--
Leif Halvard Silli
Received on Saturday, 20 March 2010 05:12:08 UTC