
RE: ISSUE-88 / Re: what's the language of a document ?

From: CE Whitehead <cewcathar@hotmail.com>
Date: Sat, 20 Mar 2010 19:31:21 -0400
Message-ID: <SNT142-w1791F11147417B823376A0B3290@phx.gbl>
To: <xn--mlform-iua@xn--mlform-iua.no>
CC: <ian@hixie.ch>, <www-international@w3.org>, <public-html@w3.org>, <ishida@w3.org>

Hi!
Thanks Leif for your reply.
I already use the meta content-language element to override the HTTP header my server sends --
the meta element is within my control, but the server settings are not
(my server certainly does send HTTP headers for my pages; I've checked; but it does not send the ones I want it to send).
I suspect that wrapping a document's content in <div lang=""> would solve the problem with Mozilla, where html lang="" or xml:lang="" is ignored -- 
that seems to me to be a solution that browsers can handle today.
Am I right at least in this regard?


Of course I'm not going to forbid two meta elements --
I only objected to recommending the use and processing of two meta content-language elements, 
and, alas, I am still a bit confused in this regard:


in your opinion, would the Mozilla browser be more inclined to use two meta elements, where one has its content set to "", 
than to respect xml:lang="" or html lang=""?


That is my real question.

(More notes below.)

RE: ISSUE-88 / Re: what's the language of a document ?
From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> 
Date: Sat, 20 Mar 2010 06:11:28 +0100
>However, if there is only one <meta> content-language element, then 
> this element is both the first and the last, at once. ;-) Thus user 
> agents will use it for setting the language. But web servers will also 
> use the same element. If there are two, then web servers should use the 
> first while user agents use the last.
But that's a "should." And user agents should use the html lang= or xml:lang= attribute too, right?
And if they did, we would not need two meta content-language elements, right?
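
A rough sketch, in Python, of the first/last rule as I read it in the quoted text above. The class and function names are my own, and I am assuming the elements are written as <meta http-equiv="Content-Language" content="...">; this is only an illustration, not how any server or browser is actually implemented.

```python
from html.parser import HTMLParser


class ContentLanguageCollector(HTMLParser):
    """Collect the content value of every <meta http-equiv="Content-Language">."""

    def __init__(self):
        super().__init__()
        self.values = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        # http-equiv is compared case-insensitively; a missing content counts as "".
        if (attr.get("http-equiv") or "").lower() == "content-language":
            self.values.append(attr.get("content") or "")


def content_language_values(html_text):
    """Return the content values in document order (hypothetical helper)."""
    collector = ContentLanguageCollector()
    collector.feed(html_text)
    return collector.values


page = """<html><head>
<meta http-equiv="Content-Language" content="en, fr">
<meta http-equiv="Content-Language" content=" ">
</head><body></body></html>"""

values = content_language_values(page)
server_value = values[0] if values else None       # a web server would read the first
user_agent_value = values[-1] if values else None  # a user agent would read the last
```

With a single meta element the two values coincide, which is the "both the first and the last" case.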
> Firstly: There are no common user agents that ignore the lang= 
> attribute, I think. But there are a few (Mozilla, 
> Webkit/Konqueror/Chrome) which fail to treat an *empty* lang (<html 
> lang="">) according to how HTML5 wants it to be.
I see: html lang="" or xml:lang="" will not override the server settings in Mozilla, but will in IE.
> 1) If the document doesn't contain a single lang attribute, but still 
> contains two <meta> content-language elements, where the last one 
> contains white-space, then user agents would not receive language 
> information from anywhere - they would not have any clue about the 
> language. The same "problem" would also arise if <meta> 
> content-language contains more than one language. 
Yes, the user agents would not receive any language information. 
(But they will override the server settings, I gather.) 
And if we suddenly specify new standards, saying that the first element also contains information, 
will the user agents then go and get that information because of the new standards, 
but still ignore html lang=""? 
(A note: it is also interesting to imagine someone including two meta content-language elements
but no xml:lang= or html lang= attribute.)
Still I ask: why not simply ask the browsers to respect the html lang="" or xml:lang="" declaration if they do not?


Would the browsers be more inclined to process a second meta content-language element with its content set to "" 
than to respect xml:lang="" or html lang=""?
That is my real question for you.

> (However, since 
> <meta> c-l is not meant to define the processing language, this can't 
> really be seen as a problem.)
Agreed.
> 2) But, if the last <meta> content-language element of this 
> hypothetical document *does* contain a single language code, then all 
> browsers that actually make use of the <meta> content-language element, 
> would pick it up and use it as the language of the document. (I have 
> tested IE8, Firefox, Webkit/Konqueror/Chrome. None of the Opera 
> versions I tested made any use of <meta> content-language.)
Yes, so you are saying that specifying multiple languages at this point
is equivalent to specifying lang="".
I understand that much; but I've got an html lang= attribute for the text processing language, 
so all I need is for my two audience languages to be specified; 
if they are ignored, they are ignored.
I've at least got a text processing language -- and I do not set it to "", though you are right that 
that is probably what I should do for a truly bilingual document. 
That is my problem: 
yes, you are right that html lang= does not allow me to say my content is mixed, 
even when it is side-by-side in two languages, 
but that does not mean my content is not mixed.
(I will send you a sample if you wish -- in private email; I see no reason to clutter up the list.)
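
To check my understanding of cases 1) and 2) quoted above, here is a small Python sketch of how a language might be taken from the last meta content-language value. The function name is hypothetical, and I am assuming the value is a comma-separated list of language tags; browsers may well differ in detail.

```python
def language_from_last_meta(values):
    """Return a single language tag, or None when no language can be inferred.

    `values` holds the content of each meta content-language element,
    in document order (an assumption made for this sketch).
    """
    if not values:
        return None
    # Split the last value on commas and drop empty or white-space entries.
    tags = [tag.strip() for tag in values[-1].split(",") if tag.strip()]
    # Only a single tag gives the user agent something to use.
    return tags[0] if len(tags) == 1 else None
```

Under this reading, language_from_last_meta(["en, fr", " "]) and language_from_last_meta(["en, fr"]) both give None (white-space only, or more than one language), while language_from_last_meta(["en, fr", "fr"]) gives "fr".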

If indeed browsers would prefer to process an empty value ("") in a meta element, 
and browsers and such would not mind learning to process two meta elements, 
then your proposal makes some sense.
 
I still feel that the problem can also be solved by enclosing the document content in a div element
with lang="" as an attribute setting -- would this work for you?
 
Of course, I don't think I am going to oppose your proposal.

> The only new thing in my proposal is that I suggest that we specify that 
> the last meta element is the one that counts with regard to language 
> inheritance. This is the opposite of what HTML5 currently says, but in 
> line with how all browsers behave. This solution, together with the 
> permission to use white-space inside it (as was permitted in 
> HTML4/XHTML1) will BOTH solve the default language problem AND solve 
> the language inheritance problem. [Of course, it cannot solve both 
> problems at the same time - but it can solve the one of these two that 
> the author in question is most concerned about solving. A full solution 
> to the problem requires that Mozilla browsers, Chrome, Webkit and 
> Konqueror solve some bugs.]
Thanks, agreed!  Absolutely that is the thing that needs solving.
And best wishes with this.
Best,
C. E. Whitehead
cewcathar@hotmail.com


 		 	   		  
Received on Saturday, 20 March 2010 23:32:06 GMT
