- From: Richard Ishida <ishida@w3.org>
- Date: Tue, 30 Mar 2010 17:21:23 +0100
- To: "'Leif Halvard Silli'" <xn--mlform-iua@xn--mlform-iua.no>, "'CE Whitehead'" <cewcathar@hotmail.com>
- Cc: <ian@hixie.ch>, <www-international@w3.org>, <public-html@w3.org>
> From: Leif Halvard Silli [mailto:xn--mlform-iua@målform.no]
> Sent: 21 March 2010 16:28

> There are some XHTML document types which forbid the @lang attribute.
> When you serve these document types as 'text/html', then all language
> info is lost, as xml:lang="<whatever>" is not respected in 'text/html'.
> For such documents, using <meta> content-language enables you to at
> least define *one* language (for all elements) in a user agent
> compatible way.

Very soon this will no longer be the case. Changes to the remaining XHTML specs to allow them to be served as text/html are at an advanced stage, and include the addition of the lang attribute - since that is necessary for language information to be recognized in HTML.

So this case ought not to be used as a basis for proposed behaviour in HTML5.

RI

============
Richard Ishida
Internationalization Lead
W3C (World Wide Web Consortium)

http://www.w3.org/International/
http://rishida.net/

> -----Original Message-----
> To: CE Whitehead
> Cc: ian@hixie.ch; www-international@w3.org; public-html@w3.org;
> ishida@w3.org
> Subject: RE: ISSUE-88 / Re: what's the language of a document ?
>
> CE Whitehead, Sat, 20 Mar 2010 19:31:21 -0400:
>
> [ Snip. Reply to remaining part of letter: ]
>
> > RE: ISSUE-88 / Re: what's the language of a document ?
> > From: Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no>
> > Date: Sat, 20 Mar 2010 06:11:28 +0100
> >> However, if there is only one <meta> content-language element, then
> >> this element is both the first and the last, at once. ;-) Thus user
> >> agents will use it for setting the language. But web servers will also
> >> use the same element. If there are two, then web servers should use the
> >> first while user agents use the last.
> > But that's a "should." And user agents should use the html lang= or
> > xml lang= too, right? And if they did we would not need two meta
> > content-language elements, right?
>
> But otherwise, the current text in HTML5 says basically the
> same thing as you: We don't need to use <meta> content-language - use
> lang="<*>" instead. This is true, in theory, but not in practice. Well,
> it is true in practice as well. Except for the particular use case when
> you explicitly want to set the language of an element to unknown (by
> providing an empty lang attribute). (This use case is a result of the
> fact that HTML5 aligns the meaning of an empty lang="" to the
> same as the meaning of an empty xml:lang="".) For this particular use
> case, it is necessary to make sure that the <meta> content-language
> element says, in a user agent compatible way, that the language is
> unknown.
>
> [...]
> > Still I ask: why not simply ask the browsers to respect the html
> > lang="" or xml lang="" declaration if they do not?
>
> HTML5 does ask that they respect an empty lang="" in HTML (or an empty
> xml:lang in XHTML). Neither my change proposal nor the I18N WG's
> proposal conflicts with this.
>
> > Would the browsers be more inclined to process a second meta
> > content-lang element set to lang=""
> > than to respect the xml lang="" or html lang=""?
> > That is my real question for you.
>
> (The attribute of the <meta> content-language element which contains
> the language tag(s) isn't called 'lang', it is called 'content'.)
>
> Assuming that you meant 'empty lang=""' (and not 'any lang="", empty or
> not'), then the answer to your question is that the problematic user
> agents (Mozilla family + Konqueror/Webkit/Chrome family) respect an
> empty <meta> content-language. Whereas the same two browser families,
> without going into the (important!) details (again), don't (always)
> respect an empty lang="".
>
> I don't have anything particular to say about xml:lang="" - as I have only
> tested 'text/html' (where it has no effect).
>
> [...]
> > Yes, so you are saying that specifying multiple languages at this point
> > is equivalent to specifying lang=""
>
> Richard has described the meaning of an empty xml:lang="" like this: [1]
>
> ]]XML also provides a means to prevent inheritance of language using
> the empty string, ie. xml:lang="". Essentially, this says: I do not
> want to associate any language with this information.[[
>
> HTML5 changes, I believe, an empty lang="" to have the same meaning. I
> don't know if you, by the wording "specifying multiple languages",
> meant the same as Richard. From one angle it can certainly be correct
> to say that a truly multilingual document should not associate any
> particular language with itself.
>
> [... some multiple language issues ...]
> > (I will send you a sample if you wish -- in private email; I see no
> > reason to clutter up the list.)
>
> Please do.
>
> [1] http://www.w3.org/International/articles/language-tags/#overview
> --
> leif halvard silli
Received on Tuesday, 30 March 2010 16:22:01 UTC