Re: what's the language of a document ?

From: Martin J. Dürst <duerst@it.aoyama.ac.jp>
Date: Fri, 30 Oct 2009 11:38:29 +0900
Message-ID: <4AEA51A5.3080801@it.aoyama.ac.jp>
To: CE Whitehead <cewcathar@hotmail.com>
CC: ishida@w3.org, ian@hixie.ch, simonp@opera.com, divya.manian@gmail.com, martin.kliehm@namics.com, cowan@ccil.org, public-html@w3.org, www-international@w3.org

On 2009/10/30 3:47, CE Whitehead wrote:
> I personally tend to agree with Roy Fielding, John Cowan, and Tex Texin, and not with Martin and Richard Ishida, because I regularly create documents in two languages (French-English; French-Old French). Following Richard Ishida's recommendations in "Specifying Languages in XHTML and HTML Content," I list all the languages in the meta content tag (when I have access to it; because my documents are generally served from a locale I don't control, I don't have access to the HTTP headers). I still set the html language to one or the other when possible and then, if I get time, specify additional information in relevant elements.

I'm sorry, but can you please explain where Richard and I differ from 
Roy/John/Tex? It could be that we have very minor differences in how we 
have expressed ourselves, but I think we all agree that HTML5 has to be 
changed to treat the Content-Language: HTTP response header and the 
corresponding <meta> "pragma" the same way.

> I think there will always be cases where people will not tag a document correctly; if a tag is needed it makes no sense to eliminate it because someone cannot yet use it properly.

I have to say that I slightly prefer ignoring multiple values in 
Content-Language: or the corresponding "pragma" to taking the first 
value for the default language, but that's a minor issue.
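
To make the two options concrete, here is a minimal sketch in Python. The function name default_language and the policy labels "ignore-multiple" and "first-value" are made up for illustration; neither policy is claimed to be what HTML5 or any browser actually does.

```python
def default_language(content_language, policy="ignore-multiple"):
    """Derive a default document language from a Content-Language value.

    Hypothetical helper: both policies are illustrative assumptions,
    not behaviour taken from the HTML5 draft.
    """
    # Content-Language carries a comma-separated list of language tags.
    tags = [t.strip() for t in content_language.split(",") if t.strip()]
    if not tags:
        return None          # nothing declared
    if len(tags) == 1:
        return tags[0]       # a single value is unambiguous
    if policy == "first-value":
        return tags[0]       # take the first of several values
    return None              # "ignore-multiple": no default language


print(default_language("fr, en"))                        # None
print(default_language("fr, en", policy="first-value"))  # fr
```

With "ignore-multiple", a bilingual French-English document simply has no default language until the author sets one on the html element; with "first-value", the order of the list silently decides.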

> And I think that Tex makes a point too--someone might specify a document language as fr-FR and fr-LU but not fr-CA and it makes no sense to default to unknown.

Well, there are thousands of cases where it's extremely easy for humans 
to say "well, the author probably must have meant 'foo'", but if you 
actually try to go through all the possibilities and make sure a 
computer can do it, then it very quickly becomes very difficult.

As for the "fr-FR and fr-LU but not fr-CA" example, using "fr" as a 
default may seem obvious to some, but then that would include "fr-CA", 
which the author actually didn't include. So just using "fr" would 
actually be wrong.
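
A small Python sketch of why the shortcut goes wrong. The helper names common_prefix and range_matches are invented here; the matching rule is meant to follow basic filtering in the style of RFC 4647 (a range matches a tag if it equals the tag or is a prefix of it followed by "-"), and is only a sketch under that assumption.

```python
def common_prefix(tags):
    """Longest run of leading subtags shared by all tags (hypothetical helper)."""
    split = [t.lower().split("-") for t in tags]
    shared = []
    for parts in zip(*split):
        if len(set(parts)) != 1:
            break            # subtags diverge at this position
        shared.append(parts[0])
    return "-".join(shared)


def range_matches(language_range, tag):
    """Prefix matching in the style of RFC 4647 basic filtering (case-insensitive)."""
    r, t = language_range.lower(), tag.lower()
    return t == r or t.startswith(r + "-")


declared = ["fr-FR", "fr-LU"]
fallback = common_prefix(declared)
print(fallback)                          # fr
print(range_matches(fallback, "fr-CA"))  # True
```

The fallback "fr" covers "fr-CA" as well, even though the author listed only fr-FR and fr-LU.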

Regards,    Martin.


-- 
#-# Martin J. Dürst, Professor, Aoyama Gakuin University
#-# http://www.sw.it.aoyama.ac.jp   mailto:duerst@it.aoyama.ac.jp
Received on Friday, 30 October 2009 02:39:25 UTC
