RE: Language tags on root

From: Richard Ishida <ishida@w3.org>
Date: Thu, 27 Jan 2005 20:32:25 -0000
To: "'Jukka K. Korpela'" <jkorpela@cs.tut.fi>, <public-i18n-core@w3.org>
Cc: "'i18n IG'" <w3c-i18n-ig@w3.org>, <www-html@w3.org>
Message-Id: <20050127203224.912214EF78@homer.w3.org>

If you are interested in this topic, please read the Working Draft that the
GEO WG has been developing at
http://www.w3.org/International/geo/html-tech/tech-lang.html

It makes the case that there are two different types of declaration involved
here:

the primary language (i.e. metadata about the intended audience)

the text processing language (i.e. information about the language of the
content)

and that these should not be munged together (although there may be room for
some assumptions where only one such declaration is present - I am working
on that text at the moment).
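
As a minimal sketch of that distinction (an illustration only, not taken
from the Working Draft; the language values and the sample sentence are
assumptions), the two declarations could sit side by side in an HTML 4.01
document like this:

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<!-- lang on the root: the text processing language of the content
     (assumed here to be English) -->
<html lang="en">
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <!-- Content-Language: metadata about the intended audience;
       it may list more than one language -->
  <meta http-equiv="Content-Language" content="fr, de, en">
  <title>Language declarations</title>
</head>
<body>
  <!-- A change of language inside the tree is marked where it occurs -->
  <p>The French for "thank you" is <span lang="fr">merci</span>.</p>
</body>
</html>
```

Here the lang attributes describe the text itself, element by element,
while the Content-Language value only says who the document is meant for.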


RI


============
Richard Ishida
W3C

contact info:
http://www.w3.org/People/Ishida/ 

W3C Internationalization:
http://www.w3.org/International/ 

Publication blog:
http://people.w3.org/rishida/blog/
 
 

> -----Original Message-----
> From: public-i18n-core-request@w3.org 
> [mailto:public-i18n-core-request@w3.org] On Behalf Of Jukka K. Korpela
> Sent: 27 January 2005 16:50
> To: public-i18n-core@w3.org
> Cc: i18n IG; www-html@w3.org
> Subject: RE: Language tags on root
> 
> 
> On Thu, 27 Jan 2005, Misha Wolf wrote:
> 
> > I like the idea of specifying the primary language on the root, and 
> > then specifying other languages as they occur in the tree.
> 
> That's a very natural approach, so natural that it's 
> difficult to suggest any alternative. Even in a bilingual 
> dictionary, one of the languages can usually be classified as 
> primary (by its use in explanations, for example). If two (or 
> more) languages are used in a completely balanced way, one 
> just needs to pick one of them.
> 
> > The code for multiple languages is "mul":
> >    http://www.loc.gov/standards/iso639-2/englangn.html#mn
> 
> That might be suitable for characterizing documents in some 
> situations, but the language of an HTML document can be 
> described in much more detail.
> 
> > The appropriate way to provide a list of languages is to use the 
> > appropriate HTTP header and/or the <meta> element.  HTML
> > allows:
> >
> >    <meta http-equiv="Content-Language" content="fr, de, en">
> 
> HTML per se allows <meta http-equiv="foo" content="bar"> as well.
> 
> But the confusion around Content-Language is interesting. 
> Some authoring tools seem to generate meta tags with it, 
> instead of generating lang or xml:lang attributes. But what's 
> it for? If you read its description in the HTTP protocol 
> carefully, you'll notice a difference. It does not actually 
> specify the content language. Instead, it specifies the 
> language of the intended audience. It's difficult to say what 
> the difference is in practice, but they are two different things.
> 
> Moreover, Content-Language: fr, de, en would mean (by the 
> protocol) that the document is intended for people who know 
> French or German or English.
> Most naturally, this would mean that the document contains 
> the same content in all of those languages. More liberally, 
> one might say that it is sufficient that _some_ of the 
> content is understandable to people who know French, etc.
> 
> --
> Jukka "Yucca" Korpela, http://www.cs.tut.fi/~jkorpela/
> 
> 
Received on Thursday, 27 January 2005 20:32:28 GMT
