Re: Language ranges with more than two sub-tag

From: Marcos Caceres <w3c@marcosc.com>
Date: Fri, 1 Mar 2013 17:37:46 +0000
To: Phillips, Addison <addison@lab126.com>
Cc: "www-international@w3.org" <www-international@w3.org>
Message-ID: <E5892D613E1A47A185BC5EE78C2596F3@marcosc.com>

Hi Addison,  

On Friday, 1 March 2013 at 16:59, Phillips, Addison wrote:

> Hello Marcos,
> 
> First the question (which, although Martin has provided some answers, I want to reiterate): language tags (and thus language ranges) with more than two subtags are not uncommon. BCP 47 was designed to allow for a variety of different uses for subtags, not the least of which are script subtags such as the ones Martin cited in his reply. In addition to Chinese, several other languages have script variations, with different kinds of script variation. For example, Serbian can be written in either Latin or Cyrillic script.
Right. However, the problem I'm having is that I can't find any browser that uses language tags with more than two subtags by default :( The same seems to apply to, at least, MacOS. No matter what I set, I only get language ranges with two subtags in the browser. I don't claim to have any expertise in this area (I'm probably just doing it wrong), so I'm looking for some hard data.

I don't know if anyone here can help me, but what I'd really like to find is data that shows which Accept-Language values are being sent over the wire. I know that this will not be completely representative [x]. But if I can show that at least some people are being excluded, whether by default or not, it could weigh heavily in swaying browser vendors.

[x] http://www.w3.org/International/questions/qa-accept-lang-locales
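
As a minimal sketch of how such data could be gathered from logged headers, the Python below splits an Accept-Language value into language ranges and reports those with three or more subtags. The function names and the sample header are hypothetical, chosen only for illustration, and the parsing is deliberately simplified (it does not validate ranges against BCP 47).

```python
from typing import List, Tuple


def parse_accept_language(header: str) -> List[Tuple[str, float]]:
    """Return (language-range, q) pairs, highest q first."""
    ranges = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0]
        q = 1.0  # a range without an explicit q value defaults to 1
        for param in parts[1:]:
            if param.lower().startswith("q="):
                try:
                    q = float(param[2:])
                except ValueError:
                    q = 0.0  # assumption: treat a malformed q as "not acceptable"
        ranges.append((tag, q))
    # sorted() is stable, so ranges with equal q keep their header order
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def long_ranges(header: str, minimum: int = 3) -> List[str]:
    """Return the ranges in the header that have at least `minimum` subtags."""
    return [tag for tag, _ in parse_accept_language(header)
            if len(tag.split("-")) >= minimum]


# Hypothetical header value, not taken from any real browser.
sample = "zh-Hant-TW, zh-TW;q=0.8, en;q=0.5"
print(long_ranges(sample))
```

Run over a server log, a count of non-empty results would give a rough lower bound on how many requests carry ranges with more than two subtags.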

> If Firefox OS works as described, it would create problems with organizing localized materials for languages that use multiple scripts. There can only be one set of resources, for example, that inhabit "zh"---they can be either Traditional Chinese or Simplified Chinese, but cannot effectively be both.
Correct.  
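
The problem can be illustrated with a small sketch of truncation-based matching in the style of the RFC 4647 "lookup" scheme. This is a simplified illustration, not a description of how Firefox OS or any browser actually matches: the function name and the resource lists are hypothetical, and wildcard ranges ("*") are not handled.

```python
from typing import Iterable, Optional


def lookup(language_range: str, available: Iterable[str],
           default: Optional[str] = None) -> Optional[str]:
    """Progressively truncate the range until it matches an available tag."""
    by_lower = {tag.lower(): tag for tag in available}  # tags compare case-insensitively
    subtags = language_range.lower().split("-")
    while subtags:
        candidate = "-".join(subtags)
        if candidate in by_lower:
            return by_lower[candidate]
        subtags.pop()  # drop the last subtag and try again
        # A single-character subtag left at the end (an extension singleton)
        # is dropped as well, as in RFC 4647 lookup.
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()
    return default


# With only "zh" available, both scripts collapse onto the same resources.
print(lookup("zh-Hant-TW", ["zh", "en"]))
print(lookup("zh-Hans-CN", ["zh", "en"]))

# With script-level resources, each user gets the matching script.
print(lookup("zh-Hant-TW", ["zh-Hans", "zh-Hant", "en"]))
print(lookup("sr-Latn-RS", ["sr-Cyrl", "sr-Latn"]))
```

In the first two calls both ranges fall back to "zh", which is the situation described above: whichever script the "zh" resources use, one group of readers gets the wrong one.
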
> Another use of additional subtags is for variants, and a number of variants have been registered for specific purposes in the last few years. There are other articles about language tag choice on the W3C-I18N site. See [1][2] (and I'm sure you can find more).
> 
> But the biggest contributors to additional subtags are the two extensions that have been created. One is for transliterations and transformations (which may be of interest to an application).
> The other is for locale identifiers (which is obviously of interest to an application!). In addition, JavaScript itself now has a locale model (which includes locale negotiation) and I would most definitely recommend that you look closely at it. It incorporates the locale extension. See Norbert's note on this list for a link to the most recent version: [3]. It would make the most sense for locale-selection and language-negotiation to work in lockstep, especially as browser vendors are working on implementations.

Agreed. It would certainly make sense to align where possible. However, I'll need to ask Norbert for guidance on this, as I haven't fully grokked [3] yet.
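
For reference, the two extensions mentioned above use the singletons "t" (transformed content, RFC 6497) and "u" (Unicode locale, RFC 6067). The sketch below separates a tag into its base part and its extension sequences. It is a rough illustration only: the function name is hypothetical, and it does not validate subtags or handle grandfathered tags such as "i-klingon".

```python
from typing import Dict, List, Tuple


def split_extensions(tag: str) -> Tuple[str, Dict[str, List[str]]]:
    """Split a language tag into (base tag, {singleton: [subtags]})."""
    subtags = tag.split("-")
    i = 0
    base = []
    # Everything before the first single-character subtag is the base
    # (language, script, region, variants).
    while i < len(subtags) and len(subtags[i]) != 1:
        base.append(subtags[i])
        i += 1
    extensions: Dict[str, List[str]] = {}
    while i < len(subtags):
        singleton = subtags[i].lower()
        i += 1
        if singleton == "x":
            # Private use: everything that follows belongs to "x".
            values = subtags[i:]
            i = len(subtags)
        else:
            values = []
            while i < len(subtags) and len(subtags[i]) != 1:
                values.append(subtags[i])
                i += 1
        extensions[singleton] = values
    return "-".join(base), extensions


print(split_extensions("de-DE-u-co-phonebk"))   # locale extension
print(split_extensions("und-Cyrl-t-und-latn"))  # transformation extension
```

A matcher that only looks at the first two subtags would discard everything in the second element of each result.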

Kind regards,
Marcos 
Received on Friday, 1 March 2013 17:38:17 GMT