Re: [Comment on WS-I18N WD]

From: Frank Ellermann <nobody@xyzzy.claranet.de>
Date: Sat, 14 Jun 2008 04:07:14 +0200
To: www-international@w3.org
Message-ID: <g2v90s$q8t$1@ger.gmane.org>

Phillips, Addison wrote:

>> For locale names in the language_territory format "_" is
>> AFAIK the standard, compare chapter 8.2 in IEEE Std 1003.1
 
> For POSIX, sure.

That is what "locale" stands for, just as "language tag" is
the term RFC 1766 and its successors use, and there we'd use
"-".  The OP wrote:

| Here is a list of items that we think are common:
|  1. Locale (already defined)
|  2. Timezone (already defined)
|  3. Language (used when UI language is different from the
| language deduced from the UI locale. e.g. "de" for German
| language, "fr-CH" for Switzerland/French locale)
|  4. Collation (based on the IANA collation registry)
[...]

Maybe he confused the terminology: he needs "language tags"
in (3), and fr-CH is a "language tag".  In points (4) ff. he
mentions some IANA registries; he could also do this in (3).

But (1) is apparently about locales, not about the language
tags covered in (3).  So in (1) we'd say fr_CH, not fr-CH.

That is an important difference: locales come with various
settings down to currency symbols, but there are not many
to pick from.  OTOH language tags are only about languages
and maybe scripts, and there are lots of valid no-nonsense
combinations.
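
To make the separator distinction concrete, here is a rough sketch
in Python.  The patterns are simplified assumptions of mine: the
locale pattern covers only "C", "POSIX", and the plain
language_territory form (no codeset or modifier), and the language
tag pattern is the generic RFC 1766/3066 shape without any registry
lookup.  The name classify is hypothetical.

```python
import re

# Simplified locale-name pattern (assumption): "C", "POSIX", or
# language_territory such as fr_CH. Codesets and modifiers such as
# ".UTF-8" or "@euro" are deliberately not handled here.
LOCALE_NAME = re.compile(r"(C|POSIX|[a-z]{2,3}_[A-Z]{2})")

# Generic language-tag shape: subtags of 1 to 8 characters joined by
# "-", the first one alphabetic. No check against the IANA registry.
LANGUAGE_TAG = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")


def classify(identifier: str) -> str:
    """Return 'locale name', 'language tag', or 'unknown'."""
    # Locale names are tested first so that "C" is not mistaken
    # for a one-letter language tag.
    if LOCALE_NAME.fullmatch(identifier):
        return "locale name"
    if LANGUAGE_TAG.fullmatch(identifier):
        return "language tag"
    return "unknown"
```

A bare "de" is ambiguous on syntax alone; this sketch simply
reports it as a language tag.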

> there are other locale systems where this isn't the case
> or for which the separator is indeterminate. There is *no*
> definition of 'locale' for the Web and/or Internet

Well, when I look at the CLDR pages they use unsurprisingly
"_", not "-".  That's arguably two standards, POSIX and CLDR.

> There is no particular reason to use POSIX locales on the
> Internet and there is some history of abusing BCP 47 for
> the purpose already.

Disagree, I see no reason to "abuse" the IANA language subtag 
registry for something it is not, a locale registry, because
there is already a CLDR with different goals.  

> If we allow underscore it may actually be harmful, since it
> may promote the possibly-erroneous assumption that we mean
> POSIX locales.

Or CLDR locales.  It's a rather useful difference, "i-default" 
is no locale, and "C" is no human language.  With "en_GB" I'd
get an odd (from my POV) date format, with "en_US" I lose the
metric system, get alien temperatures, and a currency backed
by hot air.  Which isn't my plan when I say "en-GB" or "en-US".
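
As an illustration of that difference, a minimal sketch in Python:
a locale name selects a bundle of settings, while a language tag
selects none.  The table is hand-written for this example and not
taken from POSIX or CLDR data; LocaleSettings, LOCALE_SETTINGS and
settings_for are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class LocaleSettings:
    date_order: str
    temperature_unit: str
    currency: str


# Illustrative, hand-written table (assumption), keyed by locale
# name with "_" as separator.
LOCALE_SETTINGS = {
    "en_GB": LocaleSettings("day/month/year", "Celsius", "GBP"),
    "en_US": LocaleSettings("month/day/year", "Fahrenheit", "USD"),
}


def settings_for(identifier: str) -> Optional[LocaleSettings]:
    """Look up settings for a locale name; language tags get None."""
    # No "-" to "_" conversion on purpose: "en-GB" only names a
    # language and should not silently pull in locale settings.
    return LOCALE_SETTINGS.get(identifier)
```

With this table settings_for("en_US") yields Fahrenheit and USD,
while settings_for("en-US") yields nothing at all.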

 Frank
Received on Saturday, 14 June 2008 02:06:01 GMT
