RE: Comment on LTLI WD

Some comments follow (irrelevant text deleted).


Addison Phillips
Internationalization Architect - Yahoo! Inc.

Internationalization is an architecture.
It is not a feature. 
> -----Original Message-----
> From: Mark Davis
> Sent: 2 May 2006 10:26
> To: Felix Sasaki
> Cc: Addison Phillips
> Subject: Re: Comment on LTLI WD
>
> This overstates the issue. There is no danger in using a language tag for
> locale identification. 

The word "danger" is inappropriate here. There is a need, however, to indicate that locales are inferred from language tags and that the two are not interchangeable. In particular, locales are platform- or environment-specific, and the result of inferring a locale from a language tag may differ from one implementation to another.

> The danger is in presuming that the region code
> in the language tag is a reliable indication of the physical location or
> governing policies for the user. 

That is *one* danger, and one that is inherent to both language tags and locale identifiers.

> There is also the issue of whether this
> document is to give workable recommendations, or only survey the field.
> I find the former more useful.

This document is on the REC-track, that is, it was originally intended to be normative and to normatively define how to exchange both language and locale information on the Web. If it ends up as a "survey of the field" I would personally regard it as a failure.

> Here is a suggested reformulation, drawing on Addison's message of 4/27.
> =>
> The notion of a locale is a computing concept, not a real world object.
> The actual definition depends entirely on the operating environment,
> programming language, and application's requirements. However, virtually
> all specifications of locale identifiers share some core features, and
> allow for the creation of functional, interoperable applications.

For the last sentence, I would suggest instead:
Locale identifiers usually share certain core features related to natural language and country/region. This specification defines locale identifiers which specific locale implementations can map to their proprietary features in order to create functional, interoperable applications.

> For locale identifiers it
> is common (and recommended) to allow either "_" or "-" as subtag
> delimiters on input, and canonicalize to "_" for uniqueness on output.
> When extracting a language identifier from a locale identifier, any "_"
> separators must be converted to "-", and any extensions need to be either
> removed or encapsulated as extensions (such as with "x-" syntax).

I don't like this. This document should require the use of the 3066bis hyphen. Locale identifiers internal to a specific implementation can map to underscore.
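
To illustrate the split described above (hyphen for interchange, underscore only inside an implementation), here is a minimal Python sketch. The function names to_interchange and to_internal are hypothetical and come from no specification. The sketch assumes the only difference between the two forms is the separator: it does not validate subtags, normalize case, or handle extensions.

```python
import re

# Subtag delimiters that may appear on input: "-" or "_".
_SEPARATORS = re.compile(r"[-_]")


def _subtags(identifier: str) -> list[str]:
    """Split an identifier on either delimiter, dropping empty pieces."""
    return [s for s in _SEPARATORS.split(identifier.strip()) if s]


def to_interchange(locale_id: str) -> str:
    """Hypothetical helper: produce the hyphen-separated form for interchange."""
    return "-".join(_subtags(locale_id))


def to_internal(tag: str) -> str:
    """Hypothetical helper: produce an underscore-separated internal form."""
    return "_".join(_subtags(tag))


if __name__ == "__main__":
    print(to_interchange("en_US"))
    print(to_internal("zh-Hant-TW"))
```

A real implementation would also need to decide what to do with platform-specific extensions, which this sketch leaves out entirely.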
> There is one area with a significant semantic difference between locale
> and language identifiers. In locale identifiers, the region code is often
> presumed to be an indication of the physical location or governing policies
> for the user; this is not the case for language identifiers, where the
> region is used only to discriminate regional variants in language usage.
> Thus some degree of caution should be used when heuristically using
> language identifiers as locale identifiers.

I think this is a bit too strongly worded. I'd suggest:

One difference between language tags and locale identifiers is the meaning of the region code. In both language tags and locales, the region code indicates variation in language (as with regional dialects) or presentation and format (such as number or date formats). In a locale, the region code is also sometimes used to indicate the physical location, market, legal, or other governing policies for the user. 

Finally, this paragraph (which Mark quotes from your suggested text) is not appropriate to a REC that defines locale identifiers:

> There is not yet an Internet standard for locale identifiers. However, 
> there is one for natural language identifiers, [RFC 3066bis]. Since 
> these language identifiers can imply a locale and in the absence of a 
> standard for locale interchange, language identifiers are often used 
> by software as the source for locale identification. Language and 
> locale are distinct properties and should not be used interchangeably, 
> but there is a relationship between these parameters in the area of 
> resource selection and localization.

It would be better to say:

This document defines locale identifiers for use in Web technologies. Historically, natural language identifiers [RFC 3066bis] have been used to infer locales, and, in the absence of a standard for locale interchange, were often used by software as the source for locale identification. 

Received on Tuesday, 2 May 2006 19:20:25 UTC