Re: [Moderator Action] Re: Official ISO 3166 country codes online

Forwarded by the list moderator.

At 02:25 1999/11/28 -0500, Michael Kaplan wrote:
> Well, with it being forwarded to at least two aliases *all* of whose questions over the last several
> months have been on localization, I'll stand by the statement that it's a shortcoming of this
> particular list. :-)
> 
> Obviously I do not think it's a shortcoming of country codes, since in the next paragraph I point out
> a source that includes this info that *is* useful for localization purposes.
> 
> AFAIK there is no info in the website that is not in the link I provided, and there is MUCH info
> provided in the one I gave that is not on this "official" web site.
> 
> michka
> 
> 
> ----- Original Message -----
> From: Sean M. Burke <sburke@netadventure.net>
> To: Michael Kaplan <michka@trigeminal.com>
> Cc: Misha Wolf <misha.wolf@reuters.com>; www international <www-international@w3.org>; IETF
> Languages <ietf-languages@apps.ietf.org>; Unicode Discussion <unicode@unicode.org>; ne loc sig
> <nelocsig@egroups.com>; i18n prog <i18n-prog@acoin.com>
> Sent: Saturday, November 27, 1999 6:27 PM
> Subject: Re: Official ISO 3166 country codes online
> 
> 
> > At 10:32 AM 1999-11-26 -0800, you wrote:
> > >The biggest downside to this list
> > [ apparently referring to
> >   http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1 ]
> > >is that it is only the region-specific codes, even though
> > >applications like Netscape and IE use the language and region
> > >when there is a need for clarity.
> >
> > That's like saying that the downside to ketchup is that it's not a
> > fissionable material.
> > The "downside" is not in any shortcoming of the product (country codes, or
> > ketchup), but in its unfitness for an unintended purpose (localization, or
> > nuclear fission, resp.).
> >
> > The fact that countries aren't the same thing as locales or languages is a
> > well-known problem; and it's why we have locale IDs and language codes.
> > Anyone who tries to represent a locale with a country code, or a country
> > with a language code, etc., is obviously misguided, and shouldn't be
> > localizing anything.
> >
> > >More importantly, in cases where you might not localize into
> > >(for example) every single region that speaks Arabic, you
> > >must deal with the fact that any of
> >  [...the language codes...]
> > >(ar, ar-ae, ar-bh, ar-dz,
> > >ar-eg, ar-iq, ar-jo, ar-kw, ar-lb, ar-ly, ar-ma, ar-om,
> > >ar-qa, ar-sa, ar-sy, ar-tn, ar-ye) may come back to you
> > >from a browser or elsewhere, yet almost all people localize
> > >into the "sa" version
> >  ...by which I assume you meant not "sa" but "ar-sa"...
> > >and so "ar-sa" or "ar" is what you
> > >will want to show.
> >
> > If a user agent requests an object while expressing a preference for
> > "ar-ma" (Moroccan Arabic), and the server sees that the closest thing it has
> > is an object tagged as being in "ar" ("generic" Arabic), then yes, serving that
> > would probably be the best thing under the circumstances.
> >
> > However, answering an "ar-ma" request with an "ar-sa" (Saudi Arabic) object
> > seems decidedly less of a good idea; if the object in question is an audio
> > object, the Moroccan-speaker might find it quite unintelligible.
> >
> >
> > I'm aware of no single good solution to this problem -- particularly not
> > for Arabic or Chinese, where what "ar" or "zh" means varies so greatly
> > depending on the medium in question.
> >
> > However, having language-negotiation mechanisms interpret "ar" to mean
> > "Arabic in a dialect intelligible to the average international
> > speaker/reader of Arabic" does go a long way toward clarifying these
> > things.  What I personally do is that if a program I write receives an
> > object request with this list of languages (in decreasing order of
> > preference):
> >
> >   en-us, ar-kw, fr
> >
> > I impute it to mean:
> >   en-us, ar-kw, fr,    en, ar
> > I.e., the "generic international" codes are appended to the end, for each
> > more specific code specified in the preferences list.  (This violates
> > RFC1766's rule that language-tags should be considered atomic, but I use it
> > just as a fallback and a heuristic.)
> >
> > Granted, that means that if the object requested is available in forms
> > tagged as being in "fr", "en", and "ar", the user will get the "fr"
> > version.  This is passable, if potentially suboptimal.
> >
> > Moreover, it comes about only because of two problems:
> > 1) The server's resources are not labelled right.  The English version
> > should be marked as being in whatever dialect it's in, in addition to the
> > fact it's in a form of English intelligible to the notional "average
> > international English-speaker/reader".
> > Ditto for the Arabic version.
> > 2) The user should specify his preferences, in order, for
> >  such "international" variants.
> >
> > For example, the user who specifies
> >   en-us, ar-kw, fr
> > might mean this:
> >   en-us, en, ar-kw, fr, ar
> > or might mean this:
> >   en-us, en, ar-kw, ar, fr
> >
> > I'm not sure which is less realistic -- expecting users to configure their
> > user agents correctly, or expecting content providers to label things
> > correctly.
> > Presumably the former task could be simplified by having the installers for
> > user-agents give Americans a default Accept-Language of "en-US, en",
> > Mexicans a default Accept-Language of "es-MX, es", and so on; anyone
> > unhappy with these defaults would be welcome to edit them.  The defaults
> > for Arabic-language and Chinese-language versions of user-agents could
> > differ from country to country.  User-agents being correctly configured by
> > default would save the content-providers from having to jump thru hoops to
> > deal with the effects of misconfiguration.
> >
> > This all presumes the existence of "generic/international" variants of
> > languages with many variants.  Unfortunately that's a notably problematic
> > assumption for Arabic and Chinese, to a degree that depends on the medium
> > of the object in question.
> >
> > In these specific and problematic cases, I'd suppose that implementors
> > could treat them specially by consulting a table somewhere expressing the
> > extent to which the average speaker of ar-X would accept an object in ar-Y.
> > It's my guess that you'd need at least three tables for different media:
> > writing, audio, video (that is, video without writing -- unlike Chinese TV
> > shows I see that are in spoken Mandarin, but subtitled in written Chinese
> > for the benefit of people who can read Chinese, but can't understand spoken
> > Mandarin).
> > Moreover, the concept of "average speaker of ar-X" may also be fishy, or
> > may change greatly over time.
> >
> > While IANA/ISO language-negotiation protocols do not (as far as I know)
> > currently see heavy and crucial use in negotiating the serving of variant
> > audio/video resources in Arabic or Chinese, one never knows what tomorrow
> > may bring.  I suppose the hard part is in not overcomplicating the
> > protocols for everyone else merely to accommodate content-negotiation of
> > Chinese and Arabic.
> >
> > --
> > Sean M. Burke sburke@netadventure.net http://www.netadventure.net/~sburke/
> >
> > /* the i18n-prog homepage is at:               */
> > /* http://www.acoin.com/i18n/i18n-prog.htm     */
> > /* See the page for removal instructions, etc. */
> >
> 
> 
> 


#-#-#  Martin J. Du"rst, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org

Received on Sunday, 28 November 1999 03:30:34 UTC
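
Note on the fallback heuristic quoted above: the idea of appending the "generic international" code for each more specific code in the preference list (so that "en-us, ar-kw, fr" is read as "en-us, ar-kw, fr, en, ar") could be sketched roughly as follows. This is only an illustration in Python, not code from the original message; the function names impute_generic_tags and pick_variant are hypothetical, and tags are simply split at the first hyphen, which (as the message itself admits) sets aside RFC 1766's rule that language tags be treated as atomic.

```python
def impute_generic_tags(preferences):
    """Return the preference list with generic primary tags appended.

    For each region-specific tag such as "ar-kw", the primary tag ("ar")
    is added to the END of the list unless it is already present.
    Hypothetical helper; tags are lowercased for comparison.
    """
    expanded = [tag.lower() for tag in preferences]
    for tag in list(expanded):  # iterate over a copy while appending
        primary = tag.split("-", 1)[0]
        if primary not in expanded:
            expanded.append(primary)
    return expanded


def pick_variant(preferences, available):
    """Pick the first available variant in imputed preference order.

    Returns the tag as spelled in `available`, or None if nothing matches.
    Only exact matches count, so "ar-ma" is never answered with "ar-sa".
    """
    by_lower = {tag.lower(): tag for tag in available}
    for tag in impute_generic_tags(preferences):
        if tag in by_lower:
            return by_lower[tag]
    return None


if __name__ == "__main__":
    prefs = ["en-us", "ar-kw", "fr"]
    print(impute_generic_tags(prefs))
    print(pick_variant(prefs, ["fr", "en", "ar"]))
```

With variants tagged "fr", "en", and "ar" on the server, this sketch selects "fr" for the preference list above, which is the "passable, if potentially suboptimal" outcome the message describes. Placing each generic tag directly after its specific tag instead of at the end would be a different policy, corresponding to the alternative readings listed in the message.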