- From: Martin J. Duerst <duerst@w3.org>
- Date: Wed, 01 Dec 1999 06:13:46 +0900
- To: "Michael Kaplan" <michka@trigeminal.com>
- Cc: "www international" <www-international@w3.org>
Forwarded.

At 07:33 1999/11/29 -0500, Michael Kaplan wrote:
> References: <199911261818.LAA07338@acoin.com> <3.0.6.32.19991127192722.007efcb0@stonehenge.netadventure.net>
> Date: Sat, 27 Nov 1999 23:21:42 -0800
> X-Priority: 3
> X-MSMail-Priority: Normal
> X-Mailer: Microsoft Outlook Express 5.00.2314.1300
> X-Mimeole: Produced By Microsoft MimeOLE V5.00.2314.1300
> Mailing-List: contact nelocsig-owner@egroups.com
> X-Mailing-List: nelocsig@egroups.com
> Precedence: bulk
> List-Help: <http://www.egroups.com/group/nelocsig/info.html>,
> <mailto:nelocsig-help@egroups.com>
> List-Unsubscribe: <mailto:nelocsig-unsubscribe@egroups.com>
> List-Archive: <http://www.egroups.com/group/nelocsig/>
> X-eGroups-Approved-By: stopping@rochester.rr.com via webctrl
> Reply-To: nelocsig@egroups.com
> Subject: [nelocsig] Re: Official ISO 3166 country codes online
> MIME-Version: 1.0
> Content-Type: text/plain; charset="iso-8859-1"
> Content-Transfer-Encoding: 7bit
> X-RCPT-TO: <sbass@altrans.com>
> X-UIDL: 532
> Status: U
>
> Well, with it being forwarded to at least two aliases *all* of whose questions over the last several
> months have been on localization, I'll stand by the statement that it's a shortcoming of this
> particular list. :-)
>
> Obviously I do not think it's a shortcoming of country codes, since in the next paragraph I point out
> a source that includes this info that *is* useful for localization purposes.
>
> AFAIK there is no info in the website that is not in the link I provided, and there is MUCH info
> provided in the one I gave that is not on this "official" web site.
>
> michka
>
>
> ----- Original Message -----
> From: Sean M. Burke <sburke@netadventure.net>
> To: Michael Kaplan <michka@trigeminal.com>
> Cc: Misha Wolf <misha.wolf@reuters.com>; www international <www-international@w3.org>; IETF
> Languages <ietf-languages@apps.ietf.org>; Unicode Discussion <unicode@unicode.org>; ne loc sig
> <nelocsig@egroups.com>; i18n prog <i18n-prog@acoin.com>
> Sent: Saturday, November 27, 1999 6:27 PM
> Subject: Re: Official ISO 3166 country codes online
>
>
> > At 10:32 AM 1999-11-26 -0800, you wrote:
> > >The biggest downside to this list
> > [ apparently referring to
> > http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1 ]
> > >is that it is only the region-specific codes, even though
> > >applications like Netscape and IE use the language and region
> > >when there is a need for clarity.
> >
> > That's like saying that the downside to ketchup is that it's not a
> > fissionable material.
> > The "downside" is not in any shortcoming of the product (country codes, or
> > ketchup), but in its unfitness for an unintended purpose (localization, or
> > nuclear fission, resp.).
> >
> > The fact that countries aren't the same thing as locales or languages is a
> > well-known problem; and it's why we have locale IDs and language codes.
> > Anyone who tries to represent a locale with a country code, or a country
> > with a language code, etc., is obviously misguided, and shouldn't be
> > localizing anything.
> >
> > >More importantly, in cases where you might not localize into
> > >(for example) every single region that speaks Arabic, you
> > >must deal with the fact that any of
> > [...the language codes...]
> > >(ar, ar-ae, ar-bh, ar-dz,
> > >ar-eg, ar-iq, ar-jo, ar-kw, ar-lb, ar-ly, ar-ma, ar-om,
> > >ar-qa, ar-sa, ar-sy, ar-tn, ar-ye) may come back to you
> > >from a browser or elsewhere, yet almost all people localize
> > >into the "sa" version
> > ...by which I assume you meant not "sa" but "ar-sa"...
> > >and so "ar-sa" or "ar" is what you
> > >will want to show.
> >
> > If a user agent requests an object while expressing a preference for
> > "ar-ma" (Moroccan Arabic), if the server sees that the closest thing it has
> > is an object tagged as being in "ar" ("generic" Arabic), yes, this would
> > probably be the best thing under the circumstances.
> >
> > However, answering an "ar-ma" request with an "ar-sa" (Saudi Arabic) object
> > seems decidedly less of a good idea; if the object in question is an audio
> > object, the Moroccan-speaker might find it quite unintelligible.
> >
> >
> > I'm aware of no single good solution to this problem -- particularly not
> > for Arabic or Chinese, where what "ar" or "zh" means varies so greatly
> > depending on the medium in question.
> >
> > However, having language-negotiation mechanisms interpret "ar" to mean
> > "Arabic in a dialect intelligible to the average international
> > speaker/reader of Arabic" does go a long way toward clarifying these
> > things. What I personally do is that if a program I write receives an
> > object request with this list of languages (in decreasing order of
> > preference):
> >
> >   en-us, ar-kw, fr
> >
> > I impute it to mean:
> >   en-us, ar-kw, fr, en, ar
> > I.e., the "generic international" codes are appended to the end, for each
> > more specific code specified in the preferences list. (This violates
> > RFC1766's rule that language-tags should be considered atomic, but I use it
> > just as a fallback and a heuristic.)
> >
> > Granted, that means that if the object requested is available in forms
> > tagged as being in "fr", "en", and "ar", the user will get the "fr"
> > version. This is passable, if potentially suboptimal.
> >
> > Moreover, it comes about only because of two problems:
> > 1) The server's resources are not labelled right. The English version
> > should be marked as being in whatever dialect it's in, in addition to the
> > fact it's in a form of English intelligible to the notional "average
> > international English-speaker/reader".
> > Ditto for the Arabic version.
> > 2) The user should specify his preferences, in order, for
> > such "international" variants.
> >
> > For example, the user who specifies
> >   en-us, ar-kw, fr
> > might mean this:
> >   en-us, en, ar-kw, fr, ar
> > or might mean this:
> >   en-us, en, ar-kw, ar, fr
> >
> > I'm not sure which is less realistic -- expecting users to configure their
> > user agents correctly, or expecting content providers to label things
> > correctly.
> > Presumably the former task could be simplified by having the installers for
> > user-agents give Americans a default Accept-Language of "en-US, en",
> > Mexicans a default Accept-Language of "es-MX, es", and so on; anyone
> > unhappy with these defaults would be welcome to edit them. The defaults
> > for Arabic-language and Chinese-language versions of user-agents could
> > differ from country to country. User-agents being correctly configured by
> > default would save the content-providers from having to jump thru hoops to
> > deal with the effects of misconfiguration.
> >
> > This all presumes the existence of "generic/international" variants of
> > languages with many variants. Unfortunately that's a notably problematic
> > assumption for Arabic and Chinese, to a degree that depends on the medium
> > of the object in question.
> >
> > In these specific and problematic cases, I'd suppose that implementors
> > could specially treat them by access to a table somewhere expressing the
> > extent to which the average speaker of ar-X would accept an object in ar-Y.
> > It's my guess that you'd need at least three tables for different media:
> > writing, audio, video (that is, video without writing -- unlike Chinese TV
> > shows I see that are in spoken Mandarin, but subtitled in written Chinese
> > for the benefit of people who can read Chinese, but can't understand spoken
> > Mandarin).
> > Moreover, the concept of "average speaker of ar-X" may also be fishy, or
> > may change greatly over time.
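
[Illustration, not part of the original message.] The per-medium table suggested above might look roughly like the following Python sketch. Everything in it is an assumption made for illustration: the names ACCEPTABILITY, acceptability and best_variant, the 0.0 to 1.0 scale, the threshold, and above all the scores, which are invented placeholders and not measured data. The low audio score for ar-ma against ar-sa only mirrors the remark earlier in the message that a Moroccan speaker might find Saudi Arabic audio unintelligible.

```python
from typing import Dict, Iterable, Optional, Tuple

# HYPOTHETICAL placeholder scores in [0.0, 1.0]; NOT measured data.
# One table per medium, keyed by (requested tag, offered tag).
ACCEPTABILITY: Dict[str, Dict[Tuple[str, str], float]] = {
    "writing": {("ar-ma", "ar-sa"): 0.9},
    "audio": {("ar-ma", "ar-sa"): 0.2},
    "video": {("ar-ma", "ar-sa"): 0.2},
}


def acceptability(medium: str, requested: str, offered: str) -> float:
    """How acceptable an object tagged `offered` is for a `requested` speaker."""
    requested, offered = requested.lower(), offered.lower()
    if requested == offered:
        return 1.0  # an exact match is always fully acceptable
    # Unknown media or unknown pairs are treated as unacceptable.
    return ACCEPTABILITY.get(medium, {}).get((requested, offered), 0.0)


def best_variant(
    medium: str,
    requested: str,
    available: Iterable[str],
    threshold: float = 0.5,  # assumed cut-off, chosen arbitrarily
) -> Optional[str]:
    """Return the most acceptable available tag, or None if none is good enough."""
    best = max(
        available,
        key=lambda tag: acceptability(medium, requested, tag),
        default=None,
    )
    if best is None or acceptability(medium, requested, best) < threshold:
        return None
    return best
```

With these placeholder numbers, best_variant("writing", "ar-ma", ["ar-sa"]) would return "ar-sa", while the same request for "audio" would return None.
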
> >
> > While IANA/ISO language-negotiation protocols do not (as far as I know)
> > currently see heavy and crucial use in negotiating the serving of variant
> > audio/video resources in Arabic or Chinese, one never knows what tomorrow
> > may bring. I suppose the hard part is in not overcomplicating the
> > protocols for everyone else merely to accommodate content-negotiation of
> > Chinese and Arabic.
> >
> > --
> > Sean M. Burke sburke@netadventure.net http://www.netadventure.net/~sburke/
> >
> > /* the i18n-prog homepage is at: */
> > /* http://www.acoin.com/i18n/i18n-prog.htm */
> > /* See the page for removal instructions, etc. */
> >
> >
> ------------------------------------------------------------------------
> -- Talk to your group with your own voice!
> -- http://www.egroups.com/VoiceChatPage?listName=nelocsig&m=1
>
>


#-#-# Martin J. Du"rst, World Wide Web Consortium
#-#-# mailto:duerst@w3.org http://www.w3.org
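
[Illustration, not part of the original message.] The fallback heuristic described in the quoted message (append the "generic" code for each more specific code to the end of the preference list, then serve the first match) can be sketched in Python as below. The function names expand_preferences and pick_variant are hypothetical, and treating the part of a tag before the first hyphen as its generic form is the same non-atomic reading of language tags that the message itself says goes against RFC 1766, so this is a heuristic sketch rather than a conforming matcher.

```python
from typing import Iterable, List, Optional


def expand_preferences(preferences: Iterable[str]) -> List[str]:
    """Append the generic code of each specific tag to the end of the list.

    Tags are compared case-insensitively, so they are lower-cased first.
    """
    expanded = [tag.strip().lower() for tag in preferences]
    seen = set(expanded)
    for tag in list(expanded):  # iterate over a copy while appending
        generic = tag.split("-", 1)[0]  # "ar-kw" -> "ar", "fr" -> "fr"
        if generic not in seen:
            expanded.append(generic)
            seen.add(generic)
    return expanded


def pick_variant(
    preferences: Iterable[str], available: Iterable[str]
) -> Optional[str]:
    """Return the first expanded preference the server has, else None."""
    offered = {tag.lower() for tag in available}
    for tag in expand_preferences(preferences):
        if tag in offered:
            return tag
    return None


if __name__ == "__main__":
    print(expand_preferences(["en-us", "ar-kw", "fr"]))
    # ['en-us', 'ar-kw', 'fr', 'en', 'ar']
    print(pick_variant(["en-us", "ar-kw", "fr"], ["fr", "en", "ar"]))
    # fr
```

As in the message, a request for "en-us, ar-kw, fr" against variants tagged "fr", "en" and "ar" yields "fr", and an "ar-ma" request is answered with "ar" but never with "ar-sa".
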
Received on Tuesday, 30 November 1999 20:58:32 UTC