
Re: [Moderator Action] [www-international] <none>

From: Martin J. Duerst <duerst@w3.org>
Date: Wed, 01 Dec 1999 06:13:46 +0900
Message-Id: <199912010157.KAA26783@sh.w3.mag.keio.ac.jp>
To: "Michael Kaplan" <michka@trigeminal.com>
Cc: "www international" <www-international@w3.org>
Forwarded.

At 07:33 1999/11/29 -0500, Michael Kaplan wrote:
> References: <199911261818.LAA07338@acoin.com> <3.0.6.32.19991127192722.007efcb0@stonehenge.netadventure.net>
> Date: Sat, 27 Nov 1999 23:21:42 -0800
> Subject: [nelocsig] Re: Official ISO 3166 country codes online
> 
> Well, with it being forwarded to at least two aliases *all* of whose questions over the last several
> months have been on localization, I'll stand by the statement that it's a shortcoming of this
> particular list. :-)
> 
> Obviously I do not think it's a shortcoming of country codes, since in the next paragraph I point out
> a source that includes this info that *is* useful for localization purposes.
> 
> AFAIK there is no info in the website that is not in the link I provided, and there is MUCH info
> provided in the one I gave that is not on this "official" web site.
> 
> michka
> 
> 
> ----- Original Message -----
> From: Sean M. Burke <sburke@netadventure.net>
> To: Michael Kaplan <michka@trigeminal.com>
> Cc: Misha Wolf <misha.wolf@reuters.com>; www international <www-international@w3.org>; IETF
> Languages <ietf-languages@apps.ietf.org>; Unicode Discussion <unicode@unicode.org>; ne loc sig
> <nelocsig@egroups.com>; i18n prog <i18n-prog@acoin.com>
> Sent: Saturday, November 27, 1999 6:27 PM
> Subject: Re: Official ISO 3166 country codes online
> 
> 
> > At 10:32 AM 1999-11-26 -0800, you wrote:
> > >The biggest downside to this list
> > [ apparently referring to
> >   http://www.din.de/gremien/nas/nabd/iso3166ma/codlstp1 ]
> > >is that it is only the region-specific codes, even though
> > >applications like Netscape and IE use the language and region
> > >when there is a need for clarity.
> >
> > That's like saying that the downside to ketchup is that it's not a
> > fissionable material.
> > The "downside" is not in any shortcoming of the product (country codes, or
> > ketchup), but in its unfitness for an unintended purpose (localization, or
> > nuclear fission, resp.).
> >
> > The fact that countries aren't the same thing as locales or languages is a
> > well-known problem; and it's why we have locale IDs and language codes.
> > Anyone who tries to represent a locale with a country code, or a country
> > with a language code, etc., is obviously misguided, and shouldn't be
> > localizing anything.
> >
> > >More importantly, in cases where you might not localize into
> > >(for example) every single region that speaks Arabic, you
> > >must deal with the fact that any of
> >  [...the language codes...]
> > >(ar, ar-ae, ar-bh, ar-dz,
> > >ar-eg, ar-iq, ar-jo, ar-kw, ar-lb, ar-ly, ar-ma, ar-om,
> > >ar-qa, ar-sa, ar-sy, ar-tn, ar-ye) may come back to you
> > >from a browser or elsewhere, yet almost all people localize
> > >into the "sa" version
> >  ...by which I assume you meant not "sa" but "ar-sa"...
> > >and so "ar-sa" or "ar" is what you
> > >will want to show.
> >
> > If a user agent requests an object while expressing a preference for
> > "ar-ma" (Moroccan Arabic), and the server sees that the closest thing it
> > has is an object tagged as being in "ar" ("generic" Arabic), then yes,
> > serving that would probably be the best thing under the circumstances.
> >
> > However, answering an "ar-ma" request with an "ar-sa" (Saudi Arabic) object
> > seems decidedly less of a good idea; if the object in question is an audio
> > object, the Moroccan-speaker might find it quite unintelligible.
> >
> >
> > I'm aware of no single good solution to this problem -- particularly not
> > for Arabic or Chinese, where what "ar" or "zh" means varies so greatly
> > depending on the medium in question.
> >
> > However, having language-negotiation mechanisms interpret "ar" to mean
> > "Arabic in a dialect intelligible to the average international
> > speaker/reader of Arabic" does go a long way toward clarifying these
> > things.  What I personally do is that if a program I write receives an
> > object request with this list of languages (in decreasing order of
> > preference):
> >
> >   en-us, ar-kw, fr
> >
> > I impute it to mean:
> >   en-us, ar-kw, fr,    en, ar
> > I.e., the "generic international" codes are appended to the end, for each
> > more specific code specified in the preferences list.  (This violates
> > RFC1766's rule that language-tags should be considered atomic, but I use it
> > just as a fallback and a heuristic.)
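
A minimal sketch of the fallback heuristic described above, assuming tags are plain strings with the primary language before the first hyphen. The function names `with_generic_fallbacks` and `choose_variant` are hypothetical, chosen here for illustration only.

```python
def with_generic_fallbacks(preferences):
    """Append the generic primary tag for each specific tag, after all stated preferences."""
    result = list(preferences)
    seen = {tag.lower() for tag in result}
    for tag in preferences:
        # "ar-kw" -> "ar"; a tag with no hyphen is already generic.
        primary = tag.split("-", 1)[0].lower()
        if primary not in seen:
            seen.add(primary)
            result.append(primary)
    return result


def choose_variant(preferences, available):
    """Return the first tag in the expanded list that the server can offer, else None."""
    offered = {tag.lower() for tag in available}
    for tag in with_generic_fallbacks(preferences):
        if tag.lower() in offered:
            return tag
    return None
```

With the preference list `["en-us", "ar-kw", "fr"]`, `with_generic_fallbacks` returns `["en-us", "ar-kw", "fr", "en", "ar"]`, and `choose_variant` against a server holding `["fr", "en", "ar"]` returns `"fr"`.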
> >
> > Granted, that means that if the object requested is available in forms
> > tagged as being in "fr", "en", and "ar", the user will get the "fr"
> > version.  This is passable, if potentially suboptimal.
> >
> > Moreover, it comes about only because of two problems:
> > 1) The server's resources are not labelled right.  The English version
> > should be marked as being in whatever dialect it's in, in addition to the
> > fact it's in a form of English intelligible to the notional "average
> > international English-speaker/reader".
> > Ditto for the Arabic version.
> > 2) The user should specify his preferences, in order, for
> >  such "international" variants.
> >
> > For example, the user who specifies
> >   en-us, ar-kw, fr
> > might mean this:
> >   en-us, en, ar-kw, fr, ar
> > or might mean this:
> >   en-us, en, ar-kw, ar, fr
> >
> > I'm not sure which is less realistic -- expecting users to configure their
> > user agents correctly, or expecting content providers to label things
> > correctly.
> > Presumably the former task could be simplified by having the installers for
> > user-agents give Americans a default Accept-Language of "en-US, en",
> > Mexicans a default Accept-Language of "es-MX, es", and so on; anyone
> > unhappy with these defaults would be welcome to edit them.  The defaults
> > for Arabic-language and Chinese-language versions of user-agents could
> > differ from country to country.  User-agents being correctly configured by
> > default would save the content-providers from having to jump thru hoops to
> > deal with the effects of misconfiguration.
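
A rough sketch of how an installer might build such a default, assuming it already knows the user's language and country codes. The function name `default_accept_language` is hypothetical and is not taken from any real user agent.

```python
def default_accept_language(language, country):
    """Build a default Accept-Language value: the specific tag first, then the generic one."""
    language = language.lower()
    specific = f"{language}-{country.upper()}"
    return f"{specific}, {language}"
```

For example, `default_accept_language("en", "US")` gives `"en-US, en"` and `default_accept_language("es", "MX")` gives `"es-MX, es"`. Per-country defaults for Arabic or Chinese would need their own lookup rather than this single rule.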
> >
> > This all presumes the existence of "generic/international" variants of
> > languages with many variants.  Unfortunately that's a notably problematic
> > assumption for Arabic and Chinese, to a degree that depends on the medium
> > of the object in question.
> >
> > In these specific and problematic cases, I'd suppose that implementors
> > could specially treat them by access to a table somewhere expressing the
> > extent to which the average speaker of ar-X would accept an object in ar-Y.
> > It's my guess that you'd need at least three tables for different media:
> > writing, audio, video (that is, video without writing -- unlike Chinese TV
> > shows I see that are in spoken Mandarin, but subtitled in written Chinese
> > for the benefit of people who can read Chinese, but can't understand spoken
> > Mandarin).
> > Moreover, the concept of "average speaker of ar-X" may also be fishy, or
> > may change greatly over time.
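
One way such per-medium tables could be sketched, purely as an illustration. The scores below are made-up placeholders, not measured data, and the names `ACCEPTABILITY`, `acceptability`, and `best_variant` are hypothetical. Only the qualitative point (an "ar-sa" audio object may be unintelligible to an "ar-ma" speaker) comes from the message above.

```python
# Placeholder scores in [0, 1]: how acceptable an object in the offered tag is
# to a user who requested another tag, per medium. Values are invented for illustration.
ACCEPTABILITY = {
    "writing": {("ar-ma", "ar-sa"): 0.9},
    "audio": {("ar-ma", "ar-sa"): 0.2},
}


def acceptability(medium, requested, offered):
    """Look up the score for (requested, offered); identical tags always score 1.0."""
    requested, offered = requested.lower(), offered.lower()
    if requested == offered:
        return 1.0
    # Unknown media or pairs default to 0.0 (treated as not acceptable).
    return ACCEPTABILITY.get(medium, {}).get((requested, offered), 0.0)


def best_variant(medium, requested, available, threshold=0.5):
    """Return the highest-scoring available tag at or above the threshold, else None."""
    scored = [(acceptability(medium, requested, tag), tag) for tag in available]
    score, tag = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    return tag if score >= threshold else None
```

Under these placeholder values, a written "ar-sa" object would be served for an "ar-ma" request while an audio one would not. The threshold of 0.5 is likewise an arbitrary assumption.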
> >
> > While IANA/ISO language-negotiation protocols do not (as far as I know)
> > currently see heavy and crucial use in negotiating the serving of variant
> > audio/video resources in Arabic or Chinese, one never knows what tomorrow
> > may bring.  I suppose the hard part is in not overcomplicating the
> > protocols for everyone else merely to accommodate content-negotiation of
> > Chinese and Arabic.
> >
> > --
> > Sean M. Burke sburke@netadventure.net http://www.netadventure.net/~sburke/
> >
> > /* the i18n-prog homepage is at:               */
> > /* http://www.acoin.com/i18n/i18n-prog.htm     */
> > /* See the page for removal instructions, etc. */
> >
> 
> 


#-#-#  Martin J. Du"rst, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org
Received on Tuesday, 30 November 1999 20:58:32 GMT
