Re: FW: Determining Locale in a Browser for Web 2.0 Applications

I agree with you that the problems may have nontechnical aspects.

I don't mean to blame Google in any way. As you point out, they go 
well beyond the ordinary to provide a good UI experience. I admire their 
UI, and my primary browser is Chrome. I just find it a bit surprising 
that they determine the UI language from location even when preferred 
language information is available.

Regards,
-Dan

Najib Tounsi wrote:
> Dan Chiba wrote:
>> Thank you Addison for the interesting topic and Najib for the 
>> excellent example.
>>
>> My opinion is that the advances in the i18n technologies for locale 
>> support such as BCP 47, CLDR, WS-I18N and LTLI are really great. 
>> Meanwhile, I think it is unfortunate that we still often see 
>> nonconforming behavior on the Web when these specifications are 
>> applicable to the use case. It is also unclear to me why Google 
>> uses location to choose the default UI language. That is only a drop 
>> in the bucket; there are a large number of problems due to inadequate 
>> locale negotiation.
>
> I don't know if Google is to blame for that. It seems to me that it has 
> a quite subtle localization policy. I think that Google has the very 
> good intention of offering its interfaces in the native language of the 
> user, and it tries to do the best it can, guessing from the 
> user's location and preferences.
>
> At the first launch of Google on my new computer, I was happy to see 
> the UI in Arabic. In addition, I am offered another opportunity in 
> French. The icing on the cake, when (against all!) I choose the 
> English interface, I am offered (or rather encouraged to go to) two 
> other UIs, Arabic and French, probably knowing that I speak both 
> languages. Morocco is also a francophone country.
>
> It is worth noting, however, that my English (maybe generic) Google 
> interface is not the same as my (localised) French and Arabic 
> interfaces. The former doesn't offer to search in local 
> (Arabic/Francophone or Moroccan) pages.
>
> We may note also that, in Google's preference page (obtained by 
> http://www.google.com/preferences?hl=en) the languages in the list are 
> all written in the same language as the interface, and NOT in the 
> target language, as usually recommended by BPs. (Try different values 
> for the hl parameter in the URL above).
>
> I am not very familiar with the RFC specs and the state of their 
> implementation, but perhaps the problems of locale negotiation are not 
> only technical, "or question of implementation".
>
> Regards, Najib
>
>>
>> I suppose this unfortunate situation is because supporting 
>> implementations are not widely available. For example, Java is two 
>> generations behind BCP 47 and is trying to catch up in the locale 
>> enhancement project 
>> <http://sites.google.com/site/openjdklocale/Home>. I wish this 
>> project included more requirements, especially the matching part of 
>> BCP 47; RFC 4646 and CLDR are covered but little of RFC 4647. My 
>> Firefox doesn't allow me to enter a BCP 47 tag, while I can enter one 
>> in IE as a user defined tag.
>>
>> One of the most painful i18n problems in Java Web applications is 
>> neglecting locale negotiation. A developer may think taking the 
>> locale from the request object or JSF view locale and passing it to 
>> the ResourceBundle which implements a locale determination algorithm 
>> is all it takes. This model sometimes works fine to serve a small 
>> number of languages or locales, but it has a fundamental issue: there 
>> is no way for the instances of the resource bundle to tell which locale 
>> to use, because they don't know the language preference of the user. 
>> For example, for an application to serve Najib, the appropriate 
>> algorithm to find the resource is looking for en first, fr next, then 
>> ar. This is as defined in RFC 4647. However, nowhere in Java can we 
>> find this implemented. View locale determination of JSF is close to 
>> this, but its model is often ineffectual due to the fact that the 
>> negotiated locale cannot be used to choose both translation language 
>> and the locale used in other locale sensitive operations such as 
>> datetime and number formatting. For example, a conventional date 
>> format in Morocco may be preferred, even if the UI language is 
>> English, French or other foreign languages.
>>
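A minimal sketch of the lookup described above, assuming Java 8 or later, where java.util.Locale gained LanguageRange and lookup() (neither existed when this message was written). The class name, the q-values, the list of shipped translations and the fr-MA formatting locale are all illustrative assumptions. It picks the translation language from the en, fr, ar priority list and keeps the formatting locale as a separate setting:

```java
import java.text.DateFormat;
import java.util.Arrays;
import java.util.Date;
import java.util.List;
import java.util.Locale;

final class LanguageVersusFormat {

    // RFC 4647 lookup: walk the user's priority list and return the first
    // available translation, or the fallback when nothing matches.
    static Locale pickUiLanguage(String priorityList, List<Locale> available, Locale fallback) {
        List<Locale.LanguageRange> ranges = Locale.LanguageRange.parse(priorityList);
        Locale match = Locale.lookup(ranges, available);
        return match != null ? match : fallback;
    }

    public static void main(String[] args) {
        // Hypothetical application that ships Arabic and French translations only.
        List<Locale> translations = Arrays.asList(
                Locale.forLanguageTag("ar"), Locale.forLanguageTag("fr"));

        // Preference order from the message: en first, fr next, then ar (q-values assumed).
        Locale ui = pickUiLanguage("en,fr;q=0.8,ar;q=0.6", translations, Locale.ENGLISH);

        // The formatting locale is a separate, assumed user setting.
        Locale formatting = Locale.forLanguageTag("fr-MA");
        String today = DateFormat.getDateInstance(DateFormat.LONG, formatting).format(new Date());

        System.out.println("UI language: " + ui.toLanguageTag()); // fr
        System.out.println("Date in formatting locale: " + today);
    }
}
```

Because English is not among the shipped translations in this sketch, the lookup moves on to the second preference and selects French, while dates are still formatted for the separately chosen locale.
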
>> And in the increasing number of scenarios where web content is 
>> generated in a remote service, we have a big question mark for the 
>> way the service determines the right locale to use in producing the 
>> language or locale sensitive content. Proprietary solutions are used 
>> today, but WS-I18N is defining the standard locale determination 
>> mechanism to resolve the problems. :-)  (yet to be defined & 
>> implemented)
>>
>> Regards,
>> -Dan
>>
>> Najib Tounsi wrote:
>>> Phillips, Addison wrote:
>>>> For those not on the unicode@ mailing list, you may find this note 
>>>> to be of interest.
>>>>
>>>> And yes the beach was very nice.
>>>>
>>>> Addison
>>>>
>>>> Addison Phillips
>>>> Globalization Architect -- Lab126
>>>>
>>>> Internationalization is not a feature.
>>>> It is an architecture.
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Phillips, Addison
>>>> Sent: Monday, April 20, 2009 9:33 PM
>>>> To: 'Peter Krefting'; Unicode Mailing List
>>>> Cc: cldr-users@unicode.org
>>>> Subject: RE: Determining Locale in a Browser for Web 2.0 Applications
>>>>
>>>> A few notes on this thread. Note that these are *personal* 
>>>> comments, notwithstanding my .sig.
>>>>
>>>> "Language preference" isn't quite the same thing as "locale", 
>>>
>>> Yes.
>>> For example (same for other parts of the world, I imagine 
>>> (historical reasons?)), I am in Morocco, my "locale" 
>>> (native/official language) is supposed to be Arabic, but I browse 
>>> the web in English and French.
>>> My "Language Preference" is set to En, then Fr, then Ar.  So, 
>>> these values are used as "A-L" by my browser.
>>> When I know there is an Arabic version of a site, I go to it 
>>> explicitly (e.g. News, e-gov infos...).
>>>
>>> I don't always agree with sites assuming "I am in Morocco, so I 
>>> want Arabic Web-pages". This is not straightforward.
>>> When I type google.com, I am redirected to google.co.ma. Well, I can 
>>> set my "Interface Language" to English (Arabic is default).
>>>
>>> NB:
>>> - with http://www.google.co.ma/ I get also
>>> "Google.co.ma offered in: Français 
>>> <http://www.google.co.ma/setprefs?sig=0_vmu-277buNDn_k70ZpuPvfVHVC4=&hl=fr> 
>>> العربية 
>>> <http://www.google.co.ma/setprefs?sig=0_vmu-277buNDn_k70ZpuPvfVHVC4=&hl=ar> 
>>> ", *both* Arabic and French, when interface language is English
>>> "Le domaine Google.co.ma est disponible en : العربية 
>>> <http://www.google.co.ma/setprefs?sig=0_KIl2jBrZrxzklJw4eI07JXiGFcA=&hl=ar> 
>>> ", when interface language is French
>>> When interface language is Arabic, I am offered (only?) the French 
>>> interface language.
>>> Well, Morocco is also a Francophone country.
>>>
>>> - About the Google translation page. My usual translations are from 
>>> English to Arabic. I wonder why the default selection is set to 
>>> "Spanish to English"?
>>>
>>> Regards,
>>>
>>> Najib.
>>>> although they are closely related. Locale is a programming concept 
>>>> useful in many ways, but mostly to do with APIs.
>>>>
>>>> The Accept-Language header was intended to do language 
>>>> negotiation, but since implementation of it is inconsistent and 
>>>> since managing it is quite arcane, language negotiation via 
>>>> Accept-Language (A-L) alone is usually not fully satisfying. Sites 
>>>> that rely solely on A-L eventually tend to migrate to some form of 
>>>> personalization scheme (such as cookies) to track the actual user 
>>>> preference---even Google does this today. [Implementers should read 
>>>> and understand RFC 4647 and the "lookup" algorithm to avoid spotty 
>>>> performance such as that cited by Peter Krefting below. RFC 2616 is 
>>>> just too vague to make an effective algorithm.]
>>>>
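As a hedged illustration of the RFC 4647 "lookup" algorithm applied to an Accept-Language value, here is a small sketch assuming Java 8 or later (Locale.LanguageRange and Locale.lookup() postdate this thread). The class name, the supported-locale list and the default are assumptions made for the example. Lookup truncates each range from the right until something matches, so a request for fr-MA is served by a plain fr translation:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

final class AcceptLanguageNegotiator {

    // Hypothetical set of translations the site offers.
    static final List<Locale> SUPPORTED = Arrays.asList(
            Locale.forLanguageTag("en"),
            Locale.forLanguageTag("fr"),
            Locale.forLanguageTag("ar"));

    // Assumed site default, used when the header is missing or unusable.
    static final Locale DEFAULT_LOCALE = Locale.forLanguageTag("en");

    static Locale negotiate(String acceptLanguage) {
        if (acceptLanguage == null || acceptLanguage.trim().isEmpty()) {
            return DEFAULT_LOCALE;
        }
        try {
            // parse() orders the ranges by descending q-value.
            List<Locale.LanguageRange> ranges =
                    Locale.LanguageRange.parse(acceptLanguage.replaceAll("\\s+", ""));
            // lookup() tries each range, truncating subtags from the right
            // (fr-MA, then fr) before moving to the next range.
            Locale match = Locale.lookup(ranges, SUPPORTED);
            return match != null ? match : DEFAULT_LOCALE;
        } catch (IllegalArgumentException malformed) {
            // A malformed header should not break the request.
            return DEFAULT_LOCALE;
        }
    }

    public static void main(String[] args) {
        System.out.println(negotiate("fr-MA, ar;q=0.7, en;q=0.5").toLanguageTag()); // fr
        System.out.println(negotiate("de").toLanguageTag());                         // en
        System.out.println(negotiate(null).toLanguageTag());                         // en
    }
}
```

Falling back to a fixed default when nothing matches is only one possible policy; a site could just as well fall through to a stored user preference or another signal.
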
>>>> "Navigator.language" is sometimes a synonym for A-L, however, 
>>>> knowing the language isn't all that useful in the browser, since 
>>>> JavaScript-the-language has no locale facet and locale-based 
>>>> formatting is not under programmatic control. Typically the 
>>>> JavaScript locale matches the system or user default locale where 
>>>> the browser is running, so locale-specific formatting ends up being 
>>>> a server-side task (or it risks being inconsistent with the 
>>>> server-side content). In XmlHttpRequests (for "REST" style or 
>>>> so-called "Web 2.0" interactions) one often sees the locale being 
>>>> transmitted using the A-L header, since programmers assume that's 
>>>> what the header is for, with the value poked into the header being 
>>>> stored as a session variable of some kind.
>>>>
>>>> Geolocation is not as bad as Peter Krefting makes it sound below. I 
>>>> know my general reaction to it has been negative---just because I'm 
>>>> in the Frankfurt airport doesn't mean I want German content, to pay 
>>>> in Euros, etc. However, geolocation can be exceedingly useful for 
>>>> finding "locality", local resources, or when all else fails 
>>>> (uncookied A-L-free browser pointed at a generic URI).
>>>>
>>>> Most sites that do language/locale negotiation end up providing 
>>>> some form of user interaction for managing the language following 
>>>> the negotiation process (hence the prevalence of cookie-ing or 
>>>> URL-rewriting) so that users can get what they want. With multiple 
>>>> ways of getting it wrong, you have to allow the user to adapt.
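
One way to read the points above is as an ordered chain of signals: an explicit user choice (for example a cookie) first, then Accept-Language, then a geolocation guess only when all else fails. The sketch below assumes Java 8 or later; the class and method names, the supported list, and the idea that a geolocation service returns a language tag such as ar-MA are hypothetical, chosen only to illustrate the ordering:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;

final class LocaleResolver {

    // Hypothetical translations offered by the site, and an assumed default.
    static final List<Locale> SUPPORTED = Arrays.asList(
            Locale.forLanguageTag("en"),
            Locale.forLanguageTag("fr"),
            Locale.forLanguageTag("ar"));
    static final Locale SITE_DEFAULT = Locale.forLanguageTag("en");

    // Match one signal (a tag or a priority list) against the supported set.
    static Optional<Locale> match(String priorityList) {
        if (priorityList == null || priorityList.trim().isEmpty()) {
            return Optional.empty();
        }
        try {
            List<Locale.LanguageRange> ranges =
                    Locale.LanguageRange.parse(priorityList.replaceAll("\\s+", ""));
            return Optional.ofNullable(Locale.lookup(ranges, SUPPORTED));
        } catch (IllegalArgumentException malformed) {
            return Optional.empty();
        }
    }

    // Explicit choice beats the header; the header beats the location guess.
    static Locale resolve(String storedChoice, String acceptLanguage, String geoGuess) {
        Optional<Locale> result = match(storedChoice);
        if (!result.isPresent()) {
            result = match(acceptLanguage);
        }
        if (!result.isPresent()) {
            result = match(geoGuess);
        }
        return result.orElse(SITE_DEFAULT);
    }

    public static void main(String[] args) {
        System.out.println(resolve("en", "fr,ar;q=0.5", "ar-MA").toLanguageTag()); // en
        System.out.println(resolve(null, "fr,ar;q=0.5", "ar-MA").toLanguageTag()); // fr
        System.out.println(resolve(null, null, "ar-MA").toLanguageTag());          // ar
    }
}
```

The user-facing language switcher then only needs to write the stored choice, which overrides every guess on the next request.
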
>>>>
>>>> Overall, the whole thing is a bit of a patchwork mess. Each Web 
>>>> technology seems to choose a different approach, none of which are 
>>>> wholly wrong. And, indeed, there is work to try and address this at 
>>>> W3C. Specifically, the I18N WG is trying to complete work on two 
>>>> documents: "LTLI" (Language Tags and Locale Identifiers) and 
>>>> WS-I18N, promoting other standards (CLDR! IETF BCP 47!) and trying 
>>>> to lobby other W3C working groups (for example, WebApps, sometimes 
>>>> with some success) to provide for consistent approaches.
>>>>
>>>> Do note that the latest BCP 47 (RFC4646bis) is in last call at the 
>>>> IETF right now. One thing browser vendors could do is implement it, 
>>>> since that would address some gaps in language coverage as well as 
>>>> the problem of script identification in locale identifiers.
>>>>
>>>> And anyone interested should really consider participating in the 
>>>> W3C Internationalization WG. We could use the help.
>>>>
>>>> Regards,
>>>>
>>>> Addison
>>>>
>>>> Addison Phillips
>>>> Globalization Architect -- Lab126
>>>> Chair -- W3C Internationalization WG
>>>>
>>>> Internationalization is not a feature.
>>>> It is an architecture.
>>>>   
>>>
>>> -- 
>>> Najib TOUNSI (mailto:tounsi @ w3.org)
>>> W3C Office in Morocco (http://www.w3c.org.ma/)
>>> École Mohammadia d'Ingénieurs, BP 765 Agdal-RABAT Maroc (Morocco)
>>> Phone : +212 (0) 537 68 71 50 (P1711)  Fax : +212 (0) 537 77 88 53
>>> Mobile: +212 (0) 661 22 00 30 
>>
>
>

Received on Monday, 27 April 2009 18:46:40 UTC