Re: FW: Determining Locale in a Browser for Web 2.0 Applications

Dan Chiba wrote:
> Thank you Addison for the interesting topic and Najib for the 
> excellent example.
>
> My opinion is that the advances in the i18n technologies for locale 
> support such as BCP 47, CLDR, WS-I18N and LTLI are really great. 
> Meanwhile, I think it is unfortunate that we still often see 
> nonconforming behavior on the Web when these specifications are 
> applicable to the use case. I also find it unclear why Google uses
> location to choose the default UI language. But that is a drop in the
> bucket; there are a large number of problems due to inadequate locale
> negotiation.

I don't know if Google is to blame for that. It seems to me that it has
quite a subtle localization policy. I think Google has the good
intention of offering its interfaces in the user's native language, and
it does the best it can, guessing from the user's location and
preferences.
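
To make that kind of guessing concrete, here is a rough sketch in
Python. It is certainly not Google's actual logic: the function name,
the country table and the priority order (saved choice first, then the
browser's language preferences, then a location-based guess) are all
my own assumptions, for illustration only.

```python
# Hypothetical table: a site's default language guess per country code.
COUNTRY_DEFAULT_LANGUAGE = {"MA": "ar", "FR": "fr", "DE": "de"}


def choose_ui_language(supported, saved_preference=None,
                       browser_languages=(), country=None, fallback="en"):
    """Pick a UI language from several hints, most explicit hint first.

    supported is a set of primary language codes the site offers.
    """
    candidates = []
    if saved_preference:                 # e.g. a choice stored in a cookie
        candidates.append(saved_preference)
    candidates.extend(browser_languages)  # e.g. taken from Accept-Language
    geo_guess = COUNTRY_DEFAULT_LANGUAGE.get(country)
    if geo_guess:                        # location is only the last resort
        candidates.append(geo_guess)
    for tag in candidates:
        primary = tag.lower().split("-")[0]  # compare primary subtags only
        if primary in supported:
            return primary
    return fallback
```

With only country="MA" as a hint this returns "ar"; as soon as the
browser announces en, fr, ar, it returns "en", and a saved choice wins
over both.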

The first time I opened Google on my new computer, I was happy to see
the UI in Arabic. In addition, I was offered French as an alternative.
The icing on the cake: when (against all odds!) I choose the English
interface, I am offered (or rather encouraged to go to) two other UIs,
Arabic and French, probably on the assumption that I speak both
languages. Morocco is also a francophone country.

It is worth noting, however, that my English (maybe generic) Google
interface is not the same as my (localised) French and Arabic
interfaces. The former doesn't offer to search in local
(Arabic/Francophone or Moroccan) pages.

We may also note that, on Google's preferences page
(http://www.google.com/preferences?hl=en), the languages in the list are
all written in the interface language, and NOT each in its own (target)
language, as best practices usually recommend. (Try different values for
the hl parameter in the URL above.)
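
A small sketch of what that recommendation could look like in
practice, in Python. The tables and the function name are made up for
illustration and cover only the three languages discussed here: each
entry is labelled with the language's own name, followed by its name in
the interface language when the two differ.

```python
# Each language's name written in that language itself (its autonym).
AUTONYMS = {"en": "English", "fr": "Français", "ar": "العربية"}

# Language names as written in a given UI language (tiny sample table).
NAMES_IN_UI_LANGUAGE = {
    "en": {"en": "English", "fr": "French", "ar": "Arabic"},
    "fr": {"en": "anglais", "fr": "français", "ar": "arabe"},
}


def picker_labels(ui_language, codes):
    """Return (code, label) pairs for a language selection list."""
    localized = NAMES_IN_UI_LANGUAGE[ui_language]
    labels = []
    for code in codes:
        autonym = AUTONYMS[code]
        name = localized[code]
        if name.casefold() == autonym.casefold():
            labels.append((code, autonym))            # no need to repeat it
        else:
            labels.append((code, f"{autonym} ({name})"))
    return labels
```

Whatever the interface language, the entry for Arabic then always
starts with العربية, so a reader who does not understand the current
interface can still find it.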

I am not very familiar with the RFC specs and the state of their
implementation, but perhaps the problems of locale negotiation are not
only technical, or "a question of implementation".
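
For readers who, like me, have not studied the specs closely, here is
a minimal sketch in Python of the "lookup" scheme of RFC 4647 (section
3.4) as I understand it: each language range in the user's priority
list is truncated from the end until it equals one of the available
tags. The function names, the q-values in the example header and the
simplified header parsing are my own assumptions; this is not a
complete or validated implementation.

```python
def parse_accept_language(header):
    """Return the language ranges of an Accept-Language value, best first."""
    ranges = []
    for position, item in enumerate(header.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip().lower()
        if not tag:
            continue
        quality = 1.0                      # q defaults to 1 when absent
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name.strip() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0          # unreadable q: ignore the range
        if quality > 0:
            ranges.append((-quality, position, tag))
    # Highest q first; header order breaks ties.
    return [tag for _, _, tag in sorted(ranges)]


def lookup(priority_list, available, default):
    """Return the first available tag matched by progressive truncation."""
    by_lower = {tag.lower(): tag for tag in available}
    for language_range in priority_list:
        if language_range == "*":
            continue                       # the wildcard is skipped in lookup
        subtags = language_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            subtags.pop()
            # A single-character subtag (such as "x") left at the end
            # is removed together with the subtag that followed it.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


prefs = parse_accept_language("en, fr;q=0.8, ar;q=0.6")
choice = lookup(prefs, ["fr", "ar"], default="ar")
```

Here prefs is en, fr, ar in that order, and since the site in the
example has no English version, choice is "fr" rather than a jump
straight to Arabic.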

Regards, Najib

>
> I suppose this unfortunate situation arises because supporting
> implementations are not widely available. For example, Java is two
> generations behind BCP 47 and is trying to catch up in the locale
> enhancement project <http://sites.google.com/site/openjdklocale/Home>.
> I wish this project included more requirements, especially the
> matching part of BCP 47: RFC 4646 and CLDR are covered, but little of
> RFC 4647. My Firefox doesn't allow me to enter a BCP 47 tag, while I
> can enter one in IE as a user-defined tag.
>
> One of the most painful i18n problems in Java Web applications is 
> neglecting locale negotiation. A developer may think taking the locale 
> from the request object or JSF view locale and passing it to the 
> ResourceBundle which implements a locale determination algorithm is 
> all it takes. This model sometimes works fine to serve a small number
> of languages or locales, but it has a fundamental issue: there is no
> way for the instances of the resource bundle to tell which locale to
> use, because they don't know the language preference of the user. For
> example, for an application to serve Najib, the appropriate algorithm
> to find the resource is to look for en first, fr next, then ar. This
> is as defined in RFC 4647. However, nowhere in Java can we find this
> implemented. View locale determination in JSF is close to this, but
> its model is often ineffectual due to the fact that the negotiated 
> locale cannot be used to choose both translation language and the 
> locale used in other locale sensitive operations such as datetime and 
> number formatting. For example, a conventional date format in Morocco 
> may be preferred, even if the UI language is English, French or other 
> foreign languages.
>
> And in the increasing number of scenarios where web content is 
> generated in a remote service, we have a big question mark for the way 
> the service determines the right locale to use in producing the 
> language or locale sensitive content. Proprietary solutions are used 
> today, but WS-I18N is defining the standard locale determination 
> mechanism to resolve the problems. :-)  (yet to be defined & implemented)
>
> Regards,
> -Dan
>
> Najib Tounsi wrote:
>> Phillips, Addison wrote:
>>> For those not on the unicode@ mailing list, you may find this note to be of interest.
>>>
>>> And yes the beach was very nice.
>>>
>>> Addison
>>>
>>> Addison Phillips
>>> Globalization Architect -- Lab126
>>>
>>> Internationalization is not a feature.
>>> It is an architecture.
>>>
>>>
>>> -----Original Message-----
>>> From: Phillips, Addison 
>>> Sent: Monday, April 20, 2009 9:33 PM
>>> To: 'Peter Krefting'; Unicode Mailing List
>>> Cc: cldr-users@unicode.org
>>> Subject: RE: Determining Locale in a Browser for Web 2.0 Applications
>>>
>>> A few notes on this thread. Note that these are *personal* comments, notwithstanding my .sig.
>>>
>>> "Language preference" isn't quite the same thing as "locale", 
>>
>> Yes.
>> For example (same for other parts of the world, I imagine (historical 
>> reasons?)), I am in Morocco, my "locale" (native/official language) 
>> is supposed to be Arabic, but I browse the web in English and French.
>> My "Language Preference" is set to En, then Fr, then Ar.  So,
>> these values are used as "A-L" by my browser.
>> When I know there is an Arabic version of a site, I go to it 
>> explicitly (e.g. News, e-gov infos...).
>>
>> I don't always agree with sites assuming "I am in Morocco, so I
>> want Arabic Web pages". This is not straightforward.
>> When I type google.com, I am redirected to google.co.ma. Well, I can 
>> set my "Interface Language" to English (Arabic is default).
>>
>> NB:
>> - with http://www.google.co.ma/ I get also
>> "Google.co.ma offered in: Français 
>> <http://www.google.co.ma/setprefs?sig=0_vmu-277buNDn_k70ZpuPvfVHVC4=&hl=fr> 
>> العربية 
>> <http://www.google.co.ma/setprefs?sig=0_vmu-277buNDn_k70ZpuPvfVHVC4=&hl=ar> 
>> ", *both* Arabic and French, when the interface language is English
>> "Le domaine Google.co.ma est disponible en : العربية 
>> <http://www.google.co.ma/setprefs?sig=0_KIl2jBrZrxzklJw4eI07JXiGFcA=&hl=ar> 
>> ", when the interface language is French
>> When interface language is Arabic, I am offered (only?) the French 
>> interface language.
>> Well, Morocco is also a Francophone country.
>>
>> - About google-translation page. My usual translations are from 
>> English to Arabic. I wonder why the  default selection is set to 
>> "spanish to english"?
>>
>> Regards,
>>
>> Najib.
>>> although they are closely related. Locale is a programming concept useful in many ways, but mostly to do with APIs.
>>>   
>>> The Accept-Language header was intended to do language negotiation, but since implementation of it is inconsistent and since managing it is quite arcane, language negotiation via Accept-Language (A-L) alone is usually not fully satisfying. Sites that rely solely on A-L eventually tend to migrate to some form of personalization scheme (such as cookies) to track the actual user preference---even Google does this today. [Implementers should read and understand RFC 4647 and the "lookup" algorithm to avoid spotty performance such as that cited by Peter Krefting below. RFC 2616 is just too vague to make an effective algorithm.]
>>>
>>> "Navigator.language" is sometimes a synonym for A-L, however, knowing the language isn't all that useful in the browser, since JavaScript-the-language has no locale facet and locale-based formatting is not under programmatic control. Typically the JavaScript locale matches the system or user default locale where the browser is running, so locale-specific formatting ends up being a server-side task (or it risks being inconsistent with the server-side content). In XmlHttpRequests (for "REST" style or so-called "Web 2.0" interactions) one often sees the locale being transmitted using the A-L header, since programmers assume that's what the header is for, with the value poked into the header being stored as a session variable of some kind.
>>>
>>> Geolocation is not as bad as Peter Krefting makes it sound below. I know my general reaction to it has been negative---just because I'm in the Frankfurt airport doesn't mean I want German content, to pay in Euros, etc. However, geolocation can be exceedingly useful for finding "locality", local resources, or when all else fails (uncookied A-L-free browser pointed at a generic URI).
>>>
>>> Most sites that do language/locale negotiation end up providing some form of user interaction for managing the language following the negotiation process (hence the prevalence of cookie-ing or URL-rewriting) so that users can get what they want. 
>>> With multiple ways of getting it wrong, you have to allow the user to adapt.
>>>   
>>> Overall, the whole thing is a bit of a patchwork mess. Each Web technology seems to choose a different approach, none of which are wholly wrong. And, indeed, there is work to try and address this at W3C. Specifically, the I18N WG is trying to complete work on two documents: "LTLI" (Language Tags and Locale Identifiers) and WS-I18N, promoting other standards (CLDR! IETF BCP 47!) and trying to lobby other W3C working groups (for example, WebApps), sometimes with some success, to provide for consistent approaches.
>>>
>>> Do note that the latest BCP 47 (RFC4646bis) is in last call at the IETF right now. One thing browser vendors could do is implement it, since that would address some gaps in language coverage as well as the problem of script identification in locale identifiers.
>>>
>>> And anyone interested should really consider participating in the W3C Internationalization WG. We could use the help.
>>>
>>> Regards,
>>>
>>> Addison
>>>
>>> Addison Phillips
>>> Globalization Architect -- Lab126
>>> Chair -- W3C Internationalization WG
>>>
>>> Internationalization is not a feature.
>>> It is an architecture.
>>>   
>>
>> -- 
>> Najib TOUNSI (mailto:tounsi @ w3.org)
>> W3C Office in Morocco (http://www.w3c.org.ma/)
>> École Mohammadia d'Ingénieurs, BP 765 Agdal-RABAT Maroc (Morocco)
>> Phone : +212 (0) 537 68 71 50 (P1711)  Fax : +212 (0) 537 77 88 53
>> Mobile: +212 (0) 661 22 00 30 
>

Received on Sunday, 26 April 2009 17:52:42 UTC