Re: Manifest internationalization Model

Hi Mounir,   

On Monday, 25 March 2013 at 16:50, Mounir Lamouri wrote:

> Hi all,
>  
> I think there are actually two problems here:
> - localisation of the application manifest;
> - localisation of the application content.
>  
> According to the topic title, this thread should be about the former, but
> the body is confusing and most of the discussion has been about the
> latter, so I'm not sure what the intended subject was.
>  

I would urge us to focus on the former right now. As you describe below, there is already a whole range of strategies and libraries that can be used to localise actual application content (and doing so does not require any native facilities beyond what JS provides).

I've documented most of the i18n behavior from FxOS already in my fork:
http://marcoscaceres.github.com/sysapps/proposals/RunTime-Security/Overview.html   
> Regarding content localisation, I believe this is an important problem
> for installed applications but this is a general problem for the offline
> experience with the Web Platform.
>  

Yep.  
> We should definitely have that happen
> in the HTML or WebApps worlds.
> FWIW, Mozilla has two proposals. A simple API that is used in Firefox
> OS, Webl10n [1] and a more complex one, l20n [2].
>  

I personally think that a content localisation model is probably out of scope for phase 1.  
> Regarding the manifest internationalisation, keeping the localised
> strings in the manifest seems the simpler thing to do given how few
> strings there are (and those strings are static). It has the advantage
> of keeping the manifest localisation self-contained and prevents using
> complex systems for a simple task (using the l20n grammar [3] to point
> to mywebsite.com/LANG/ or change the app title or description would be sad).
>  


Agreed.  
  
> For reference, FreeDesktop desktop entries (which could be considered
> the equivalent of manifest files for desktop environments on
> GNU/Linux) keep localised strings in the file itself instead of using an
> external localisation system [4].
>  
> [1] https://github.com/fabi1cazenave/webL10n
> [2] https://wiki.mozilla.org/L20n
> [3] http://l20n.github.com/spec/grammar.html
> [4] http://standards.freedesktop.org/desktop-entry-spec/latest/ar01s04.html
>  
> Thanks,
> --
> Mounir
>  
> On 11/03/13 17:11, Marcos Caceres wrote:
> > Hi,  
> > TL;DR: We need to settle on an i18n model for the manifest format. We have a few options here (based on the membership of the group):  
> >  
> > 1. Firefox OS's model [1].  
> > 2. Google packaged apps' model [2].
> > 3. W3C Packaged Web Apps (widgets) i18n model [3].  
> >  
> > Each of the models has its own pros and cons. Below I describe each model.  
> >  
> > My recommendation is that we use the Firefox OS one, but with a few modifications:  
> >  
> > * Mozilla's i18n model currently supports only language tags with at most two subtags. This should be fixed to allow three or more subtags (I've got confirmation from Mozilla that they are willing to change this: https://bugzilla.mozilla.org/show_bug.cgi?id=846269). How much this matters in practice depends on whether actual populations of people are affected (see discussion on www-international: http://lists.w3.org/Archives/Public/www-international/2013JanMar/0305.html).  
> >  
> > * Language tag decomposition should follow BCP47's lookup algorithm (and an "application locale" should be derived from the union of the user agent locale and the manifest's declared locales).  
> >  
> > * The application locale should be exposed in JS through an interface.  
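
To make the second bullet a bit more concrete, here is a rough sketch of how I read it, assuming "BCP47's lookup algorithm" means the progressive truncation described in RFC 4647 (section 3.4). The function names `lookup_candidates` and `application_locale` are made up for illustration, and picking a single declared locale is only one possible reading of how the application locale would be derived:

```python
def lookup_candidates(language_range):
    """Yield the range itself, then progressively shorter truncations."""
    subtags = language_range.split("-")
    while subtags:
        yield "-".join(subtags)
        subtags.pop()
        # A single-character subtag (e.g. "x") is dropped together with
        # the subtag that followed it.
        while subtags and len(subtags[-1]) == 1:
            subtags.pop()


def application_locale(ua_locale, declared_locales, default_locale=None):
    """Return the declared locale that best matches ua_locale.

    Hypothetical helper: falls back to default_locale when nothing matches.
    Comparison is case-insensitive, as language tags are.
    """
    declared = {tag.lower(): tag for tag in declared_locales}
    for candidate in lookup_candidates(ua_locale):
        match = declared.get(candidate.lower())
        if match is not None:
            return match
    return default_locale


print(list(lookup_candidates("zh-Hans-XQ")))
print(application_locale("zh-Hans-XQ", ["zh-Hans", "en"]))
```

With this reading, a user agent locale of "zh-Hans-XQ" would be tried as "zh-Hans-XQ", then "zh-Hans", then "zh".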
> >  
> > Ok… the rest is the long bla bla for each…. Will maybe put this into the Wiki or something.  
> >  
> >  
> >  
> >  
> >  
> > # Internationalization model of Firefox OS
> >  
> > A more human readable version of this section is at [1].  
> >  
> > This section lists a set of localization scenarios and describes how Firefox OS handles these different use cases. The section makes some recommendations about how particular use cases could be better addressed.
> >  
> > To simplify the discussion, this document assumes the runtime is running in locale "en-US".  
> >  
> > ## CASE 1 - No localisation information.  
> >  
> > *Use case:* The author does not wish to explicitly localize any content.  
> >  
> > ```JSON
> > {
> > "name": "foo"
> > }
> > ```  
> >  
> > *FxOS:* When there is no localized content declared, the user agent just uses what is at the root of the manifest. Hence, the name of the app is "foo".  
> >  
> > ## CASE 2 - No default locale
> >  
> > *Use case:* The author provides the required ```name``` member, but chooses to localize the application's name for the "en-US" locale. However, the author neglects to add the ```default_locale``` member.
> >  
> > ```JSON
> > {
> > "name": "unknown-locale name",
> > "locales": {
> > "jp": {
> > "name": "jp name"
> > },
> > "en-US": {
> > "name": "en-US name"
> > }
> > }
> > }
> > ```
> >  
> > *FxOS:* Despite there not being any ```default_locale```, the user agent still chooses "en-US name" as the name for the application. When no localized content matches, the value at the root of the manifest is selected.  
> >  
> > ## CASE 3 - No default locale, multiple matching ranges
> >  
> > *Use case:* The author declares the name of the application using a set of variants of the English language. The author omits the ```default_locale``` member.  
> >  
> > ```JSON
> > {
> > "name": "unknown-locale name",
> > "locales": {
> > "en-US-x-test": {
> > "name": "en-US-x-test name"
> > },
> > "en-US": {
> > "name": "en-US name"
> > },
> > "en": {
> > "name": "en name"
> > }
> > }
> > }
> > ```
> >  
> > *FxOS:* The user agent selects the localized content that exactly matches the user agent's default language (so, in this case, "en-US name" is shown).  
> >  
> > ## CASE 4 - No default locale, with catch all
> >  
> > *Use case:* The author declares the name of the application using a set of variants of the English language. However, none of them match the user agent's locale settings exactly. Fortunately, the author has included a catch all ("en").  
> >  
> > ```JSON
> > {
> > "name": "unknown-locale name",
> > "locales": {
> > "en-AU": {
> > "name": "en-AU name"
> > },
> > "en-GB": {
> > "name": "en-GB name"
> > },
> > "en": {
> > "name": "en name"
> > }
> > }
> > }
> > ```
> >  
> > *FxOS:* The user agent first checks for "en-US", but failing that, it selects the next best match, which is "en name".  
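
Cases 1 to 4 can be summarised as: exact match first, then the first subtag, then the root of the manifest. A minimal sketch of that behaviour as described in this message (this is not the actual AppsUtils.jsm code, and the function names are hypothetical):

```python
def select_locale_entry(manifest, ua_locale):
    """Pick the "locales" entry for ua_locale, or for its first subtag."""
    locales = manifest.get("locales", {})
    for tag in (ua_locale, ua_locale.split("-")[0]):
        if tag in locales:
            return locales[tag]
    return {}  # nothing matched: caller falls back to the manifest root


def display_name(manifest, ua_locale="en-US"):
    """Return the localized name, or the root name when none applies."""
    return select_locale_entry(manifest, ua_locale).get("name", manifest["name"])


case_1 = {"name": "foo"}
case_4 = {
    "name": "unknown-locale name",
    "locales": {
        "en-AU": {"name": "en-AU name"},
        "en-GB": {"name": "en-GB name"},
        "en": {"name": "en name"},
    },
}

print(display_name(case_1))
print(display_name(case_4))
```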
> >  
> > ## CASE 5 - No default locale, only some members localized
> >  
> > *Use case:* The author wants to localize the name, but does not need to localize the developer information.  
> >  
> > ```JSON
> > {
> > "name": "unknown-locale name",
> > "developer": {
> > "name": "unknown-locale author"
> > },
> > "locales": {
> > "en-US": {
> > "name": "en-US name"
> > },
> > "jp":{
> > "name": "jp name"
> > }
> > }
> > }
> > ```
> >  
> > *FxOS:* The user agent selects "en-US name" as the name, and "unknown-locale author" as the author.
> >  
> > ## CASE 6 - No default locale, partially localized developer information
> >  
> > *Use case:* The author wants to localize the developer name but not the developer URL.  
> >  
> > ```JSON
> > {
> > "name": "unknown-locale name",
> > "developer": {
> > "name": "unknown-locale author",
> > "url": "http://unknown-locale.com/"
> > },
> > "locales": {
> > "en-US": {
> > "developer": {
> > "name": "localized author"
> > }
> > }
> > }
> > }
> > ```
> >  
> > *FxOS:* The user agent selects the localized developer name and uses the unknown-locale developer ```url```.  
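
One way to picture cases 5 and 6 is as a recursive merge of the matching locale entry over the root of the manifest. This is only a sketch of the observable result described above; `merge_localized` is a hypothetical name, not something FxOS exposes:

```python
def merge_localized(root, overrides):
    """Return a copy of root with overrides applied, member by member."""
    merged = dict(root)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Nested objects (e.g. "developer") are merged, not replaced.
            merged[key] = merge_localized(merged[key], value)
        else:
            merged[key] = value
    return merged


case_6 = {
    "name": "unknown-locale name",
    "developer": {
        "name": "unknown-locale author",
        "url": "http://unknown-locale.com/",
    },
    "locales": {"en-US": {"developer": {"name": "localized author"}}},
}

root = {key: value for key, value in case_6.items() if key != "locales"}
localized = merge_localized(root, case_6["locales"]["en-US"])
print(localized["developer"])
```

The developer name comes from the "en-US" entry, while the URL and the application name are left as declared at the root.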
> >  
> > ## CASE 7 - Default locale
> > *Use case:* When the author uses any value for ```default_locale```, but no localized content is given through a ```locales``` member, the author still expects some content to be displayed to the user (even if the user might not be able to understand it).  
> >  
> > ```JSON
> > {
> > "name": "unknown-locale name",
> > "developer": {
> > "name": "unknown-locale author"
> > },
> > "default_locale": "unknown-locale"
> > }
> > ```
> >  
> > *FxOS:* The user agent selects "unknown-locale name" as the name of the application.  
> >  
> > ## CASE 8 - Language tag decomposition and lookup
> > *Use case:*  
> >  
> > ```JSON
> > {
> > "name": "unknown-locale name",
> > "developer": {
> > "name": "unknown-locale author"
> > },
> > "locales": {
> > "en-US": {
> > "name": "en name"
> > },
> > "en": {
> > "developer": {
> > "name": "en developer"
> > }
> > }
> > },
> > "default_locale": "unknown-locale"
> > }
> > ```
> > The name of the app is "en name".  
> >  
> > *FxOS:* When neither the ```default_locale``` nor the user agent locale matches any localized content, FxOS just uses the first sub-tag of the language range (in this case, just "en"). So, "en-US" becomes "en" ([see code](https://mxr.mozilla.org/mozilla-central/source/dom/apps/src/AppsUtils.jsm#427)).  
> >  
> > *[BUG 846269](https://bugzilla.mozilla.org/show_bug.cgi?id=846269)*: Because FxOS currently takes the first subtag in a language range, this will exclude language ranges initially composed of three or more subtags (e.g., zh-Hans-XQ). This means that if the user's regional preference is expressed as "zh-Hans-XQ", the following will not match any localized content (when zh-Hans would have been a reasonable match):
> >  
> > ```JSON
> > {
> > "name": "unknown-locale name",
> > "developer": {"name": "unknown-locale author"},
> > "locales": {
> > "zh-Hans": {
> > "name": "zh-Hans name"
> > }
> > },
> > "default_locale": "unknown-locale"
> > }
> > ```
> >  
> > ## CASE 9 - Granularity
> > *Use case:* The author wishes for the name of the application to be localized for a particular locale. However, the author does not want to duplicate the developer information.  
> >  
> > ```JSON
> > {
> > "name": "您好!颜色",
> > "locales": {
> > "en-US": {
> > "name": "Hi! Color"
> > },
> > "en-AU": {
> > "name": "G'Day! Colour"
> > },
> > "en": {
> > "developer": {
> > "name": "en developer"
> > }
> > }
> > },
> > "developer": {
> > "name": "中国开发者"
> > },
> > "default_locale": "zh-Hans"
> > }
> > ```
> >  
> > *FxOS:* The user agent selects "Hi! Color" as the name, but then selects "中国开发者" as the developer. The expected result was for the developer name "en developer" to be selected. Filed [BUG 846432](https://bugzilla.mozilla.org/show_bug.cgi?id=846432).
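
For what it's worth, the result I expected in case 9 corresponds to resolving each member separately across the fallback chain, rather than picking one locale entry for the whole manifest. A sketch under that assumption (`resolve_member` is a hypothetical helper, and the chain is passed in by hand here):

```python
def resolve_member(manifest, path, fallback_chain):
    """Look up a member such as ("developer", "name").

    Each locale in fallback_chain is tried in order, then the manifest root.
    """
    locales = manifest.get("locales", {})
    sources = [locales.get(tag, {}) for tag in fallback_chain] + [manifest]
    for source in sources:
        node = source
        for key in path:
            if not isinstance(node, dict) or key not in node:
                break  # this source lacks the member; try the next one
            node = node[key]
        else:
            return node
    return None


case_9 = {
    "name": "您好!颜色",
    "locales": {
        "en-US": {"name": "Hi! Color"},
        "en-AU": {"name": "G'Day! Colour"},
        "en": {"developer": {"name": "en developer"}},
    },
    "developer": {"name": "中国开发者"},
    "default_locale": "zh-Hans",
}

chain = ["en-US", "en"]
print(resolve_member(case_9, ("name",), chain))
print(resolve_member(case_9, ("developer", "name"), chain))
```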
> >  
> > *Proposal:* To address the above, the user agent should arrange the user's preferred locales and decompose them in order (removing any duplicates). So, if the user has "en-US, en-AU, jp" as her preferred language settings, those would decompose to "en-US, en-AU, en, jp".  
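
A small sketch of the decomposition in the proposal. To reproduce the "en-US, en-AU, en, jp" ordering, I am assuming that when a truncated tag appears more than once, only its last occurrence is kept; that detail and the name `decompose` are my own reading, not settled text:

```python
def decompose(preferred):
    """Expand each preferred tag into its truncations, then de-duplicate."""
    expanded = []
    for tag in preferred:
        subtags = tag.split("-")
        # Plain right-to-left truncation; handling of single-letter
        # subtags such as "x" is left out to keep the sketch short.
        while subtags:
            expanded.append("-".join(subtags))
            subtags.pop()

    # Keep only the last occurrence of each tag, so that a shared prefix
    # such as "en" sorts after every more specific "en-*" tag.
    seen = set()
    result = []
    for tag in reversed(expanded):
        if tag not in seen:
            seen.add(tag)
            result.append(tag)
    result.reverse()
    return result


print(decompose(["en-US", "en-AU", "jp"]))
```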
> >  
> >  
> > # Chrome i18n Model
> >  
> > Chrome's i18n model [2] for localisation of manifest data differs quite significantly from the Mozilla one and the W3C widgets one [3]. In particular, it is considerably more complicated because it follows a more traditional software i18n model, and for some fields it is tightly bound to the Chrome apps store.
> >  
> > Instead of allowing manifest content to be localized within the application manifest itself, all localized content is put into "messages.json" files in a special "_locales/language-TAG" directory (where language-TAG is, for example, en-US). The developer is then required to "key" all localised data and then Chrome reconstructs the localised content by matching keys to a magic string (__MSG_*__). For example:
> >  
> > manifest.json
> > ==========
> > "name": "__MSG_application_title__", "description": "__MSG_application_description__"
> >  
> >  
> > _locales/de/messages.json
> > ===============
> > "application_title": { "message": "Eine lokalisierte gehostete Beispielanwendung" }
> >  
> >  
> > Then the developer declares what the "default_locale" is, which works as a catch-all for when the user's locale does not match the locale of the application.
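
Roughly, the key substitution described above amounts to the following. This is a simplified sketch and not Chrome's implementation: the in-memory `catalogs` dict stands in for the "_locales/language-TAG/messages.json" files, the English string is invented for the example, and falling back to a whole default catalog (rather than per message) is an assumption on my part:

```python
import re

# Matches the "__MSG_<key>__" magic strings used in manifest values.
MSG_PATTERN = re.compile(r"__MSG_(\w+?)__")


def localize(value, catalogs, locale, default_locale):
    """Replace each __MSG_key__ in value using the catalog for locale."""
    messages = catalogs.get(locale, catalogs[default_locale])

    def substitute(match):
        entry = messages.get(match.group(1))
        # Unknown keys are left untouched in this sketch.
        return entry["message"] if entry else match.group(0)

    return MSG_PATTERN.sub(substitute, value)


catalogs = {
    "de": {
        "application_title": {
            "message": "Eine lokalisierte gehostete Beispielanwendung"
        }
    },
    # Hypothetical default catalog, not taken from Chrome's documentation.
    "en": {"application_title": {"message": "A localized hosted sample app"}},
}

print(localize("__MSG_application_title__", catalogs, "de", "en"))
print(localize("__MSG_application_title__", catalogs, "fr", "en"))
```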
> >  
> > Chrome then provides an API to access localised strings from within the application itself. Having the API is not a bad thing (as it takes away some of the burden of having to read files from within a package through either XHR or a file reader API), but it does lock the developer into using a particular i18n model.
> >  
> > Other non-standard features include custom language tags [2]. These language tags are non-standard in that they don't conform to BCP 47. As such, some region, language, and script combinations cannot, theoretically speaking, be adequately expressed. Whether this affects actual populations of people, I am unsure.
> >  
> >  
> > # W3C Packaged Web Apps (widgets) i18n Model
> > The model is already fully described [3], with plenty of examples. It supports both manifest-level localization and directory-based localisation in a manner similar to Google's.  
> >  
> > http://www.w3.org/TR/widgets/#internationalization-and-localization
> >  
> > [1] https://gist.github.com/marcoscaceres/5055717
> > [2] https://developers.google.com/chrome/web-store/docs/i18n
> > [3] http://www.w3.org/TR/widgets/#internationalization-and-localization
> > --  
> > Marcos Caceres
> > http://datadriven.com.au
> >  
>  
>  

Received on Tuesday, 26 March 2013 17:25:20 UTC