RE: ICYMI: [manifest] I18N review in progress

On March 4, 2015 at 2:57:17 AM, Phillips, Addison (addison@lab126.com) wrote:
> Some personal comments follow.
> I think there are two problems here. This problem has to do with metadata *about* the localization  
> of a given Web application--that is, what languages/locales it is available in.

Yes, precisely: the manifest is not the place to convey what languages a web application itself supports. Establishing the language can be done through other mechanisms, outside the manifest.

>  There are many ways that a Web application might negotiate language with the user (Accept-Language  
> header, navigator.language, IP address sniffing, user cookie, usw.) and these cannot  
> easily be expressed in the manifest.

Precisely. The manifest is generated as a result of input from the above. 

>  Maybe this is irrelevant to what you're trying to do here.

It is relevant insofar as generating a manifest needs to work well with all the things you mentioned above (Accept-Language, etc.).
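To illustrate what "generated as a result of input from the above" could look like on the server, here is a minimal sketch. It is not from the spec: the names (APP_NAMES, negotiateLanguage, buildManifest), the sample strings, and the "?lang=" start URL are all assumptions made up for the example, and a real application would also consult cookies or user preferences.

```typescript
// Hypothetical per-language application names (illustrative only).
const APP_NAMES = new Map<string, string>([
  ["en", "Example App"],
  ["es", "Aplicación de ejemplo"],
  ["ja", "サンプルアプリ"],
]);

// Parse an Accept-Language header value into lowercase language tags,
// ordered by q-value (highest first). Entries with q=0 are dropped.
function parseAcceptLanguage(header: string): string[] {
  return header
    .split(",")
    .map((part) => {
      const pieces = part.trim().split(";").map((p) => p.trim());
      const tag = (pieces[0] ?? "").toLowerCase();
      const q = pieces.slice(1).find((p) => p.startsWith("q="));
      const weight = q === undefined ? 1 : Number(q.slice(2));
      return { tag, weight: Number.isNaN(weight) ? 0 : weight };
    })
    .filter((entry) => entry.tag !== "" && entry.weight > 0)
    .sort((a, b) => b.weight - a.weight)
    .map((entry) => entry.tag);
}

// Pick the first requested language the application has strings for,
// matching on the primary subtag only (so "es-MX" matches "es").
function negotiateLanguage(header: string, fallback: string = "en"): string {
  for (const tag of parseAcceptLanguage(header)) {
    const primary = tag.split("-")[0] ?? tag;
    if (APP_NAMES.has(primary)) return primary;
  }
  return fallback;
}

// Build the manifest body that would be served for this request.
function buildManifest(acceptLanguage: string): string {
  const lang = negotiateLanguage(acceptLanguage);
  return JSON.stringify({
    name: APP_NAMES.get(lang) ?? "",
    start_url: "/?lang=" + lang, // assumed URL scheme, not from the spec
  });
}
```

A request handler would call buildManifest with the value of the Accept-Language header and serve the result as the manifest.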

> > Furthermore, the manifest is a lazy-loaded (or potentially never loaded!)
> > resource, hence can't be relied on to provide authoritative information about
> > the languages of application.
>  
> This seems extremely suspect to me. The manifest can authoritatively provide the endpoint,  
> name, and icon for the application, but not metadata about it?

The name and icon are only ever used in some kind of "installation" process. However, the manifest does not provide extensive metadata about a web application (note the exclusion of things like author and description, which are commonly used by application stores; this is by design).

In general browsing, the name is given within HTML documents via:

<title>Potentially the name of the application</title> 
<!-- more authoritative -->
<meta name="application-name" content="The Application's Name">

And a series of icons via: 
<link rel="icon">

Both Gecko's and Chrome's implementations are specifically designed to lazy-load the manifest. Where values are missing, the manifest processor always falls back to the web page for its icons and name.
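The fallback can be sketched roughly as follows. This is only an illustration of the idea, not how Gecko or Chrome actually implement it: the ManifestData and PageMetadata shapes and the function names are made up, and icons are simplified to plain URL strings.

```typescript
// Simplified, hypothetical view of what was found in the manifest (if it loaded at all).
interface ManifestData {
  name?: string;
  icons?: string[];
}

// Simplified, hypothetical view of what the HTML document itself declares.
interface PageMetadata {
  title: string;            // from <title>
  applicationName?: string; // from <meta name="application-name">
  iconLinks: string[];      // from <link rel="icon">
}

// Prefer the manifest's name; otherwise fall back to the page,
// treating application-name as more authoritative than <title>.
function resolveName(manifest: ManifestData | null, page: PageMetadata): string {
  return manifest?.name ?? page.applicationName ?? page.title;
}

// Prefer the manifest's icons; otherwise fall back to the page's icon links.
function resolveIcons(manifest: ManifestData | null, page: PageMetadata): string[] {
  const icons = manifest?.icons ?? [];
  return icons.length > 0 ? icons : page.iconLinks;
}
```

Passing null for the manifest models the case where it was never loaded.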

> >
> > The background for this decision is here:
> > https://github.com/w3c/manifest/issues/208
>  
> I happen to also agree with the original issue here. I agree that the manifest specification  
> is not the place to define language negotiation. If you had a specification that detailed  
> how language negotiation *must* be done, then you could address 208 fully.
>  
> However, I would tend to agree with commenters on 208 that providing non-authoritative  
> metadata about application localization is useful.

To me, that's like saying that an HTML document should contain translations for every language:

<p lang="en">...</p>
<p lang="es">...</p>
<p lang="ja">...</p>

That would only add bloat, IMO. My goal is to avoid the manifest being the equivalent of the above. 

> >
> > > What do authors of localized web applications do?
> >
> > Our proposals is that authors localize content on the server instead. So:
> >
> >  
>  
> :-P
>  
> How do I know to do that?

The spec doesn't give developers a choice, so one is forced into a content-negotiation situation.

I'd be happy to add a note to the spec about this. 

>  How do I know what languages are available? (Note: the language  
> tag for Japanese is 'ja')

As I show in 208:
https://github.com/w3c/manifest/issues/208#issuecomment-44298944

The user selects the language supported by the web application from a list of languages, as provided by the UI of the Web application. Such approaches are common practice across the Web.
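As a rough sketch of how such a language picker could be tied to a per-language manifest: the supported-language list, the "/manifest.<lang>.json" URL pattern, and the function names below are assumptions for illustration only. LinkLike stands in for the link element so the logic can be shown without a DOM.

```typescript
// Hypothetical list of languages the application's UI offers.
const SUPPORTED_LANGS = ["en", "es", "ja"] as const;
type SupportedLang = typeof SUPPORTED_LANGS[number];

// Minimal stand-in for the <link rel="manifest"> element.
interface LinkLike {
  href: string;
}

// Type guard: is the chosen value one of the offered languages?
function isSupportedLang(value: string): value is SupportedLang {
  return (SUPPORTED_LANGS as readonly string[]).indexOf(value) !== -1;
}

// Assumed URL pattern for localized manifests.
function manifestUrlFor(lang: SupportedLang): string {
  return "/manifest." + lang + ".json";
}

// Point the manifest link at the manifest for the chosen language.
// Returns false (and leaves the link untouched) for unknown languages.
function switchManifest(link: LinkLike, chosen: string): boolean {
  if (!isSupportedLang(chosen)) return false;
  link.href = manifestUrlFor(chosen);
  return true;
}
```

In a page, the link argument would be the element found with document.querySelector('link[rel="manifest"]'), and switchManifest would be called from the language picker's change handler.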

> > In case of another language swap, the author simply updates the href
> > attribute of the link element to point to the correct localized manifest (by
> > either refreshing the page or using JS).
>  
> What if you have multiple languages for the same Web application? Would you do:

Do you have examples of such web applications? Applications will generally have a dominant language for the UI, not multiple (though, of course, other languages can be used within them).    

> ... etc...
>  
> I think you're making the incorrect assumption that 1 page == 1 language == 1 manifest  

Again, I need evidence to the contrary. I've never seen an application that doesn't follow the "X page == 1 language == 1 manifest" model. That's not to say that there are none, but I would really like to see some so as to evaluate how best to make manifests work for such sites.

> Let's take this in two parts:
>  
> Bidi/Direction: I suppose it would be sufficient from the point of view that a sufficiently  
> motivated speaker of a bidirectional language could make it work. But it is hugely inconvenient.  
> In markup languages we generally discourage the Unicode bidi controls. While the manifest  
> isn't a "markup language" per-se, its contents are generally intended to be used in this  
> kind of context. Bidi isolation alone is insufficient.

This still sounds like a problem with text editors, not with Unicode itself...

> Manifests are likely to be generated automatically, which would thus require having  
> the bidi controls in a context removed from the manifest itself. This creates additional  
> problems for bidirectional language users: they have to know to insert the controls  
> in situations where they would normally use markup.

< removes editor hat... puts on personal hat >
Arrgh! I know, this is tremendously annoying and sad. I argued that JSON is a terrible format for this kind of data but I lost that battle. We'd solved all this in the W3C widget's config.xml format, please kill me now ;-( ... anyway... much sadness. Unfortunately, as much as I hate it, I have to keep defending the use of JSON. The JSON ship has, um, literally shipped. 
< puts on Editor hat ... pretends to like JSON again >  

But in that case they could just use XML -> XSLT -> JSON, or some such. Like you said, JSON is not a markup language and we shouldn't try to turn it into one. And yes, it would be a total pain in the ass, but it could be done.
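For what it's worth, the mechanical part of having a generator insert the controls is small. The sketch below wraps a string in Unicode directional isolates (U+2066 LRI, U+2067 RLI, U+2069 PDI) before serializing; the function names and the sample string are made up, and this says nothing about whether isolation is adequate, which is the point under discussion.

```typescript
// Unicode directional isolate controls.
const LRI = "\u2066"; // LEFT-TO-RIGHT ISOLATE
const RLI = "\u2067"; // RIGHT-TO-LEFT ISOLATE
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE

type Direction = "ltr" | "rtl";

// Wrap text in the isolate pair matching its base direction.
function isolate(text: string, dir: Direction): string {
  return (dir === "rtl" ? RLI : LRI) + text + PDI;
}

// Serialize a manifest whose name carries its direction as bidi controls.
// The direction comes from the generator's own data, not from the JSON.
function manifestWithIsolatedName(name: string, dir: Direction): string {
  return JSON.stringify({ name: isolate(name, dir) });
}
```

The author never types the controls by hand here; the generator takes the direction from wherever the source strings are stored (for example, an XML attribute) and adds them on output.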

> Put another way, the availability of the Unicode bidi controls does not mean that we don't  
> need to provide an appropriate markup mechanism. Especially since we still need to provide  
> for some other data...

I don't know how we would do this, tbh. It would be kinda sad if we started adding XML or HTML-like fragments or tags into JSON.   

> Language: you still need to provide language information about the manifest's natural  
> language content items. Among other things, correct font selection rides on providing  
> language metadata to the text drawing layer. Inferring the language of the text by sniffing  
> the bits is inaccurate or impractical. The best practice is to provide this as additional  
> data.

That's fair. I'll add a `lang` member. 
https://github.com/w3c/manifest/issues/322
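Roughly, a consumer could then hand the declared language to its text layer along these lines. The shape of the member is not settled in this thread, so the interface, the function name, and the fallback to the document's language are all assumptions.

```typescript
// Hypothetical manifest shape with a language tag for its text members.
interface ManifestWithLang {
  lang?: string; // e.g. "ja"
  name: string;
}

// Choose the language tag to pass to text rendering (font selection etc.).
// Assumed fallback: the language of the document that linked the manifest.
function languageForText(manifest: ManifestWithLang, documentLang: string): string {
  const tag = manifest.lang === undefined ? "" : manifest.lang.trim();
  return tag !== "" ? tag : documentLang;
}
```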

> Note that this is about information *inside* the manifest. I often have the case where  
> I need separate display strings that are localized for the *same* Web application--whether  
> the Web application is localized or not. Lacking a localization model within the manifest  
> means that there has to be a means of generating many manifests and then retrieving the  
> "correct" one for my needs.

Correct. 

> Further, the manifest itself won't contain what language  
> the strings inside it are in.
>  
> Hope this helps,

It does! Thanks for the clarifications and explanations!

Received on Wednesday, 4 March 2015 06:51:15 UTC