Re: [Comment on WS-I18N WD] from Felix Sasaki on 2008-06-14 (www-international@w3.org from April to June 2008)

From: Felix Sasaki <fsasaki@w3.org>
Date: Sat, 14 Jun 2008 20:42:14 +0900
To: Dan Chiba <dan.chiba@oracle.com>
CC: www-international@w3.org
Message-ID: <4853AE96.50407@w3.org>
Dan Chiba wrote:
>
> Hi Addison,
>
> Thank you for looking over my comments. Please see my responses below:
>
> Phillips, Addison wrote:
>> Hi Dan,
>>
>> (Chair/Editor hats off. These are personal comments.)
>>
>> I've looked over your comments. In general I'm okay with them, 
>> although I do have a few things to note:
>>
>> 1. The existing document has an <i18n:preferences> element that, 
>> among other things, can contain some of the information you are 
>> looking for. In particular, it already contains items 6-9 on your 
>> list. The point of having a preferences element, in my mind, was to 
>> provide access to these specific settings for cases where one needs 
>> such detail. Can you better enumerate why these should be promoted to 
>> full-fledged elements?
>>   
> Explicitly specifying how to indicate the common details will help 
> achieve interoperability and promote use of this mechanism. Because 
> LDML is a way to describe and exchange locale definitions and is not 
> designed to indicate locale settings, it is complex and becomes 
> ambiguous when it is used under the <i18n:preferences> element.
>
> For example, how to identify a collation is unclear. In the current 
> draft, German phonebook collation is represented as follows:
>
>   <i18n:preferences>
>     <ldml:collation>
>       <ldml:alias source="de_DE" type="phonebook"/>
>     </ldml:collation>
>   </i18n:preferences>
>
> A corresponding example in LDML looks like this:
>
>   <collation type="phonebook">
>     <alias source="de_DE"/>
>   </collation>
>
> First of all, the type attribute sits in a different place, although 
> I suppose the two should match. Even if they did match, the pair of 
> attribute values "de_DE" and "phonebook" would still not identify a 
> specific collation. i18n:preferences/ldml:collation does not carry 
> collation definition data, so it is ambiguous what "de_DE" and 
> "phonebook" mean. I think the collation registry should be used.
>
> Also, here is another ambiguous scenario. What date format is this? 
> What is its length (short, medium, long, or full)? What calendar? Is 
> this a valid example in the first place?
>
>   <i18n:preferences>
>    <ldml:dateFormat>
>     <ldml:pattern>MMMM d, yyyy</ldml:pattern>
>    </ldml:dateFormat>
>   </i18n:preferences>
>
>> 2. I've seen requests for a UI language separate from locale before, 
>> but I'm not sure that they make a lot of sense. Which takes 
>> precedence? What does it mean to have a German locale but French UI 
>> messages? Other than writing I18N demos, what use case do you have 
>> for this?

I have a use case which I run into daily: I'm using a search engine with 
an English or German user interface, but I want to get results from the 
area of Japan. Similarly I can imagine that Japanese people speaking 
"some" English would prefer a Japanese user interface, but to have 
results from the area they are traveling in / living in.
Sorry for being specific to a search engine ... but compare as an 
example the results of
http://www.google.de/search?hl=de&q=pizza
and
http://www.google.de/search?hl=ja&q=pizza
or
http://search.yahoo.com/search?p=pizza
and
http://de.search.yahoo.com/search?p=pizza
I prefer to get these differences in results, but with the same user 
interface (the UI of the search engine).
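
To make the distinction concrete, here is a minimal sketch in Python that 
keeps the UI language and the area of the results as two separate 
settings. The "hl" parameter is taken from the URLs above; the 
SearchPreferences class, the build_search_url function, the "region" 
parameter and the example.org host are all hypothetical and only serve 
the illustration. They do not describe how any real search engine works.

```python
from dataclasses import dataclass
from urllib.parse import urlencode


@dataclass(frozen=True)
class SearchPreferences:
    """Two independent settings (hypothetical names, for illustration)."""
    ui_language: str     # language of the user interface, e.g. "de"
    results_region: str  # area the results should come from, e.g. "JP"


def build_search_url(base_url: str, query: str, prefs: SearchPreferences) -> str:
    # "hl" appears in the URLs quoted above; "region" is an assumed parameter.
    params = {"q": query, "hl": prefs.ui_language, "region": prefs.results_region}
    return f"{base_url}?{urlencode(params)}"


# German UI, results from the area of Japan.
prefs = SearchPreferences(ui_language="de", results_region="JP")
print(build_search_url("http://search.example.org/search", "pizza", prefs))
```

The point of the sketch is only that neither value is derived from the 
other, so changing the UI language leaves the requested area untouched.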


>>   
> Internationalized software often supports different sets of languages 
> and locales. A software project typically finds support for many 
> locales in the technology stack (e.g. date formatting), while the 
> project may not have the resources to support as many languages. 
> (Support for locales is usually free or cheap; language support 
> varies and can be very expensive.) So quite often a product supports 
> a greater number of locales than translations. Each user is then 
> usually served in their preferred locale. However, the preferred 
> language may not be supported, and then an alternate language would 
> have to be chosen. The language item in my #3 is to indicate this 
> language.
>
>> My concern is that it will be very difficult for people to understand 
>> the separate elements' uses, especially since each of them is then 
>> exposed to the BCP 47 Lookup negotiation mechanism. If we were to 
>> make some changes here it would be to make <i18n:locale> a language 
>> priority list for requests and a single item for responses.
>>
>>   
> Yes, actually I think it can be very difficult to orchestrate services 
> to use appropriate locales for each service and produce the desired 
> behavior as a whole web service application, even after WS-I18N is 
> completed and becomes available for developers. My understanding is 
> that this version of the WS-I18N specification does not define locale 
> negotiation.

I think your understanding is correct. Note that, IIRC, when we 
discussed working on WS-I18N as a "normative deliverable", we 
explicitly excluded locale negotiation as out of scope for this 
document, because of the issues described.
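
For readers unfamiliar with the BCP 47 Lookup mechanism that Addison 
mentions above, here is a rough sketch in Python of the truncation 
scheme described in RFC 4647 (section 3.4). It is not something WS-I18N 
defines; the function name "lookup" and its parameters are my own 
choices for illustration, and the sketch ignores details such as 
extension handling.

```python
def lookup(priority_list, supported, default):
    """Return the first supported tag matching the priority list, else default."""
    # Compare case-insensitively, but return the tag as the service spells it.
    available = {tag.lower(): tag for tag in supported}
    for language_range in priority_list:
        if language_range == "*":
            continue  # the wildcard range is skipped during lookup
        candidate = language_range.lower()
        while candidate:
            if candidate in available:
                return available[candidate]
            # Truncate: drop the last subtag, and also a single-character
            # subtag (such as "x") left dangling at the end.
            subtags = candidate.split("-")[:-1]
            if subtags and len(subtags[-1]) == 1:
                subtags = subtags[:-1]
            candidate = "-".join(subtags)
    return default


# A user prefers Swiss German, then French; the service has "de" and "en".
print(lookup(["de-CH", "fr"], ["de", "en"], "en"))
```

This also covers the case Dan describes, where the preferred language is 
not supported and an alternate one has to be chosen: when nothing in the 
priority list matches, the caller-supplied default is returned.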

Felix

>> 3. (Felix) The examples of locale identifiers should be consistent in 
>> their use of - or _ for separators. Excepting the special values 
>> $default and $neutral, I think we should mandate the use of BCP 47 
>> (i.e. hyphens) here.
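
As a rough illustration of point 3, the sketch below (Python) rewrites 
underscore separators to hyphens while leaving the special values 
$default and $neutral alone. The function name to_bcp47_separators is 
hypothetical, and the sketch only changes separators: it does not 
validate the result as a BCP 47 tag.

```python
# Special values named in the message above; they are passed through as-is.
SPECIAL_VALUES = {"$default", "$neutral"}


def to_bcp47_separators(identifier: str) -> str:
    """Replace "_" with "-" unless the identifier is a special value."""
    if identifier in SPECIAL_VALUES:
        return identifier
    return identifier.replace("_", "-")


for value in ["de_DE", "en-US", "$default"]:
    print(value, "->", to_bcp47_separators(value))
```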
>>
>> 4. Charset, IMO, is a bad idea. I am not sure of a use case for it. 
>> Would it imply that the response should use a specific encoding for 
>> attachments or for the SOAP message? Isn't this the job of 
>> Content-Type? I'm sure we can think of some very specific cases that 
>> imply it, but it strikes me that the best way to discourage Bad 
>> Behavior for this sort of thing is to make people create their own, 
>> separate policy item for encoding management when they need it. 
>> (We've spent years getting people used to the idea that Unicode is a 
>> Good Thing, especially on the wire and that if you need some other 
>> encoding you should transcode to/from it on your end.)
>>   
> I think there are use cases because there are data in native 
> encodings. We promote Unicode at every chance, but in some cases it 
> does seem a better idea not to force them to Unicode. For example, 
> both the consumer and the provider may have a native encoding, and 
> then forcing the service communication to Unicode may seem 
> irrational. I agree it is the job of Content-Type to indicate the 
> charset of the content. This might be used to indicate preferred 
> charsets (reminiscent of Accept-Charset).
>
> Thank you,
> -Dan
>> Best Regards,
>>
>> Addison
>>
>> Addison Phillips
>> Globalization Architect -- Lab126
>>
>> Internationalization is not a feature.
>> It is an architecture.
>>   
>
>
>
Received on Saturday, 14 June 2008 16:35:57 UTC