Re: [Comment on WS-I18N WD]

Hi Addison,

Thank you for looking over my comments. Please see my responses below:

Phillips, Addison wrote:
> Hi Dan,
>
> (Chair/Editor hats off. These are personal comments.)
>
> I've looked over your comments. In general I'm okay with them, 
> although I do have a few things to note:
>
> 1. The existing document has an <i18n:preferences> element that, among 
> other things, can contain some of the information you are looking for. 
> In particular, it already contains items 6-9 on your list. The point 
> of having a preferences element, in my mind, was to provide access to 
> these specific settings for cases where one needs such detail. Can you 
> better enumerate why these should be promoted to full-fledged elements?
>   
Explicitly specifying how to indicate the common details will help 
achieve interoperability and promote use of this mechanism. LDML is 
designed to describe and exchange locale definitions, not to indicate 
locale settings, so it is complex and becomes ambiguous when it is 
used under the <i18n:preferences> element.

For example, how to identify a collation is unclear. In the current 
draft, German phonebook collation is represented as follows:

   <i18n:preferences>
     <ldml:collation>
       <ldml:alias source="de_DE" type="phonebook"/>
     </ldml:collation>
   </i18n:preferences>

A corresponding example in LDML looks like this:

   <collation type="phonebook">
     <alias source="de_DE"/>
   </collation>

First of all, the type attribute is in a different place, although I 
suppose the two should match. Even if they did match, the pair of 
attribute values "de_DE" and "phonebook" would still not identify a 
specific collation. i18n:preferences/ldml:collation does not carry 
collation definition data, so it is ambiguous what "de_DE" and 
"phonebook" mean. I think the collation registry should be used.

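To show why the receiver needs to know exactly which collation is 
meant, here is a minimal Python sketch. It assumes a deliberately 
simplified model of two German orderings (umlauts folded to the base 
vowel versus expanded to vowel plus "e"). The function names are 
hypothetical and this is not a real collation implementation.

```python
# Simplified illustration only: real collations have many more rules.
# Both key functions are hypothetical helpers, not part of LDML or WS-I18N.

def dictionary_key(word):
    # Assumed "dictionary" style: an umlaut sorts like its base vowel.
    table = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
    return word.lower().translate(table)

def phonebook_key(word):
    # Assumed "phonebook" style: an umlaut sorts like vowel + "e".
    table = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})
    return word.lower().translate(table)

names = ["Muller", "Müller", "Mukherjee"]

dictionary_order = sorted(names, key=dictionary_key)
phonebook_order = sorted(names, key=phonebook_key)

print(dictionary_order)
print(phonebook_order)
```

The same three names come back in a different order depending on which 
rule set is applied, so a consumer and a provider that interpret 
"phonebook" differently would disagree on sorted results.
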
Here is another ambiguous scenario. What date format is this? What 
length is it (short, medium, long, or full)? What calendar does it 
apply to? Is this a valid example in the first place?

   <i18n:preferences>
    <ldml:dateFormat>
     <ldml:pattern>MMMM d, yyyy</ldml:pattern>
    </ldml:dateFormat>
   </i18n:preferences>
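
As a rough illustration of how little the bare pattern pins down, the 
following Python sketch renders it with two different month-name 
tables. The render function and the MONTHS table are hypothetical, 
only the three fields in the example pattern are handled, and a 
Gregorian calendar is assumed because the pattern itself does not say.

```python
import re
from datetime import date

# Hypothetical month-name tables; the pattern does not say which applies.
MONTHS = {
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
    "de": ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
           "August", "September", "Oktober", "November", "Dezember"],
}

def render(pattern, d, language):
    # Substitute each known field in a single pass so that letters inside
    # an inserted month name are never re-interpreted as pattern fields.
    fields = {
        "MMMM": MONTHS[language][d.month - 1],
        "yyyy": f"{d.year:04d}",
        "d": str(d.day),
    }
    return re.sub(r"MMMM|yyyy|d", lambda m: fields[m.group(0)], pattern)

sample = date(2008, 6, 14)
english = render("MMMM d, yyyy", sample, "en")
german = render("MMMM d, yyyy", sample, "de")
print(english)
print(german)
```

The pattern is identical in both calls, yet the output differs, and 
nothing in the element says which length category or calendar the 
sender had in mind.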

> 2. I've seen requests for a UI language separate from locale before, 
> but I'm not sure that they make a lot of sense. Which takes 
> precedence? What does it mean to have a German locale but French UI 
> messages? Other than writing I18N demos, what use case do you have for 
> this?
>   
Internationalized software often supports different sets of languages 
and locales. A software project typically finds support for many 
locales in the technology stack (e.g. date formatting), while the 
project may not have the resources to support as many languages. 
(Support for locales is usually free or cheap, while the cost of 
language support varies and can be very high.) So quite often a 
product supports more locales than translations. Each user is usually 
served in their preferred locale. However, the preferred language may 
not be supported, and then an alternate language has to be chosen. The 
language item in my #3 is to indicate this language.
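
A minimal Python sketch of this situation follows. The sets of 
supported locales and translations, the default values, and the 
choose_settings function are all hypothetical, and the language is 
taken naively from the first subtag rather than by a real BCP 47 
lookup.

```python
# Hypothetical product: many formatting locales, few translations.
SUPPORTED_LOCALES = {"de-CH", "de-DE", "fr-FR", "en-US", "ja-JP"}
SUPPORTED_UI_LANGUAGES = ["en", "ja"]  # first entry is the fallback

def choose_settings(requested_locale):
    # Formatting (dates, numbers) follows the requested locale when the
    # technology stack supports it; otherwise an assumed default is used.
    if requested_locale in SUPPORTED_LOCALES:
        format_locale = requested_locale
    else:
        format_locale = "en-US"

    # UI messages need a translation, which may not exist for the language.
    language = requested_locale.split("-")[0]
    if language in SUPPORTED_UI_LANGUAGES:
        ui_language = language
    else:
        ui_language = SUPPORTED_UI_LANGUAGES[0]

    return format_locale, ui_language

german_user = choose_settings("de-DE")
japanese_user = choose_settings("ja-JP")
print(german_user)
print(japanese_user)
```

The German user gets German formatting with English messages, which is 
the combination a separate language item would let a service state 
explicitly.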

> My concern is that it will be very difficult for people to understand 
> the separate element's uses, especially since each of them is then 
> exposed to the BCP 47 Lookup negotiation mechanism. If we were to make 
> some changes here it would be to make <i18n:locale> a language 
> priority list for requests and a single-item for responses.
>
>   
Yes, I actually think it can be very difficult to orchestrate services 
so that each one uses an appropriate locale and the web service 
application as a whole produces the desired behavior, even after 
WS-I18N is completed and becomes available to developers. My 
understanding is that this version of the WS-I18N specification does 
not define locale negotiation.

> 3. (Felix) The examples of locale identifiers should be consistent in 
> their use of - or _ for separators. Excepting the special values 
> $default and $neutral, I think we should mandate the use of BCP 47 (ie 
> hyphens) here.
>
> 4. Charset, IMO, is a bad idea. I am not sure of a use case for it. 
> Would it imply that the response should use a specific encoding for 
> attachments or for the SOAP message? Isn't this the job of 
> Content-Type? I'm sure we can think of some very specific cases that 
> imply it, but it strikes me that the best way to discourage Bad 
> Behavior for this sort of thing is to make people create their own, 
> separate policy item for encoding management when they need it. (We've 
> spent years getting people used to the idea that Unicode is a Good 
> Thing, especially on the wire and that if you need some other encoding 
> you should transcode to/from it on your end.)
>   
I think there are use cases, because there is data in native 
encodings. We promote Unicode at every opportunity, but in some cases 
it does seem better not to force Unicode. For example, both the 
consumer and the provider may use a native encoding, in which case 
forcing the service communication into Unicode may seem irrational. I 
agree it is the job of Content-Type to indicate the charset of the 
content. This item might instead be used to indicate preferred 
charsets (similar to Accept-Charset).
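
A preference of this kind could be resolved roughly as in the Python 
sketch below. The pick_charset function, its parameters, and the UTF-8 
fallback are assumptions for illustration, not something defined by 
WS-I18N or by Accept-Charset.

```python
def pick_charset(preferred, supported, default="UTF-8"):
    # Charset names are compared case-insensitively.
    by_lower = {name.lower(): name for name in supported}
    # Walk the consumer's preferences in order and take the first match.
    for name in preferred:
        match = by_lower.get(name.lower())
        if match is not None:
            return match
    # Assumed policy: fall back to Unicode when nothing matches.
    return default

shared = pick_charset(["shift_jis", "UTF-8"],
                      ["Shift_JIS", "EUC-JP", "UTF-8"])
no_match = pick_charset(["ISO-8859-15"], ["Shift_JIS", "EUC-JP"])
print(shared)
print(no_match)
```

The actual encoding of any response would still be declared through 
Content-Type; the preference only influences which encoding the 
provider chooses.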

Thank you,
-Dan
> Best Regards,
>
> Addison
>
> Addison Phillips
> Globalization Architect -- Lab126
>
> Internationalization is not a feature.
> It is an architecture.
>   

Received on Saturday, 14 June 2008 08:21:27 UTC