- From: Dan Chiba <dan.chiba@oracle.com>
- Date: Thu, 19 Jun 2008 15:23:17 -0700
- To: Felix Sasaki <fsasaki@w3.org>
- CC: "Phillips, Addison" <addison@amazon.com>, "www-international@w3.org" <www-international@w3.org>
Felix Sasaki wrote:
>
> Dan Chiba wrote:
>>
>> Phillips, Addison wrote:
>>>>> On the other hand, does it make sense to advertise that a Web
>>>>> service supports a locale that it has no messages for? If the
>>>>> service normally has no user interface ("formatDate", "addInts",
>>>>> "sortStrings"), then the list of available locales might very well
>>>>> match the complete set available in the API. At the other end of
>>>>> the spectrum are AJAX interactions that build the UI in real time.
>>>>> Then only the messages that you actually have available are useful
>>>>> to advertise.
>>>>
>>>> I think it makes sense to advertise the set of supported locales.
>>>>
>>>
>>> That would tend to be the point of this work: we provide a way to
>>> say any of the following:
>>>
>>> - this service is locale-neutral; you may specify a locale, but it
>>>   doesn't do anything to the service
>>> - this service has a specific default locale that it uses ("it is
>>>   always in German"); the user can specify whatever they want, but
>>>   the service always uses this one
>>> - this service has some specific (and specified) list of available
>>>   locales (and by inference some default); the user may specify the
>>>   locale to use and the service will do its best to match it from the
>>>   specified list
>>> - this service is locale sensitive; the user may specify the locale
>>>   to use and the service will do its best to match it, noting that a
>>>   list is not provided
>>>
>> I think it is very desirable to provide a way to discover supported
>> locales as well. Then it would be possible for the service consumer
>> to specify the desired locale, knowing the locale will be used for
>> the service operation. Generally, the locale should be determined
>> based on the policy
>
> the WS Policy framework does not provide a policy negotiation
> mechanism. I would be very reluctant to spend time on developing such
> a mechanism.
> Although I understand your desire, I don't think that we
> should spend time on this (see my remark on timing below).
>
I agree, let's not cover this point at this time.

Regards,
-Dan

>
>> defined by the consumer, not by the provider. Otherwise the resulting
>> behavior would become unpredictable; likely to result in user
>> experiences with mixed languages.
>>>> It may be the list of available translation languages, formatting
>>>> locales, those locales for which linguistic sorting behavior is
>>>> supported, or something similar.
>>>>
>>>
>>> Yes, and we need to support the service implementer making the
>>> decision about which pattern to advertise and/or use. You and I
>>> might choose entirely different criteria for choosing how we
>>> advertise locale support for a given service.
>>>
>>>> Because a service cannot determine the appropriate locale for the
>>>> locale sensitive service operation, it needs to be made possible
>>>> for the service consumer to discover what locale is supported, in
>>>> order for the application to produce the desired UI behavior.
>>>>
>>>
>>> I agree, with a nit:
>>>
>>> - sometimes it doesn't make sense to list everything that is
>>>   available. Sometimes it is better (consumes less bandwidth,
>>>   processing, etc.) to say: "I'll do my best to match your request".
>>>   This can even make sense when the list is quite short.
>>>
>> I agree it is sometimes unnecessary for consumers to know what
>> locales are supported. In other cases, as mentioned, people may
>> dislike mixed languages in the UI and an application needs to control
>> the locales in which the service operates. Suppose an application UI
>> had three sections, each presenting text information from a different
>> service; the user experience may be better if their language is the
>> same. If the information is dated, the date format would be expected
>> to be consistent.
>>>>> My concern here is that many services fall into a sort of middle
>>>>> category: they can service many locales, but only have a limited
>>>>> set of localizations. Messages from the services are necessarily
>>>>> constrained to the smaller set, while the service might actually be
>>>>> useful for a larger set.
>>>>
>>>> Yes, that is why we think the translation locale and the locale for
>>>> other purposes should be identified separately.
>>>>
>>>
>>> I understand that's your intent. However I think this will confuse
>>> the vast preponderance of developers who have only a very rough idea
>>> what a "locale" or a "language" is. There are also different ways
>>> that services can be provisioned. It may not be possible to
>>> enumerate one list or the other easily. Having two things that do
>>> roughly the same thing doesn't seem that useful to me. How often do
>>> you actually set LC_MESSAGES separately from LC_ALL?
>>>
>> Whenever a user's preferred locale is supported but preferred
>> language is not, he or she would set LC_MESSAGES explicitly. This is
>> often needed because the set of supported translation languages is
>> usually small. I wonder if you also mean LC_MESSAGES is confusing or
>> not needed because it won't be set often and does not seem so useful.
>>
>> Because the sets of supported languages and locales are usually
>> different, they are practically different. I do agree they are
>> conceptually roughly the same. However, in reality, service consumers
>> are usually interested in serving the user in their most preferred
>> available language and locale, but this is hard to achieve without
>> specifying the locale and language separately. Both the user's
>> preferred locale and language should be honored, but too often
>> language resources are not available and an alternative language
>> different from the language deduced from the preferred locale ought
>> to be used instead.
>> This alternative language needs to be identified
>> and this is why #3 language (and LC_MESSAGES) is needed.
>>>>> My tendency is still to think that this is "locale" and not
>>>>> "language". It looks like a bug to get a message like: "There were
>>>>> « 1 234 » entries sorted on 14 juin." Where the locale was clearly
>>>>> one thing and the messages in another language.
>>>>
>>>> Having both #1 locale and #3 language does not mean that would
>>>> produce the odd message. If using the same locale for the message
>>>> formatting is a requirement, the component can use #3 language
>>>> alone to make the message locale consistent.
>>>>
>>>
>>> But this is inconsistent with the design of WS-I18N, where "locale"
>>> is the "big knob". I tend to think that relatively few people would
>>> know how to write an application like this.
>>>
>> WS-I18N needs a little knob to deal with the fact that translation
>> resources are missing in many use cases. Usually "locale" identifies
>> the user's preferred locale, which is usually supported. "language"
>> may be deduced from "locale", however, support for the preferred
>> language is often not available, so the alternative language must be
>> identified.
>>> A better solution might be: if we provide a list of available
>>> locales, we can provide an additional attribute to indicate which
>>> ones have been provisioned with messages. For example:
>>>
>>> <i18n:locale>
>>>   <i18n:option default="true" localized="true">en-GB</i18n:option>
>>>   <i18n:option localized="true">de</i18n:option>
>>>   <i18n:option>fr</i18n:option>
>>> </i18n:locale>
>>>
>>> Here the default locale is "en-GB". German ("de") is also available,
>>> with localizations, as is French ("fr"), sans localization.
>>>
>>> A request could come in as something like:
>>>
>>> <i18n:locale>en-US,de-CH-1994,fr</i18n:locale>
>>>   <!-- in this case, it matches "de" -->
>>>
>>> Or perhaps:
>>>
>>> <i18n:locale>en,zh-yue,ja-JP</i18n:locale>
>>>   <!-- in this case you get en-GB as the default -->
>>>
>>> And finally:
>>>
>>> <i18n:locale>fr-FR</i18n:locale>
>>>   <!-- you get French locale behavior, but probably en-GB messages;
>>>        no "fr" is available -->
>>>
>>>>> What is missing in the current version is that we don't provide:
>>>>>
>>>>> - a way to enumerate the available items
>>>>> - a way to specify the complete set of preferences
>>>>> - a reference to RFC 4647 Lookup (that is, locale-based resource
>>>>>   negotiation)
>>>>
>>>> I agree. Again my understanding is that these are to be provided as
>>>> a separate document or a future revision.
>>>>
>>>
>>> Note that WS-I18N in its current incarnation is exactly the second
>>> draft. W3C's first version (2005-09-14) was taken from a trial
>>> balloon I wrote. At that time there was no Lookup algorithm, no LTLI
>>> (okay, there still isn't an LTLI, but that's something to fix), not
>>> much in the way of LDML, and RFC 4646 was still an Internet-Draft
>>> (with several to follow). With these items available to us, we
>>> should do the work to get WS-I18N right (it's actually a fairly
>>> minor set of revisions required, IMO).
>>>
>> I thought that may be months of work. If a comprehensive solution
>> could be included in the next version, that would be great.
>
> I don't see a comprehensive solution yet, although there seems to be
> some rough consensus coming up in this thread. So it's hard to make
> time planning at the moment.
>
> Felix
>
>>>
>>>>> I don't say that Unicode is forced upon people (although using
>>>>> SOAP is mighty close to forcing UTF-8). What I'm saying is that, as
>>>>> a parameter, it usually doesn't make a lot of sense.
>>>>> The data often has to be transcoded for the benefit of (for
>>>>> example) the XML processor anyway. The fact that data exists as
>>>>> some legacy encoding affects the results or operation of the
>>>>> service itself (you still can't store Japanese character data in a
>>>>> WE8ISO8859P1 database even if the Web service layer permits you to
>>>>> send it some). But it's not necessarily something that one can
>>>>> usefully specify at the service layer.
>>>>>
>>>>> Anyway, I don't want to sound completely absolutist here. I know
>>>>> what kinds of cases you're thinking of and think they have merit.
>>>>
>>>> I do agree character set is generally not so useful as other
>>>> elements and its use is not encouraged. I just think a character
>>>> set is considered as one of the elements of a locale and some
>>>> people may find it useful if WS-I18N defines how to indicate it.
>>>>
>>>
>>> Character set is considered one of the elements of *some* locale
>>> systems. The question is: what does this parameter do or mean? If I
>>> have a <i18n:charset>ISO8859_1</i18n:charset> in my service's
>>> WS-Policy, does that mean I should transcode my SOAP request to
>>> Latin-1? Am I limited to Latin-1 characters in my request? Will I
>>> only receive Latin-1 characters in the response? The charset
>>> limitation may occur on several different levels of the system or it
>>> may simply be an assertion about the data.
>>>
>>> Since most developers wouldn't know what an encoding was if it grew
>>> legs and bit them, that makes me wary. If nothing else, we need to
>>> put a big Health Warning sticker on it :-).
>>>
>>>> If a requester is only interested in getting responses in a
>>>> specific native character set (e.g.
>>>> the response will be processed in a component
>>>> that can only process a native encoding, or it will be stored in a
>>>> database that can only store a specific native character set), the
>>>> service could filter the response based on this information.
>>>>
>>>
>>> Degrading the data early is usually a bad option :-). Converting the
>>> data from the UTF-8 used in the transport layer to the local
>>> encoding is usually an effective enough filter---and YOUR code did
>>> it, not my beautiful, pristine service <g>. This, for example, is
>>> true when you find out that "ISO 8859-1" sometimes means
>>> "windows-1252"... but sometimes it doesn't.
>>>
>>> Anyway, I digress. We can probably find a way to accommodate
>>> 'charset'. All I'm saying is: how we do it is important.
>>>
>> All right, I think how i18n:charset can be useful needs to be
>> examined and I am not sure whether it is truly valuable. Due to this
>> lack of support, users who need to deal with a native encoding may
>> decide to migrate to Unicode and that can be a good thing. :-) Please
>> let me withdraw the idea of the <charset> element so we can better
>> focus on the other items.
>>
>> Regards,
>> -Dan
>>> Addison
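
The three request examples quoted in the thread (matching "de", falling back to the "en-GB" default, and getting "fr" behavior with "en-GB" messages) can be traced with a small sketch of RFC 4647 Lookup. This is only an illustration under stated assumptions and is not part of WS-I18N: the names `lookup`, `negotiate`, `OPTIONS` and `DEFAULT` are hypothetical, and running Lookup a second time over only the options marked localized="true" is one possible way to choose the message language, not something the thread specifies.

```python
# Hypothetical model of the advertised <i18n:option> list from the thread:
# (language tag, whether localized messages exist for it).
OPTIONS = [("en-GB", True), ("de", True), ("fr", False)]
DEFAULT = "en-GB"  # the option marked default="true"


def lookup(ranges, available, default):
    """RFC 4647 Lookup: try each range in priority order, progressively
    truncating it from the right until it equals an available tag."""
    by_lower = {tag.lower(): tag for tag in available}
    for rng in ranges:
        subtags = rng.strip().lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in by_lower:
                return by_lower[candidate]
            subtags.pop()
            # RFC 4647 section 3.4: a trailing single-character subtag
            # is removed together with the subtag that followed it.
            if subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


def negotiate(request):
    """Return (locale, message_language) for a comma-separated request.
    Assumption: messages come from a second Lookup over localized options."""
    ranges = request.split(",")
    all_tags = [tag for tag, _ in OPTIONS]
    localized = [tag for tag, has_messages in OPTIONS if has_messages]
    return (lookup(ranges, all_tags, DEFAULT),
            lookup(ranges, localized, DEFAULT))


if __name__ == "__main__":
    for request in ("en-US,de-CH-1994,fr", "en,zh-yue,ja-JP", "fr-FR"):
        print(request, "->", negotiate(request))
```

Note that under Lookup the range "en" does not match the tag "en-GB", because truncation only shortens the requested range and never extends it. That is why the second request falls through to the default rather than matching "en-GB" directly.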
Received on Thursday, 19 June 2008 22:23:52 UTC