Re: Comment on LTLI WD from Felix Sasaki on 2006-04-27 (public-i18n-core@w3.org from April to June 2006)

From: Felix Sasaki <fsasaki@w3.org>
Date: Thu, 27 Apr 2006 10:10:13 +0900
To: Mark Davis <mark.davis@icu-project.org>
Cc: Addison Phillips <addison@yahoo-inc.com>, www-i18n-comments@w3.org, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Message-ID: <445019F5.4070602@w3.org>

Also cc'ing public-i18n-core, so that Mary can see the discussion.

During the last i18n core call, Mary was surprised that Mark proposed
time zone as a case of a locale, since in Java it is separate from the
Locale class. Mary also said she would propose a different example, so I
would like to wait a bit for that.
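
For reference, here is a minimal Java sketch of the separation Mary
mentioned. java.util.Locale and java.util.TimeZone are separate classes,
and a java.text.DateFormat takes the two independently. The class and
method names (LocaleVsTimeZone, format) are made up for illustration, and
the zone IDs other than Etc/GMT are my own picks.

```java
import java.text.DateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class LocaleVsTimeZone {

    // Formats one instant for an independently chosen locale and time zone.
    // The locale selects language and formatting conventions; the time zone
    // is set separately on the formatter (hypothetical helper, for illustration).
    static String format(Date instant, Locale locale, TimeZone zone) {
        DateFormat df = DateFormat.getDateTimeInstance(
                DateFormat.LONG, DateFormat.LONG, locale);
        df.setTimeZone(zone);
        return df.format(instant);
    }

    public static void main(String[] args) {
        Date epoch = new Date(0L); // 1970-01-01T00:00:00Z

        // Same locale, different time zones.
        System.out.println(format(epoch, Locale.US, TimeZone.getTimeZone("Etc/GMT")));
        System.out.println(format(epoch, Locale.US, TimeZone.getTimeZone("Asia/Tokyo")));

        // Same time zone, different locales.
        System.out.println(format(epoch, Locale.FRANCE, TimeZone.getTimeZone("Etc/GMT")));
    }
}
```

The exact strings printed depend on the locale data shipped with the JDK,
so I do not quote them here. The point is only that either argument can
change without the other.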

- Felix

Mark Davis wrote:
> 
> I think we need to have a clear discussion about what constitutes a
> locale before progressing further. To my mind, a (language, timezone)
> pair such as (en_US, Etc/GMT) is one of the clearest cases of a locale,
> so I don't know what your mental image of a locale is.
> 
> Addison Phillips wrote:
>> Hi folks! Nice to see this work progressing...
>>
>> ---
>> Section 1.1: The text describing locales is vague and possibly
>> sloppy. I think you would be better off being very clear that RFC
>> 3066 or its successor refers to language identification ONLY. Locales
>> can be inferred from language identifiers (e.g. Accept-Language) or
>> can use identical tags in data items (elements, attributes, headers,
>> etc.) that serve only the purpose of locale identification. This will help
>> preserve (for example) clarity in specs such as XSL F&O where there
>> has never been a locale identifier...
>>
>> Section 1.2: eliminate comma from first sentence.
>>
>> Section 1.2: "However, such formats might apply the definitions made
>> in this specification, see e.g. [LDML]." This sentence is unclear.
>> Change to say: "One possible source of locale data and data formats is
>> [LDML]"??
>>
>> Section 1.3: "Web Service Internationalization" should read "Web
>> services Internationalization"
>>
>> Section 1.3/1.4: Section 1.3 and Section 1.4 should be a single section.
>>
>> Section 2.2: This section mixes languages and locales as if they were
>> the same thing. I think this is dangerous. We spent a lot of time in
>> WSTF building text to deal with this in a purposeful way. Language
>> tags are for languages. Locales can be inferred from language tags
>> (the locale mechanism used inside your programming environment may use
>> very different identifiers, cf. LCIDs). Thus item (2) in the list is
>> wrong.
>>
>> Comment: I think you should import text (with minor editing) from Web
>> Services Usage Scenarios to describe languages and locales and only
>> then launch into values. In particular, I commend you to Section 3.1
>> and Section 3.1.1 of
>> http://www.w3.org/TR/2004/NOTE-ws-i18n-scenarios-20040730
>>
>> Section 2.2: The following is correctly identified as a Bad Thing, but
>> I would suggest you remove it altogether because you suggest that it
>> is sometimes okay to infer this. This is just bad practice or an
>> application assumption ("default currency"). In fact, this is Section
>> I-018 of WSUS
>> (http://www.w3.org/TR/2004/NOTE-ws-i18n-scenarios-20040730/#S-018)
>> "Note that sometimes information is heuristically inferred from
>> language or locale identifiers. For example, software might infer that
>> if the locale is "fr-FR" that the user's preferred currency is EUR.
>> However, that is only a guess because that locale ID does not specify
>> the preferred currency. The user may actually be living in the UK, and
>> do most transactions in GBP"
>>
>> Section 2.2: Example 1: This is a bad example because time zone is
>> always orthogonal to locale (and language). If you're going to say
>> anything about time zones, you should probably require the use of
>> Olson identifiers in specifications (a subject beyond the scope of
>> this document??)
>>
>> Section 2.3: references are to RFC 3066bis? Should be to draft-matching.
>>
>> Section 3: Item 3: Specifications that define operations on language
>> values really should accept both basic and extended ranges. What's
>> important to specify is the matching scheme itself.
>>
>> Item 5: I don't like this item at all. If you want to use an IRI to
>> point to some "information item", fine: that's your own choice and
>> none of our business. But this requirement as written means nothing
>> and will only serve to confuse people. I think you'd be better off
>> sticking with saying something like "use the same format for locale
>> IDs as language tags". If someone can propose a workable IRI solution,
>> you can then incorporate that. The point (I think) is to avoid having
>> nine ways of identifying a locale.
>>
>> Editorial: In the note, this phrase "are conform to these criteria"
>> should say "conformant"
>>
>> General: I really think you should write about language identification
>> and then about inferring locale from it. In particular, I would
>> suggest that you consider adding something like these requirements:
>>
>> - Specifications MUST NOT use the xml:lang attribute to convey locale
>> information. // specs must not promote poor behavior. xml:lang
>> identifies natural language usage in a document.
>>
>> - Specifications MUST define the default behavior for matching of
>> language content (see draft-matching, Section 3.4.1)
>>
>> - Specifications that use HTTP 1.1 SHOULD allow an application to
>> infer a user's locale preferences from the HTTP Accept-Language
>> header. // or something like this, eh?
>>
>> - Specifications that define the exchange of locale information MUST
>> define locale identifiers in terms of RFC 3066bis language tags and
>> MAY define specific extensions or private-use codes to identify
>> additional information. // this is the big one
>>
>> ----
>> As always, my best regards,
>>
>> Addison
>>
>> Addison Phillips
>> Internationalization Architect - Yahoo! Inc.
>>
>> Internationalization is an architecture.
>> It is not a feature.
>>
Received on Thursday, 27 April 2006 01:10:58 UTC