Re: Comment on LTLI WD from Mark Davis on 2006-04-27 (www-i18n-comments@w3.org from April 2006)

From: Mark Davis <mark.davis@icu-project.org>
Date: Wed, 26 Apr 2006 17:41:35 -0700
To: Addison Phillips <addison@yahoo-inc.com>
CC: www-i18n-comments@w3.org
Message-ID: <4450133F.30507@icu-project.org>
I think we need to have a clear discussion about what constitutes a 
locale before progressing further. To my mind, a (language, timezone)
pair such as (en_US, Etc/GMT) is one of the clearest cases of a locale, so I
don't know what your mental image of a locale is.

Addison Phillips wrote:
> Hi folks! Nice to see this work progressing...
>
> ---
> Section 1.1: The text describing locales is vague and/or possibly sloppy. I think you would be better off being very clear that RFC 3066 (or its successor) refers to language identification ONLY. Locales can be inferred from language identifiers (e.g. Accept-Language) or use identical tags in data items (elements, attributes, headers, etc.) that serve only the purpose of locale identification. This will help preserve (for example) clarity in specs such as XSL F&O where there has never been a locale identifier...
>
> Section 1.2: eliminate comma from first sentence.
>
> Section 1.2: "However, such formats might apply the definitions made in this specification, see e.g. [LDML]." This sentence is unclear. Change to say: "One possible source of locale data and data formats is [LDML]"??
>
> Section 1.3: "Web Service Internationalization" should read "Web services Internationalization"
>
> Section 1.3/1.4: Section 1.3 and Section 1.4 should be a single section.
>
> Section 2.2: This section mixes languages and locales as if they were the same thing. I think this is dangerous. We spent a lot of time in WSTF building text to deal with this in a purposeful way. Language tags are for languages. Locales can be inferred from language tags (the locale mechanism used inside your programming environment may use very different identifiers, cf. LCIDs). Thus item (2) in the list is wrong.
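
To illustrate the distinction drawn above, here is a minimal Python sketch of inferring an environment-specific locale identifier from a language tag. The function names and the small LCID table are hypothetical and exist only for this example. Only plain "language" and "language-REGION" tags are handled, and the result is an inference, not something the language tag itself states.

```python
from typing import Optional

# Hypothetical lookup table for illustration; a real environment has its own,
# much larger mapping from language tags to Windows LCIDs.
WINDOWS_LCIDS = {
    "en-us": 0x0409,
    "fr-fr": 0x040C,
    "de-de": 0x0407,
    "ja-jp": 0x0411,
}


def infer_posix_locale(language_tag: str) -> str:
    """Infer a POSIX-style locale name (e.g. 'en_US') from a language tag.

    Assumption: only 'language' and 'language-REGION' shapes are understood.
    For any other shape (script, variant, extension subtags) only the
    language subtag is kept.
    """
    subtags = language_tag.split("-")
    language = subtags[0].lower()
    if len(subtags) >= 2 and len(subtags[1]) == 2 and subtags[1].isalpha():
        return f"{language}_{subtags[1].upper()}"
    return language


def infer_lcid(language_tag: str) -> Optional[int]:
    """Look up an LCID for the tag, or None when the table has no entry."""
    return WINDOWS_LCIDS.get(language_tag.lower())
```

The same language tag yields two different locale identifiers here ("fr_FR" and 0x040C), which is the sense in which the tag identifies a language and the locale is derived from it.
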
>
> Comment: I think you should import text (with minor editing) from Web Services Usage Scenarios to describe languages and locales and only then launch into values. In particular, I commend you to Section 3.1 and Section 3.1.1 of http://www.w3.org/TR/2004/NOTE-ws-i18n-scenarios-20040730 
>
> Section 2.2: The following is correctly identified as a Bad Thing, but I would suggest you remove it altogether because you suggest that it is sometimes okay to infer this. This is just bad practice or an application assumption ("default currency"). In fact, this is Section I-018 of WSUS (http://www.w3.org/TR/2004/NOTE-ws-i18n-scenarios-20040730/#S-018) 
>
> "Note that sometimes information is heuristically inferred from language or locale identifiers. For example, software might infer that if the locale is "fr-FR" that the user's preferred currency is EUR. However, that is only a guess because that locale ID does not specify the preferred currency. The user may actually be living in the UK, and do most transactions in GBP"
>
> Section 2.2: Example 1: This is a bad example because time zone is always orthogonal to locale (and language). If you're going to say anything about time zones, you should probably require the use of Olson identifiers in specifications (a subject beyond the scope of this document??)
>
> Section 2.3: references are to RFC 3066bis? Should be to draft-matching.
>
> Section 3: Item 3: Specifications that define operations on language values really should accept both basic and extended ranges. What's important to specify is the matching scheme itself.
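
As a rough illustration of what accepting both kinds of range involves, the following Python sketch implements basic and extended filtering as I understand them from the matching draft (later published as RFC 4647). The function names are my own, and neither function validates that its inputs are well-formed ranges or tags.

```python
def basic_filter(language_range: str, tag: str) -> bool:
    """Basic filtering: the range equals the tag, or is a prefix of it that
    ends on a subtag boundary. The range '*' matches every tag."""
    rng, tag = language_range.lower(), tag.lower()
    return rng == "*" or tag == rng or tag.startswith(rng + "-")


def extended_filter(language_range: str, tag: str) -> bool:
    """Extended filtering: '*' may appear as any subtag of the range, and
    non-singleton subtags of the tag may be skipped while matching."""
    rng = language_range.lower().split("-")
    subtags = tag.lower().split("-")
    # The first subtags must match unless the range starts with a wildcard.
    if rng[0] != "*" and rng[0] != subtags[0]:
        return False
    r, t = 1, 1
    while r < len(rng):
        if rng[r] == "*":
            r += 1                      # a wildcard consumes nothing in the tag
        elif t >= len(subtags):
            return False                # the range wants more than the tag has
        elif rng[r] == subtags[t]:
            r += 1
            t += 1
        elif len(subtags[t]) == 1:
            return False                # never skip past a singleton such as 'x'
        else:
            t += 1                      # skip a non-matching subtag in the tag
    return True
```

For example, the range "de-DE" does not match the tag "de-Latn-DE" under basic filtering but does under extended filtering, so a specification that names only one scheme gives different results from one that names the other.
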
>
> Item 5: I don't like this item at all. If you want to use an IRI to point to some "information item", fine: that's your own choice and none of our business. But this requirement as written means nothing and will only serve to confuse people. I think you'd be better off sticking with saying something like "use the same format for locale IDs as language tags". If someone can propose a workable IRI solution, you can then incorporate that. The point (I think) is to avoid having nine ways of identifying a locale.
>
> Editorial: In the note, the phrase "are conform to these criteria" should say "conformant" instead of "conform".
>
> General: I really think you should write about language identification and then about inferring locale from it. In particular, I would suggest that you consider adding something like these requirements:
>
> - Specifications MUST NOT use the xml:lang attribute to convey locale information. // specs must not promote poor behavior. xml:lang identifies natural language usage in a document.
>
> - Specifications MUST define the default behavior for matching of language content (see draft-matching, Section 3.4.1)
>
> - Specifications that use HTTP 1.1 SHOULD allow an application to infer a user's locale preferences from the HTTP Accept-Language header. // or something like this, eh?
>
> - Specifications that define the exchange of locale information MUST define locale identifiers in terms of RFC 3066bis language tags and MAY define specific extensions or private-use codes to identify additional information. // this is the big one
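
The Accept-Language requirement above can be sketched as follows. This is a minimal Python example, with a function name of my own choosing, that turns a header value into an ordered list of language ranges. It assumes a simple "range;q=value" syntax and treats a malformed q-value as 0. What it returns are language preferences; how an application infers locale settings from them is left to the application.

```python
from typing import List


def parse_accept_language(header: str) -> List[str]:
    """Return the language ranges in an Accept-Language header value,
    most preferred first. Ranges with q=0 are dropped."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        language_range, _, params = item.partition(";")
        params = params.strip()
        quality = 1.0  # a missing q-value means full preference
        if params.lower().startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # assumption: treat malformed weights as "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by original position in the header.
            weighted.append((-quality, position, language_range.strip()))
    return [language_range for _, _, language_range in sorted(weighted)]
```

The first entry of the result is only a starting point for inferring a locale, and a user's explicitly stored preference should still be able to override it.
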
>
> ----
> As always, my best regards,
>
> Addison
>
> Addison Phillips
> Internationalization Architect - Yahoo! Inc.
>
> Internationalization is an architecture.
> It is not a feature. 
>
Received on Thursday, 27 April 2006 00:48:44 UTC