- From: Mark Davis <mark.davis@icu-project.org>
- Date: Thu, 27 Apr 2006 09:49:02 -0700
- To: Felix Sasaki <fsasaki@w3.org>
- CC: Addison Phillips <addison@yahoo-inc.com>, www-i18n-comments@w3.org, "public-i18n-core@w3.org" <public-i18n-core@w3.org>
Part of the problem is that the Java Locale is really misnamed (mea culpa!!). It, like the CLDR locale or the ICU locale, is really a language (with a bit of extra cruft since it wasn't clearly separated originally).

And part of the issue is that "locale" means *such* different things to different people. If you define it as a set of preferences associated with a particular user community, it is overly broad. If you narrow it to a given user community with a shared language and physical location, it gets narrower; but perhaps too narrow. But there is a broad range of interpretation. For a given company, the people in a given postal code might be the granularity they need; or a given tax region (eg city + county + state in the US); or a given timezone.

But physical location is also too narrow, since what you might want is the set of users associated with a given policy (eg subject to US tax law) no matter where they are physically located.

Felix Sasaki wrote:
> cc'ing also to public-i18n-core, so that Mary can see the discussion,
>
> During the last i18n core call, Mary was surprised that Mark proposed
> time zone as a case of a locale, since in Java it is separated from the
> Locale class. Mary said also she would propose a different example, so I
> would like to wait a bit for that.
>
> - Felix
>
> Mark Davis wrote:
>
>> I think we need to have a clear discussion about what constitutes a
>> locale before progressing further. For my mind (language, timezone),
>> such as (en_US, Etc/GMT) is one of the clearest cases of a locale, so I
>> don't know what your mental image of a locale is.
>>
>> Addison Phillips wrote:
>>
>>> Hi folks! Nice to see this work progressing...
>>>
>>> ---
>>> Section 1.1: The text describing locales is vague and/or possibly
>>> sloppy. I think you would be better off being very clear the RFC
>>> 3066/successor refers to language identification ONLY. Locales can be
>>> inferred from language identifiers (i.e. Accept-Language) or use
>>> identical tags in data items (elements, attributes, headers, etc.)
>>> that serve only the purpose of locale identification. This will help
>>> preserve (for example) clarity in specs such as XSL F&O where there
>>> has never been a locale identifier...
>>>
>>> Section 1.2: eliminate comma from first sentence.
>>>
>>> Section 1.2: "However, such formats might apply the definitions made
>>> in this specification, see e.g. [LDML]." This sentence is unclear.
>>> Change to say: "One possible source of locale data and data formats is
>>> [LDML]"??
>>>
>>> Section 1.3: "Web Service Internationalization" should read "Web
>>> services Internationalization"
>>>
>>> Section 1.3/1.4: Section 1.3 and Section 1.4 should be a single section.
>>>
>>> Section 2.2: This section mixes languages and locales as if they were
>>> the same thing. I think this is dangerous. We spent a lot of time in
>>> WSTF building text to deal with this in a purposeful way. Language
>>> tags are for languages. Locales can be inferred from language tags
>>> (the locale mechanism used inside your programming environment may use
>>> very different identifiers, cf. LCIDs). Thus item (2) in the list is
>>> wrong.
>>>
>>> Comment: I think you should import text (with minor editing) from Web
>>> Services Usage Scenarios to describe languages and locales and only
>>> then launch into values. In particular, I commend you to Section 3.1
>>> and Section 3.1.1 of
>>> http://www.w3.org/TR/2004/NOTE-ws-i18n-scenarios-20040730
>>> Section 2.2: The following is correctly identified as a Bad Thing, but
>>> I would suggest you remove it altogether because you suggest that it
>>> is sometimes okay to infer this. This is just bad practice or an
>>> application assumption ("default currency"). In fact, this is Section
>>> I-018 of WSUS
>>> (http://www.w3.org/TR/2004/NOTE-ws-i18n-scenarios-20040730/#S-018)
>>> "Note that sometimes information is heuristically inferred from
>>> language or locale identifiers. For example, software might infer that
>>> if the locale is "fr-FR" that the user's preferred currency is EUR.
>>> However, that is only a guess because that locale ID does not specify
>>> the preferred currency. The user may actually be living in the UK, and
>>> do most transactions in GBP"
>>>
>>> Section 2.2: Example 1: This is a bad example because time zone is
>>> always orthogonal to locale (and language). If you're going to say
>>> anything about time zones, you should probably require the use of
>>> Olson identifiers in specifications (a subject beyond the scope of
>>> this document??)
>>>
>>> Section 2.3: references are to RFC 3066bis? Should be to draft-matching.
>>>
>>> Section 3: Item 3: Specifications that define operations on language
>>> values really should accept both basic and extended ranges. What's
>>> important to specify is the matching scheme itself.
>>>
>>> Item 5: I don't like this item at all. If you want to use an IRI to
>>> point to some "information item", fine: that's your own choice and
>>> none of our business. But this requirement as written means nothing
>>> and will only serve to confuse people. I think you'd be better off
>>> sticking with saying something like "use the same format for locale
>>> IDs as language tags". If someone can propose a workable IRI solution,
>>> you can then incorporate that. The point (I think) is to avoid having
>>> nine ways of identifying a locale.
>>>
>>> Editorial: In the note, this phrase "are conform to these criteria"
>>> should say "conformant"
>>>
>>> General: I really think you should write about language identification
>>> and then about inferring locale from it. In particular, I would
>>> suggest that you consider adding something like these requirements:
>>>
>>> - Specifications MUST NOT use the xml:lang attribute to convey locale
>>> information. // specs must not promote poor behavior. Xml:lang
>>> identifies natural language usage in a document.
>>>
>>> - Specifications MUST define the default behavior for matching of
>>> language content (see draft-matching, Section 3.4.1)
>>>
>>> - Specifications that use HTTP 1.1 SHOULD allow an application to
>>> infer a user's locale preferences from the HTTP Accept-Language
>>> header. // or something like this, eh?
>>>
>>> - Specifications that define the exchange of locale information MUST
>>> define locale identifiers in terms of RFC 3066bis language tags and
>>> MAY define specific extensions or private-use codes to identify
>>> additional information. // this is the big one
>>>
>>> ----
>>> As always, my best regards,
>>>
>>> Addison
>>>
>>> Addison Phillips
>>> Internationalization Architect - Yahoo! Inc.
>>>
>>> Internationalization is an architecture.
>>> It is not a feature.
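
To make the distinction in this thread concrete, here is a minimal sketch in Python of inferring a language preference from an HTTP Accept-Language header while keeping time zone and currency as separate, unguessed values. The function names (parse_accept_language, infer_preferences) and the dictionary keys are hypothetical, chosen only for illustration; nothing here comes from the draft under review, and the parsing is a simplification of the header syntax rather than a full implementation.

```python
def parse_accept_language(header):
    """Return the language ranges in an Accept-Language value, best first.

    Simplified sketch: handles "tag" and "tag;q=value" items only.
    """
    ranked = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        params = params.strip()
        quality = 1.0  # a missing q parameter means full preference
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # unparseable weight: treat as not acceptable
        if quality > 0:
            # Sort by descending quality, then by original position.
            ranked.append((-quality, position, tag.strip()))
    ranked.sort()
    return [tag for _, _, tag in ranked]


def infer_preferences(header, time_zone=None):
    """Infer only what the header can support (hypothetical helper)."""
    tags = parse_accept_language(header)
    return {
        "language": tags[0] if tags else None,
        # Time zone is orthogonal to language; it must be supplied
        # separately, e.g. as an Olson identifier such as "Europe/London".
        "time_zone": time_zone,
        # A tag like "fr-FR" does not specify a currency, so none is guessed.
        "currency": None,
    }
```

For example, infer_preferences("fr-FR, en;q=0.8", time_zone="Europe/London") yields the language "fr-FR" and the supplied time zone, and leaves the currency unset rather than assuming EUR.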
Received on Thursday, 27 April 2006 16:49:31 UTC