- From: r12a via GitHub <sysbot+gh@w3.org>
- Date: Wed, 30 Sep 2020 13:40:48 +0000
- To: public-i18n-archive@w3.org
r12a has just created a new issue for https://github.com/w3c/ltli:

== Definition of Unicode Locale ==
https://w3c.github.io/ltli/#ref-for-dfn-unicode-locale-1

> Unicode Locale Identifier or Unicode Locale. A language tag that follows the additional processing rules defined by [CLDR] in UTR#35 [LDML]. A Unicode Locale can include a combination of certain language tag extensions ([RFC6067], [RFC6497]), although it is not required to do so.
>
> A Unicode locale provides the ability to specify in a language tag some international preference variations that go beyond linguistic or regional variation or to select formatting behavior or content when there are multiple options or user preferences within a given locale. Unicode locale identifiers are well-formed [BCP47] language tags. [CLDR] also specifies some additional rules about the structure and content of the Unicode Locale's language tag as well as supplying specific interpretation of certain subtags. See Section 3.2 of [LDML] for details.
>
> Unicode's [CLDR] project maintains both of the [BCP47] extensions related to Unicode locales. The Unicode locale language tag extension [RFC6067] uses the -u- subtag, and provides subtags for selecting different locale-based formats and behaviors. See Section 3.6 of [LDML] for details.
>
> The Transformed Content extension [RFC6497], which uses the -t- subtag, provides subtags for text transformations, such as transliteration between scripts. See Section 3.7 of [LDML] for details.
>
> Unicode Locales increasingly form the basis for internationalization on the Web, particularly as part of the Intl locale framework [ECMA-402] in JavaScript [ECMASCRIPT].

This appears to me to say that language tags without -u or -t extensions are not Unicode Locale identifiers, and therefore not suitable for locale identification.

It then goes on to say that

> Content authors SHOULD choose language tags that are canonical Unicode locale identifiers.

and

> Implementations SHOULD only emit language tags that are canonical Unicode locale identifiers ...

Which to me implies that any time anyone uses a language tag, it should include -u and/or -t tags. Which doesn't sound right.

It seems to me that the definition is problematic, and could be changed to say one of the following (I'm being deliberately open here to possibilities):

1. A language tag that follows the LDML rules and includes a canonical Unicode locale identifier (i.e. -u or -t) can be referred to as a Unicode locale identifier.
2. The part of a language tag that includes -u and/or -t and the subtags that follow them is referred to as a Unicode locale identifier.

Whichever is chosen, I think the mustard needs to be crafted in a way that is a little more subtle.

Btw, it may be useful to briefly expand on "the additional processing rules defined by [CLDR] in UTR#35 [LDML]."

Please view or discuss this issue at https://github.com/w3c/ltli/issues/25 using your GitHub account

--
Sent via github-notify-ml as configured in https://github.com/w3c/github-notify-ml-config
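
To make the distinction between the two candidate definitions in the issue concrete, here is a minimal Python sketch that separates a language tag into its base subtags and its extension sequences. The helper names `split_extensions` and `has_unicode_extension` are hypothetical and not taken from LTLI, LDML, or any library. The sketch assumes its input is already a well-formed BCP 47 tag and performs no validation or canonicalization; it only shows which part of a tag each reading would single out.

```python
def split_extensions(tag: str) -> tuple[str, dict[str, list[str]]]:
    """Split a language tag into (base, extensions keyed by singleton).

    Assumption: `tag` is already well-formed BCP 47. Nothing is validated.
    """
    base: list[str] = []
    extensions: dict[str, list[str]] = {}
    current = None  # singleton of the extension being read, if any
    for index, subtag in enumerate(tag.split("-")):
        if current == "x":
            # Everything after the private-use singleton belongs to it.
            extensions["x"].append(subtag)
        elif len(subtag) == 1 and index > 0:
            # A one-character subtag after the first position opens an extension.
            current = subtag.lower()
            extensions[current] = []
        elif current is None:
            base.append(subtag)
        else:
            extensions[current].append(subtag)
    return "-".join(base), extensions


def has_unicode_extension(tag: str) -> bool:
    """True if the tag carries a -u- (RFC 6067) or -t- (RFC 6497) extension."""
    _, extensions = split_extensions(tag)
    return "u" in extensions or "t" in extensions


for example in ("en-US", "de-DE-u-co-phonebk", "und-Cyrl-t-und-latn"):
    print(example, split_extensions(example), has_unicode_extension(example))
```

Under option 1, only tags for which `has_unicode_extension` returns true would be called Unicode locale identifiers. Under option 2, the term would cover just the `-u-` and `-t-` sequences returned in the second element. Either way, a plain tag such as `en-US` has no such sequence, which is the case the issue is concerned about.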
Received on Wednesday, 30 September 2020 13:40:50 UTC