- From: Felix Sasaki <fsasaki@w3.org>
- Date: Mon, 23 Jul 2012 17:10:07 +0200
- To: "Phillips, Addison" <addison@lab126.com>
- Cc: Mark Davis ☕ <mark@macchiato.com>, "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, "www-international@w3.org" <www-international@w3.org>
- Message-ID: <CAL58czqSLPir0VK+BUrq-Q6Cai77g9HSLhabhpTZeKeHDD6DkA@mail.gmail.com>
Hi Addison, all,

coming back to the "locale" definition in ITS 2.0 we had discussed a while ago: Shaun McCance from the MLW-LT working group has created a locale filter definition, see
http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#LocaleFilter

I think this implements 1-4 below - do you want to have a look?

Thanks,

Felix

2012/6/26 Phillips, Addison <addison@lab126.com>

> Hi Felix,
>
> You missed: “remove the hyphen-to-underscore conversion” :-). Otherwise, looks like what we’d suggested.
>
> Addison
>
> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
> *Sent:* Tuesday, June 26, 2012 5:59 AM
> *To:* Mark Davis ☕; Phillips, Addison; public-multilingualweb-lt@w3.org; www-international@w3.org
> *Subject:* Re: BCP 47 "t" extension follow up and locale identifier definition
>
> (Apologies for cross-posting and thanks to Addison for pointing out www-international),
>
> Thanks for your feedback, Addison and Mark. To summarize the main points:
>
> 1) We use BCP 47 language tags in a dedicated piece of markup, e.g.
> <span its:filterLocale="de-ch,fr-ch,it-ch">Swiss legal notice, only to be taken into account for localization into a Swiss locale</span>
> 2) We use a comma as the delimiter instead of a semicolon, see the "span" element above.
> 3) We need to make the relation to BCP 47 filtering clear.
> 4) We don't need text to point out the "u" extension - people may or may not use it, but if we go for BCP 47 people can use any extension they want.
> 5) WRT the tags that Mark mentioned in 1. below: are the "transform" XML files here http://unicode.org/cldr/trac/browser/tags/release-21-0-2/common/bcp47 the currently registered fields for transforms?
>
> Felix
>
> 2012/6/25 Mark Davis ☕ <mark@macchiato.com>
>
> > Since the "t" extension is also meant to express process related information, we want to coordinate the values that can be used via that extension with what we define - or just refer to them. What would be the best way to achieve this?
>
> There are two possible ways that would work.
>
> 1. Reference LDML for the tags, and propose registrations for any additional ones you need.
> 2. Do #1, but because you have a separate field (that doesn't have to be a BCP47 tag), you can reserve strings that could not be BCP47 subtags for your own use.
>
> We do something similar for short TZ identifiers. We use UN LOCODE codes where they exist; where they don't, we use codes that are longer or shorter so that they will not collide with future UN LOCODE codes.
>
> > locales...
>
> I agree with Addison on all of the locale issues.
>
> ------------------------------
> Mark <https://plus.google.com/114199149796022210033>
>
> *— Il meglio è l’inimico del bene —*
>
> On Mon, Jun 25, 2012 at 12:06 PM, Phillips, Addison <addison@lab126.com> wrote:
>
> I have a number of thoughts about the locales question. I have not talked to Mark about this and he may not agree with any or all of the below.
>
> It would be more useful, in my opinion, to define section 5.1.3 as a BCP 47 language priority list (with language tags between the separators). I would tend to prefer commas to semi-colons (since these are more common in HTML and in headers, etc.). This “filter” isn’t quite the same thing as BCP 47 “filtering” matching schemes (or is it?) and that probably should be highlighted.
>
> The link to the locale ID section didn’t work, but I found it searching the document. I was dismayed to see the underscore conversion. What purpose does it serve?
> I’ve found that using language tags with no hyphen/underscore mapping makes for a cleaner, less complex implementation. For one thing, a common case is likely to be a mapping directly between the two. Inserting a transformation adds needless complexity at the markup level. [An implementation can internally map it, if necessary.]
>
> I don’t particularly care for the somewhat artificial distinction between a language tag and a locale identifier in the document. BCP 47 makes having such a separation much less relevant. That is, “de-DE” is a perfectly useful locale identifier—and it’s a valid language tag as well. The “u” extension doesn’t ruin this relationship: “de-DE-u-co-phonebk” is also a valid language tag (besides being useful as a locale identifier). The extra subtags may be ignorable in a translation process, but this doesn’t ruin a locale identifier’s utility as a language tag. Where I encounter the most issues tends to be when mapping must be done between the two concepts instead of tags being useful in both contexts.
>
> I do recognize that you need a separate *field* for “locale” (how language materials are packaged/delivered) from the source or target language of the content in ITS. But I think that the identifiers themselves should not be different from one another.
> For example, I can see something like the following:
>
> <someElement xml:lang="zh-Hans" its:filterLocale="zh-CN">中文</someElement>
>
> Finally, you go out of your way to say:
>
> --
> Implementations of ITS 2.0 are not expected to process the "u" extension for further locale information as defined in RFC 6067 <http://tools.ietf.org/html/rfc6067>.
> --
>
> I think you should reconsider this text: it’s not normative but might be read as a normative direction, and implementations of ITS 2.0 might need to interact with the “u” extension: Java 7, for example, has several built-in locales that make use of the extension. I think what you mean is that the extension is ignored in language fields (such as sourceLanguage, etc.)??
>
> <chair hat="on"> btw, you should use the www-international@ list instead of our public WG list going forwards. I have moved public-i18n-core@ to bcc: and copied the winter list for you :-)
>
> Addison
>
> Addison Phillips
> Globalization Architect (Lab126)
> Chair (W3C I18N WG)
>
> Internationalization is not a feature.
> It is an architecture.
>
> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
> *Sent:* Monday, June 25, 2012 5:53 AM
> *To:* Mark Davis
> *Cc:* public-multilingualweb-lt@w3.org; public-i18n-core@w3.org
> *Subject:* BCP 47 "t" extension follow up and locale identifier definition
>
> Dear Mark, with CC to the MultilingualWeb LT and the i18n core public list,
>
> I have an action ACTION-133 to follow up on the BCP 47 discussion we had with your contribution on 12 June.
> Thanks again for your presentation.
>
> In our requirements document
> http://www.w3.org/TR/2012/WD-its2req-20120524/
> we have several requirements related to processes, see e.g.
> http://www.w3.org/TR/2012/WD-its2req-20120524/#Process_Model
>
> Since the "t" extension is also meant to express process related information, we want to coordinate the values that can be used via that extension with what we define - or just refer to them. What would be the best way to achieve this?
>
> A related issue: we need to specify information about locales, see e.g.
> http://www.w3.org/TR/2012/WD-its2req-20120524/#locale-filter
> The current thinking about locale identifiers is here
> http://www.w3.org/TR/2012/WD-its2req-20120524/#Identification_of_Language_and_Local
>
> At the Dublin workshop there was already some feedback from Richard (IIRC): if we have a dedicated field for a locale identifier, then a basic BCP 47 language tag (without the underscore conversion) might do it. Do you have any thoughts on this?
>
> Thanks a lot for your feedback in advance,
>
> Felix
>
> --
> Felix Sasaki
> DFKI / W3C Fellow

--
Felix Sasaki
DFKI / W3C Fellow
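
For illustration, here is a minimal sketch of how a comma-separated its:filterLocale value like the one in point 1 could be checked against a target locale. It assumes that RFC 4647 basic filtering (case-insensitive, prefix match on subtag boundaries) is the intended matching scheme; the thread itself leaves the exact relation to BCP 47 filtering open. The function names are hypothetical and not taken from ITS 2.0.

```python
def parse_filter_locale(value):
    """Split a comma-separated filter value into language ranges.

    Tags are kept as written (hyphens, no underscore conversion);
    only surrounding whitespace and empty items are dropped.
    """
    return [item.strip() for item in value.split(",") if item.strip()]


def basic_filter_match(language_range, tag):
    """RFC 4647 basic filtering for one range and one language tag.

    The range matches if it is "*", if it equals the tag, or if it is
    a prefix of the tag that ends at a subtag boundary ("-").
    Comparison is case-insensitive.
    """
    language_range = language_range.lower()
    tag = tag.lower()
    if language_range == "*":
        return True
    return tag == language_range or tag.startswith(language_range + "-")


def locale_passes_filter(filter_value, target_locale):
    """Return True if any range in the filter value matches the target."""
    return any(
        basic_filter_match(language_range, target_locale)
        for language_range in parse_filter_locale(filter_value)
    )


if __name__ == "__main__":
    swiss = "de-ch,fr-ch,it-ch"
    for target in ("de-CH", "de-CH-u-co-phonebk", "de-DE"):
        print(target, locale_passes_filter(swiss, target))
```

Under this assumption a tag carrying a "u" extension, such as de-CH-u-co-phonebk, still matches the range de-ch because the extension subtags follow the matched prefix, so no special handling of the extension is needed in the filter itself.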
Received on Monday, 23 July 2012 15:10:46 UTC