- From: Felix Sasaki <fsasaki@w3.org>
- Date: Tue, 7 Aug 2012 11:27:52 +0200
- To: "Phillips, Addison" <addison@lab126.com>, Mark Davis ☕ <mark@macchiato.com>, "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, "www-international@w3.org" <www-international@w3.org>
- Message-ID: <CAL58czqfrrOmDrSWzykHd511ea6BNntx1+2RV-SedttXYJM-Ng@mail.gmail.com>
Hi Addison, all again,

FYI, I had an action item to discuss extended filtering in the MLW-LT working group. It looks like we will have consensus on having extended filtering; see the latest edits at

http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#LocaleFilter

Best,

Felix

2012/7/23 Felix Sasaki <fsasaki@w3.org>

> Hi Addison, all,
>
> coming back to the "locale" definition in ITS 2.0 we had discussed a while
> ago: Shaun McCance from the MLW-LT working group has created a locale filter
> definition, see
>
> http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#LocaleFilter
>
> I think this implements 1-4 below - do you want to have a look?
>
> Thanks,
>
> Felix
>
> 2012/6/26 Phillips, Addison <addison@lab126.com>
>
>> Hi Felix,
>>
>> You missed: “remove the hyphen-to-underscore conversion” :-). Otherwise,
>> looks like what we’d suggested.
>>
>> Addison
>>
>> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
>> *Sent:* Tuesday, June 26, 2012 5:59 AM
>> *To:* Mark Davis ☕; Phillips, Addison; public-multilingualweb-lt@w3.org;
>> www-international@w3.org
>> *Subject:* Re: BCP 47 "t" extension follow up and locale identifier
>> definition
>>
>> (Apologies for cross-posting and thanks to Addison for pointing out
>> www-international),
>>
>> Thanks for your feedback, Addison and Mark. To summarize the main points:
>>
>> 1) We use BCP 47 language tags in a dedicated piece of markup, e.g.
>>
>> <span its:filterLocale="de-ch,fr-ch,it-ch">Swiss legal notice, only to be
>> taken into account for localization into a Swiss locale</span>
>>
>> 2) We use comma as the delimiter instead of semicolon, see "span" element
>> above
>>
>> 3) We need to make the relation to BCP 47 filtering clear.
>>
>> 4) We don't need text to point out the "u" extension - people may or may
>> not use it, but if we go for BCP 47 people can use any extension they want.
>>
>> 5) WRT the tags that Mark mentioned in 1. below: are the "transform"
>> XML files here
>> http://unicode.org/cldr/trac/browser/tags/release-21-0-2/common/bcp47 the
>> currently registered fields for transforms?
>>
>> Felix
>>
>> 2012/6/25 Mark Davis ☕ <mark@macchiato.com>
>>
>> > Since the "t" extension is also meant to express process related
>> > information, we want to coordinate the values that can be used via that
>> > extension with what we define - or just refer to them. What would be the
>> > best way to achieve this?
>>
>> There are two possible ways that would work.
>>
>> 1. Reference LDML for the tags, and propose registrations for any
>>    additional ones you need.
>> 2. Do #1, but because you have a separate field (that doesn't have to
>>    be a BCP 47 tag), you can reserve strings that could not be BCP 47
>>    subtags for your own use.
>>
>> We do something similar for short TZ identifiers. We use UN LOCODE codes
>> where they exist; where they don't, we use codes that are longer or shorter
>> so that they will not collide with future UN LOCODE codes.
>>
>> > locales...
>>
>> I agree with Addison on all of the locale issues.
>>
>> ------------------------------
>>
>> Mark <https://plus.google.com/114199149796022210033>
>>
>> *— Il meglio è l’inimico del bene —*
>>
>> On Mon, Jun 25, 2012 at 12:06 PM, Phillips, Addison <addison@lab126.com>
>> wrote:
>>
>> I have a number of thoughts about the locales question.
>> I have not talked to Mark about this and he may not agree with any or all
>> of the below.
>>
>> It would be more useful, in my opinion, to define section 5.1.3 as a BCP
>> 47 language priority list (with language tags between the separators). I
>> would tend to prefer commas to semi-colons (since these are more common in
>> HTML and in headers, etc.). This “filter” isn’t quite the same thing as BCP
>> 47 “filtering” matching schemes (or is it?) and that probably should be
>> highlighted.
>>
>> The link to the locale ID section didn’t work, but I found it searching
>> the document. I was dismayed to see the underscore conversion. What purpose
>> does it serve? I’ve found that using language tags with no
>> hyphen/underscore mapping makes for a cleaner, less complex implementation.
>> For one thing, a common case is likely to be a mapping directly between the
>> two. Inserting a transformation adds needless complexity at the markup
>> level. [An implementation can internally map it, if necessary.]
>>
>> I don’t particularly care for the somewhat artificial distinction between
>> a language tag and a locale identifier in the document. BCP 47 makes having
>> such a separation much less relevant. That is, “de-DE” is a perfectly
>> useful locale identifier, and it’s a valid language tag as well. The “u”
>> extension doesn’t ruin this relationship: “de-DE-u-co-phonebk” is also a
>> valid language tag (besides being useful as a locale identifier). The extra
>> subtags may be ignorable in a translation process, but this doesn’t ruin a
>> locale identifier’s utility as a language tag. Where I encounter the most
>> issues tends to be when mapping must be done between the two concepts
>> instead of tags being useful in both contexts.
>>
>> I do recognize that you need a separate *field* for “locale” (how
>> language materials are packaged/delivered) from the source or target
>> language of the content in ITS.
>> But I think that the identifiers themselves
>> should not be different from one another. For example, I can see something
>> like the following:
>>
>> <someElement xml:lang="zh-Hans" its:filterLocale="zh-CN">中文</someElement>
>>
>> Finally, you go out of your way to say:
>>
>> --
>>
>> Implementations of ITS 2.0 are not expected to process the "u" extension
>> for further locale information as defined in RFC 6067
>> <http://tools.ietf.org/html/rfc6067>.
>>
>> --
>>
>> I think you should reconsider this text: it’s not normative but might be
>> read as a normative direction, and implementations of ITS 2.0 might need to
>> interact with the “u” extension: Java 7, for example, has several built-in
>> locales that make use of the extension. I think what you mean is that the
>> extension is ignored in language fields (such as sourceLanguage, etc.)??
>>
>> <chair hat="on"> btw, you should use the www-international@ list instead
>> of our public WG list going forwards. I have moved public-i18n-core@ to
>> bcc: and copied the winter list for you :-)
>>
>> Addison
>>
>> Addison Phillips
>> Globalization Architect (Lab126)
>> Chair (W3C I18N WG)
>>
>> Internationalization is not a feature.
>> It is an architecture.
>>
>> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
>> *Sent:* Monday, June 25, 2012 5:53 AM
>> *To:* Mark Davis
>> *Cc:* public-multilingualweb-lt@w3.org; public-i18n-core@w3.org
>> *Subject:* BCP 47 "t" extension follow up and locale identifier
>> definition
>>
>> Dear Mark, with CC to the MultilingualWeb LT and the i18n core public
>> list,
>>
>> I have an action ACTION-133 to follow up on the BCP 47 discussion we had
>> with your contribution on 12 June.
>> Thanks again for your presentation.
>>
>> In our requirements document
>>
>> http://www.w3.org/TR/2012/WD-its2req-20120524/
>>
>> we have several requirements related to processes, see e.g.
>>
>> http://www.w3.org/TR/2012/WD-its2req-20120524/#Process_Model
>>
>> Since the "t" extension is also meant to express process related
>> information, we want to coordinate the values that can be used via that
>> extension with what we define - or just refer to them. What would be the
>> best way to achieve this?
>>
>> A related issue: we need to specify information about locales, see e.g.
>>
>> http://www.w3.org/TR/2012/WD-its2req-20120524/#locale-filter
>>
>> The current thinking about locale identifiers is here
>>
>> http://www.w3.org/TR/2012/WD-its2req-20120524/#Identification_of_Language_and_Local
>>
>> At the Dublin workshop there was already some feedback from Richard
>> (IIRC): if we have a dedicated field for a locale identifier, then a basic
>> BCP 47 language tag (without the underscore conversion) might do it. Do you
>> have any thoughts on this?
>>
>> Thanks a lot for your feedback in advance,
>>
>> Felix
>>
>> --
>> Felix Sasaki
>> DFKI / W3C Fellow
>>
>> --
>> Felix Sasaki
>> DFKI / W3C Fellow
>
> --
> Felix Sasaki
> DFKI / W3C Fellow

--
Felix Sasaki
DFKI / W3C Fellow
Received on Tuesday, 7 August 2012 09:28:29 UTC
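
For readers following the thread, here is a minimal sketch of how a comma-separated filter value such as "de-ch,fr-ch,it-ch" might be checked against a target locale using the extended filtering algorithm from RFC 4647, section 3.3.2. The function names `extended_filter_match` and `applies_to_locale` are hypothetical, and the sketch assumes each list entry is treated as an extended language range. It is not taken from the ITS 2.0 draft, whose exact semantics may differ.

```python
def extended_filter_match(language_range: str, tag: str) -> bool:
    """Extended filtering (RFC 4647, section 3.3.2) for one range and one tag."""
    # Step 1: split into subtags; comparison is case-insensitive.
    range_subtags = language_range.lower().split("-")
    tag_subtags = tag.lower().split("-")

    # Step 2: the first subtags must match, unless the range starts with "*".
    if range_subtags[0] != "*" and range_subtags[0] != tag_subtags[0]:
        return False

    r, t = 1, 1
    # Step 3: walk through the remaining subtags of the range.
    while r < len(range_subtags):
        if range_subtags[r] == "*":
            r += 1              # wildcard: skip it in the range
        elif t >= len(tag_subtags):
            return False        # the tag ran out of subtags
        elif range_subtags[r] == tag_subtags[t]:
            r += 1              # match: advance in both lists
            t += 1
        elif len(tag_subtags[t]) == 1:
            return False        # singleton (extension or private use) stops the search
        else:
            t += 1              # skip a non-matching subtag in the tag
    # Step 4: the range is exhausted, so the match succeeds.
    return True


def applies_to_locale(filter_locale: str, target: str) -> bool:
    """True if any entry of the comma-separated list matches the target tag.

    Assumption: surrounding whitespace and empty entries are ignored.
    """
    ranges = [entry.strip() for entry in filter_locale.split(",")]
    return any(extended_filter_match(rng, target) for rng in ranges if rng)
```

With this sketch, `applies_to_locale("de-ch,fr-ch,it-ch", "fr-Latn-CH")` is true because extended filtering skips the intervening script subtag, while a target of "de-DE" matches none of the entries. No hyphen-to-underscore conversion is involved, in line with the comments in the thread.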