ITS 2.0 LocaleFilter definition (Re: [Moderator Action] RE: BCP 47 "t" extension follow up and locale identifier definition)

From: Felix Sasaki <fsasaki@w3.org>
Date: Mon, 23 Jul 2012 17:10:07 +0200
Message-ID: <CAL58czqSLPir0VK+BUrq-Q6Cai77g9HSLhabhpTZeKeHDD6DkA@mail.gmail.com>
To: "Phillips, Addison" <addison@lab126.com>
Cc: Mark Davis ☕ <mark@macchiato.com>, "public-multilingualweb-lt@w3.org" <public-multilingualweb-lt@w3.org>, "www-international@w3.org" <www-international@w3.org>
Hi Addison, all,

Coming back to the "locale" definition in ITS 2.0 that we discussed a while
ago: Shaun McCance from the MLW-LT working group has created a locale filter
definition, see

http://www.w3.org/International/multilingualweb/lt/drafts/its20/its20.html#LocaleFilter

I think this implements points 1-4 below. Would you like to have a look?

Thanks,

Felix

2012/6/26 Phillips, Addison <addison@lab126.com>

> Hi Felix,
>
> You missed: “remove the hyphen-to-underscore conversion” :-). Otherwise,
> looks like what we’d suggested.
>
> Addison
>
> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
> *Sent:* Tuesday, June 26, 2012 5:59 AM
> *To:* Mark Davis ☕; Phillips, Addison; public-multilingualweb-lt@w3.org;
> www-international@w3.org
> *Subject:* Re: BCP 47 "t" extension follow up and locale identifier
> definition
>
> (Apologies for cross-posting and thanks to Addison for pointing out
> www-international),
>
> Thanks for your feedback, Addison and Mark. To summarize the main points:
>
> 1) We use BCP 47 language tags in a dedicated piece of markup, e.g.
>
> <span its:filterLocale="de-ch,fr-ch,it-ch">Swiss legal notice, only to be
> taken into account for localization into a Swiss locale</span>
>
> 2) We use a comma as the delimiter instead of a semicolon; see the "span"
> element above.
>
> 3) We need to make the relation to BCP 47 filtering clear.
>
> 4) We don't need text to point out the "u" extension - people may or may
> not use it, but if we go for BCP 47, people can use any extension they want.
>
> 5) With regard to the tags that Mark mentioned in 1. below: are the
> "transform" XML files here
> http://unicode.org/cldr/trac/browser/tags/release-21-0-2/common/bcp47 the
> currently registered fields for transforms?
>
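
A rough illustration of points 1-3 above, not taken from the ITS 2.0 draft: one way to read a comma-delimited filterLocale value is as a list of language ranges that are checked against the target locale with RFC 4647 basic filtering. Whether ITS 2.0 ends up with exactly this matching scheme is an assumption here, and the function names are made up for the example.

```python
def parse_filter_locale(value):
    """Split a comma-delimited filterLocale value into trimmed, non-empty tags."""
    return [tag.strip() for tag in value.split(",") if tag.strip()]


def basic_filter_match(language_range, tag):
    """RFC 4647 basic filtering: the range matches if it equals the tag
    (case-insensitively) or is a prefix of the tag followed by '-'."""
    if language_range == "*":
        return True
    r, t = language_range.lower(), tag.lower()
    return t == r or t.startswith(r + "-")


def applies_to(filter_value, target_locale):
    """True if any range in the filterLocale list matches the target locale."""
    return any(basic_filter_match(r, target_locale)
               for r in parse_filter_locale(filter_value))


if __name__ == "__main__":
    swiss = "de-ch,fr-ch,it-ch"
    print(applies_to(swiss, "fr-CH"))  # Swiss French target
    print(applies_to(swiss, "de-DE"))  # German (Germany) target
```

With the Swiss example, a target of fr-CH is selected and de-DE is not; a longer tag such as de-CH-1996 would also match the de-ch range under basic filtering.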
> Felix
>
> 2012/6/25 Mark Davis ☕ <mark@macchiato.com>
>
> > Since the "t" extension is also meant to express process related
> > information, we want to coordinate the values that can be used via that
> > extension with what we define - or just refer to them. What would be the
> > best way to achieve this?
>
> There are two possible ways that would work.
>
>    1. Reference LDML for the tags, and propose registrations for any
>    additional ones you need.
>    2. Do #1, but because you have a separate field (that doesn't have to
>    be a BCP47 tag), you can reserve strings that could not be BCP47 subtags
>    for your own use.
>
> We do something similar for short TZ identifiers. We use UN LOCODE codes
> where they exist; where they don't, we use codes that are longer or shorter
> so that they will not collide with future UN LOCODE codes.
>
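
To make option 2 concrete, here is a minimal sketch. It assumes only the well-formedness rule from RFC 5646 that a subtag is 1 to 8 ASCII letters or digits, so any string that breaks that rule can never collide with a future registered subtag. The function names and sample values are hypothetical.

```python
import re

# RFC 5646: every subtag is 1 to 8 ASCII letters or digits.
_SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def could_be_bcp47_subtag(value):
    """True if the string has the shape of a BCP 47 subtag."""
    return _SUBTAG.fullmatch(value) is not None


def is_reservable(value):
    """True if the string can be reserved for private use in a separate
    field without risking a clash with a (future) BCP 47 subtag."""
    return not could_be_bcp47_subtag(value)


if __name__ == "__main__":
    for candidate in ["phonebk", "postediting", "post_edit"]:
        print(candidate, is_reservable(candidate))
```

A nine-character (or longer) value, or one containing a character such as an underscore, is safe to reserve in this sense; a value such as "phonebk" is not.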
> > locales...
>
> I agree with Addison on all of the locale issues.
>
> ------------------------------
>
> Mark <https://plus.google.com/114199149796022210033>
>
> *— Il meglio è l’inimico del bene —*
>
> On Mon, Jun 25, 2012 at 12:06 PM, Phillips, Addison <addison@lab126.com>
> wrote:
>
> I have a number of thoughts about the locales question. I have not talked
> to Mark about this and he may not agree with any or all of the below.
>
> It would be more useful, in my opinion, to define section 5.1.3 as a BCP
> 47 language priority list (with language tags between the separators). I
> would tend to prefer commas to semi-colons (since these are more common in
> HTML and in headers, etc.). This “filter” isn’t quite the same thing as BCP
> 47 “filtering” matching schemes (or is it?) and that probably should be
> highlighted.
>
> The link to the locale ID section didn’t work, but I found it searching
> the document. I was dismayed to see the underscore conversion. What purpose
> does it serve? I’ve found that using language tags with no
> hyphen/underscore mapping makes for a cleaner, less complex implementation.
> For one thing, a common case is likely to be a mapping directly between the
> two. Inserting a transformation adds needless complexity at the markup
> level. [An implementation can internally map it, if necessary.]
>
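
The bracketed remark can be sketched as follows: the markup carries plain BCP 47 tags, and only code that really needs an underscore form (for example, a POSIX-style locale name) converts at its own boundary. This is an illustration under that assumption, with hypothetical function names, not something the draft prescribes.

```python
def to_internal_locale_id(tag):
    """Map a BCP 47 tag to an underscore-separated internal identifier."""
    return tag.replace("-", "_")


def from_internal_locale_id(locale_id):
    """Map the internal identifier back to a BCP 47 tag.

    The round trip is lossless because well-formed BCP 47 tags
    never contain an underscore.
    """
    return locale_id.replace("_", "-")


if __name__ == "__main__":
    tag = "de-CH"
    internal = to_internal_locale_id(tag)
    print(internal)
    print(from_internal_locale_id(internal) == tag)
```

Keeping the conversion inside the implementation means authors and tools exchanging ITS markup only ever see one identifier syntax.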
>
> I don’t particularly care for the somewhat artificial distinction between
> a language tag and a locale identifier in the document. BCP 47 makes having
> such a separation much less relevant. That is, “de-DE” is a perfectly
> useful locale identifier, and it’s a valid language tag as well. The “u”
> extension doesn’t ruin this relationship: “de-DE-u-co-phonebk” is also a
> valid language tag (besides being useful as a locale identifier). The extra
> subtags may be ignorable in a translation process, but this doesn’t ruin a
> locale identifier’s utility as a language tag. Where I encounter the most
> issues tends to be when mapping must be done between the two concepts
> instead of tags being useful in both contexts.
>
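
A minimal sketch of "the extra subtags may be ignorable": a process that only cares about language, script and region can drop everything from the first singleton subtag onward. It relies on the BCP 47 syntax rule that a one-character subtag after the first position always introduces an extension or private-use sequence; the function name is hypothetical.

```python
def strip_extensions(tag):
    """Return the tag without extension and private-use subtags.

    A single-character subtag after the first position is a singleton
    (for example "u", "t" or "x"); it and everything after it are dropped.
    """
    subtags = tag.split("-")
    for index, subtag in enumerate(subtags[1:], start=1):
        if len(subtag) == 1:
            return "-".join(subtags[:index])
    return tag


if __name__ == "__main__":
    print(strip_extensions("de-DE-u-co-phonebk"))
    print(strip_extensions("zh-Hans"))
```

The same string therefore works in both roles: a locale-aware component reads the full tag, while a translation process reads the shortened form.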
> I do recognize that you need a separate *field* for “locale” (how
> language materials are packaged/delivered) from the source or target
> language of the content in ITS. But I think that the identifiers themselves
> should not be different from one another. For example, I can see something
> like the following:
>
>    <someElement xml:lang="zh-Hans" its:filterLocale="zh-CN">中文</someElement>
>
> Finally, you go out of your way to say:
>
> --
> Implementations of ITS 2.0 are not expected to process the "u" extension
> for further locale information as defined in RFC 6067
> <http://tools.ietf.org/html/rfc6067>.
> --
>
> I think you should reconsider this text: it’s not normative but might be
> read as a normative direction, and implementations of ITS 2.0 might need to
> interact with the “u” extension: Java 7, for example, has several built-in
> locales that make use of the extension. I think what you mean is that the
> extension is ignored in language fields (such as sourceLanguage, etc.)?
>
> <chair hat=”on”> btw, you should use the www-international@ list instead
> of our public WG list going forwards. I have moved public-i18n-core@ to
> bcc: and copied the winter list for you :-)
>
> Addison
>
> Addison Phillips
> Globalization Architect (Lab126)
> Chair (W3C I18N WG)
>
> Internationalization is not a feature.
> It is an architecture.
>
> *From:* Felix Sasaki [mailto:fsasaki@w3.org]
> *Sent:* Monday, June 25, 2012 5:53 AM
> *To:* Mark Davis
> *Cc:* public-multilingualweb-lt@w3.org; public-i18n-core@w3.org
> *Subject:* BCP 47 "t" extension follow up and locale identifier definition
>
> Dear Mark, with CC to the MultilingualWeb LT and the i18n core public list,
>
> I have an action ACTION-133 to follow up on the BCP 47 discussion we had
> with your contribution on 12 June. Thanks again for your presentation.
>
> In our requirements document
> http://www.w3.org/TR/2012/WD-its2req-20120524/
> we have several requirements related to processes, see e.g.
> http://www.w3.org/TR/2012/WD-its2req-20120524/#Process_Model
>
> Since the "t" extension is also meant to express process related
> information, we want to coordinate the values that can be used via that
> extension with what we define - or just refer to them. What would be the
> best way to achieve this?
>
> A related issue: we need to specify information about locales, see e.g.
> http://www.w3.org/TR/2012/WD-its2req-20120524/#locale-filter
> The current thinking about locale identifiers is here
> http://www.w3.org/TR/2012/WD-its2req-20120524/#Identification_of_Language_and_Local
>
> At the Dublin workshop there was already some feedback from Richard
> (IIRC): if we have a dedicated field for a locale identifier, then a basic
> BCP 47 language tag (without the underscore conversion) might do it. Do you
> have any thoughts on this?
>
>
> Thanks a lot for your feedback in advance,
>
> Felix
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>
> --
> Felix Sasaki
> DFKI / W3C Fellow
>



-- 
Felix Sasaki
DFKI / W3C Fellow
Received on Monday, 23 July 2012 15:10:46 GMT
