RE: 8 characters

Hello Dan,

This is a personal response. The Internationalization WG will discuss this at our call on Wednesday.

I would prefer if your extension did not define single subtags that exceed eight characters in length, except by creating dependencies between subtags (the proposed ‘u’ extension, of which I am an author, uses this approach).

That is, I think this is probably not a good idea: en-p-pronunci-ation

Whereas I could see something like: en-p-pinyin-2001 (with date defining a specific version) or en-p-key-value

But you are correct that you can define whatever you want, as long as it is consistent with BCP 47 syntax rules.
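
As a rough illustration of that syntax constraint, here is a minimal Python sketch. It assumes the RFC 5646 ABNF for an extension (a singleton other than 'x', followed by one or more subtags of two to eight alphanumeric characters). The function name is hypothetical and the check covers only the extension portion of a tag, not the whole tag.

```python
import re

# Assumed from the RFC 5646 ABNF: singleton (any letter or digit except 'x'),
# then one or more "-"-separated subtags of 2 to 8 alphanumerics.
_EXTENSION = re.compile(r"[0-9a-wy-z](?:-[0-9a-z]{2,8})+", re.IGNORECASE)


def is_well_formed_extension(extension: str) -> bool:
    """Return True if `extension` (e.g. "p-pinyin-2001") fits the syntax."""
    return _EXTENSION.fullmatch(extension) is not None


print(is_well_formed_extension("p-pinyin-2001"))     # True
print(is_well_formed_extension("p-pinyin2001"))      # False: subtag has 10 characters
print(is_well_formed_extension("p-pronunci-ation"))  # True: well-formed, though not recommended
```

Note that the check is purely syntactic: it cannot tell a word split across two subtags from two subtags that are meaningful on their own.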

I slightly disagree with your second point. There are two filtering schemes. Basic filtering can, of course, be used to select any set of language tags. The rules you cite belong to extended filtering.
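
For comparison, basic filtering is just a case-insensitive prefix test on whole subtags. A minimal sketch, assuming the description in RFC 4647 section 3.3.1 (the function name is hypothetical):

```python
def basic_filter_match(language_range: str, tag: str) -> bool:
    """Basic filtering: the range equals the tag, or is a prefix of it
    that ends exactly at a hyphen. The range "*" matches every tag."""
    if language_range == "*":
        return True
    prefix = language_range.lower()
    candidate = tag.lower()
    return candidate == prefix or candidate.startswith(prefix + "-")


print(basic_filter_match("en-p", "en-p-pinyin-2001"))     # True
print(basic_filter_match("en-p-pinyin", "en-US-p-pinyin"))  # False: not a prefix
```

Because the range is compared as a plain prefix, singletons get no special treatment here, so a range may include an extension as long as the tag starts with exactly the same subtags.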

Rule 3.3.2.D doesn’t apply the way you think it does. It exists to stop false-friend matches. Consider this range: “en-fubar”. It matches tags like “en-DE-fubar”, “en-Latn-fubar”, “en-bootn-fubar”, and so forth. Rule 3.3.2.D prevents it from matching “en-p-fubar”. It does NOT prevent one from having a range like “en-p-fubar” to find tags like “en-US-p-fubar”. And it doesn’t prevent a match like “en-fubar-p-pinyin”.
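
The cases above can be checked with a small Python sketch of extended filtering. It assumes the step order given in RFC 4647 section 3.3.2 (wildcard, tag exhausted, subtag match, singleton, skip); the function name is hypothetical and this is an illustration, not a reference implementation.

```python
def extended_filter_match(language_range: str, tag: str) -> bool:
    """Sketch of extended filtering for one range against one tag."""
    range_subtags = language_range.lower().split("-")
    tag_subtags = tag.lower().split("-")

    # The first subtags must match (or the range starts with the wildcard).
    if range_subtags[0] != "*" and range_subtags[0] != tag_subtags[0]:
        return False

    r, t = 1, 1
    while r < len(range_subtags):
        if range_subtags[r] == "*":
            r += 1                          # wildcard: skip it in the range
        elif t >= len(tag_subtags):
            return False                    # tag ran out of subtags
        elif range_subtags[r] == tag_subtags[t]:
            r += 1                          # subtags match: advance both
            t += 1
        elif len(tag_subtags[t]) == 1:
            return False                    # rule D: never skip past a singleton
        else:
            t += 1                          # skip a non-matching tag subtag
    return True


for candidate in ("en-DE-fubar", "en-Latn-fubar", "en-bootn-fubar",
                  "en-p-fubar", "en-fubar-p-pinyin"):
    print(candidate, extended_filter_match("en-fubar", candidate))
print("en-US-p-fubar", extended_filter_match("en-p-fubar", "en-US-p-fubar"))
```

The singleton test only fires when the range subtag and the tag subtag differ, which is why a range that itself contains “p” still matches through the singleton.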

Your extension may define matching rules that implementations may consider when supporting the extension. Certainly filtering is not very useful in matching extensions themselves.

Best Regards,

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

From: Dan Burnett [mailto:dburnett@voxeo.com]
Sent: Monday, February 01, 2010 8:23 AM
To: Phillips, Addison
Cc: public-i18n-core@w3.org; Matt Womer; kazuyuki@w3.org; W3C Voice Browser Working Group
Subject: Re: 8 characters

Hello Addison and I18N Core WG,

The SSML subgroup reviewed this topic (and your suggestion to adopt the "p" extension space) in our call last week.  Based on our reading of both the syntax for language tags and, in particular, extension tags, as well as the description of the extended filtering algorithm, we do not believe there is any problem with defining extension tags containing hyphens (meaning that the extension tag following the 'p' singleton may consist of multiple hyphen-delimited subtags).
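
To make that reading concrete, here is a minimal Python sketch of how we understand an extension to be delimited. It assumes a well-formed, non-grandfathered tag whose first subtag is the primary language; the function name is hypothetical.

```python
def extract_extensions(tag: str) -> dict[str, list[str]]:
    """Group the subtags that follow each singleton, up to the next singleton.

    Assumes a well-formed tag; everything from the private-use
    singleton 'x' onward is ignored.
    """
    subtags = tag.lower().split("-")
    extensions: dict[str, list[str]] = {}
    current = None
    for subtag in subtags[1:]:          # skip the primary language subtag
        if len(subtag) == 1:
            if subtag == "x":
                break                   # private use is not an extension
            current = subtag
            extensions[current] = []
        elif current is not None:
            extensions[current].append(subtag)
    return extensions


print(extract_extensions("en-US-p-pinyin-2001"))
# {'p': ['pinyin', '2001']}
```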


In addition to the ABNF definition itself, the two most relevant statements we found in BCP 47 are

RFC 5646, sec. 2.2.6:  "All subtags following the singleton and before another singleton are part of the extension."

and RFC 4647, sec. 3.3.2:  "Else, if the language tag's subtag is a "singleton" (a single letter or digit, which includes the private-use subtag 'x') the match fails."


The latter suggests that singletons and their extensions cannot be used in filtering, which is as we would expect for our registry.


Can you please let us know whether or not you agree with our reading, and if not, explain why the quoted statements above do not govern in this case?

Thanks!

-- dan

On Jan 19, 2010, at 11:37 AM, Phillips, Addison wrote:


Hello Dan,

For reasons of compatibility, there is no chance that the eight-character limit can be extended. Using “-” to continue a tag might have some unintended side effects. Notably, matching and truncation of language tags assume that subtags can be removed at the hyphen.
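
A short Python sketch of the truncation concern, assuming the fallback behavior of the RFC 4647 lookup scheme (section 3.4), in which subtags are dropped from the end one at a time and a singleton is dropped together with its last remaining subtag. The function name is hypothetical.

```python
def truncation_steps(tag: str) -> list[str]:
    """List the progressively truncated forms of `tag`, longest first."""
    subtags = tag.split("-")
    steps = []
    while subtags:
        steps.append("-".join(subtags))
        subtags.pop()                            # drop the last subtag
        if subtags and len(subtags[-1]) == 1:
            subtags.pop()                        # do not leave a dangling singleton
    return steps


print(truncation_steps("en-p-pronunci-ation"))
# ['en-p-pronunci-ation', 'en-p-pronunci', 'en']
```

The intermediate form “en-p-pronunci” is the kind of side effect meant here: if a single word is continued across a hyphen, truncation produces a tag that carries only a fragment of it.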

If you can’t stay within an eight character limit, you might consider having a “short name” field for use in language tags (as an alias). The subtags don’t have to be perfectly mnemonic.

Addison

Addison Phillips
Globalization Architect -- Lab126

Internationalization is not a feature.
It is an architecture.

From: public-i18n-core-request@w3.org [mailto:public-i18n-core-request@w3.org] On Behalf Of Dan Burnett
Sent: Tuesday, January 19, 2010 6:13 AM
To: public-i18n-core@w3.org; Matt Womer; kazuyuki@w3.org
Cc: W3C Voice Browser Working Group
Subject: Fwd: 8 characters

Forwarding to entire I18N core group.

Begin forwarded message:



From: Dan Burnett <dburnett@voxeo.com>
Date: January 13, 2010 1:24:10 PM EST
To: Addison Phillips <addison@amazon.com>
Subject: 8 characters

Addison,

After I left the call, it occurred to me that the first entry we want to create, "pinyin2001", would not meet your 8-character restriction.

Is there any possibility for either a) increasing that character limit, or b) allowing for us to use '-' to continue our tags?

I expect that the character length limit, as in MS-DOS, will be the biggest sticking point.

-- dan

Received on Monday, 1 February 2010 16:38:49 UTC