Re: [widgets] i18n

Hi Addison,

> A couple of notes:
>
> 1. The 3066 pattern is language-region, not the other way around.

Oops, that's what I meant :P

> 2. Don't reference RFC 4646. RFCs get obsoleted over time. Instead, reference BCP 47 (RFC 4646's designation in the IETF standards hierarchy).

Ok, good point. Mark Davis also pointed this out. Done.

> 3. Do reference RFC 4647 (as part of BCP 47) and, in particular, the Lookup matching scheme. I think you'll find that this is simple and consistent with existing practice.

Ok, I'll have a read of RFC 4647 and see how I can articulate that in the spec.
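
For reference, the Lookup scheme in RFC 4647 (section 3.4) can be sketched roughly as below. This is only an illustration under my reading of the RFC, not spec text: the function name `lookup` and its parameters are hypothetical, and it only handles "*" when it is the whole range.

```python
def lookup(priority_list, available, default=None):
    """Return the best available tag for a prioritised list of language ranges."""
    # Matching is case-insensitive, so index the available tags by lower case.
    tags = {t.lower(): t for t in available}
    for lang_range in priority_list:
        if lang_range == "*":
            continue  # a bare wildcard carries no information for Lookup
        subtags = lang_range.lower().split("-")
        while subtags:
            candidate = "-".join(subtags)
            if candidate in tags:
                return tags[candidate]
            # Progressively truncate the range from the end.
            subtags.pop()
            # A single-character subtag left at the end (for example "x" or
            # an extension singleton) is removed together with what followed it.
            while subtags and len(subtags[-1]) == 1:
                subtags.pop()
    return default


if __name__ == "__main__":
    print(lookup(["zh-Hant-CN-x-private1"], ["en", "zh-Hant"], default="en"))
```

So a widget that only ships "zh-Hant" and "en" content would still serve the "zh-Hant" content to a user agent asking for "zh-Hant-CN", and fall back to the default only when nothing in the priority list matches.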

> 4. You may find that, if you recommend what you intend to, certain applications are hindered.
> In particular, some languages (Chinese!) use varying scripts and need the script subtag from
> RFC 4646. Your recommendation will stand in the way of that. Although a validating
> implementation of 4646 adds a bit of overhead, a "well-formed" implementation isn't nearly
> as difficult (it can be done with an admittedly-very-long regular expression). A better suggestion
> might be to recommend using the 3066 ABNF for "validation" (for its simplicity).

Thanks for pointing that out. We certainly don't want to hinder any
applications or exclude any languages.

With regard to the regex, I found this:
http://unicode.org/cldr/data/tools/java/org/unicode/cldr/util/data/langtagRegex.txt

If it is known to be suitable, I can make a note in the spec that
implementers might like to look at the Unicode regex code. At least it
takes the pain out of trying to translate the ABNF into a regex (or
having to implement an ABNF parser).
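
To give a feel for the difference in size between the two grammars, here is a rough Python sketch. It is not the Unicode regex linked above. The first pattern follows the RFC 3066 ABNF; the second is a simplified approximation of the RFC 4646 "langtag" production that deliberately leaves out grandfathered and private-use-only tags. The names `RFC3066`, `RFC4646_SIMPLIFIED`, `is_well_formed` and `script_of` are made up for illustration.

```python
import re

# RFC 3066 shape: a primary subtag of 1-8 letters, followed by any number
# of subtags of 1-8 letters or digits.
RFC3066 = re.compile(r"[A-Za-z]{1,8}(-[A-Za-z0-9]{1,8})*")

# Simplified RFC 4646 "langtag" shape. Assumption: grandfathered tags and
# tags consisting only of a private-use sequence are not covered here.
RFC4646_SIMPLIFIED = re.compile(
    r"""
    (?P<language>[a-z]{2,3}(?:-[a-z]{3}){0,3}|[a-z]{4,8})
    (?:-(?P<script>[a-z]{4}))?
    (?:-(?P<region>[a-z]{2}|[0-9]{3}))?
    (?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)
    (?P<extensions>(?:-[0-9a-wy-z](?:-[a-z0-9]{2,8})+)*)
    (?:-x(?:-[a-z0-9]{1,8})+)?
    """,
    re.IGNORECASE | re.VERBOSE,
)


def is_well_formed(tag):
    """True if the whole tag fits the simplified RFC 4646 shape."""
    return RFC4646_SIMPLIFIED.fullmatch(tag) is not None


def script_of(tag):
    """Return the script subtag (e.g. 'Hant'), or None if there is none."""
    match = RFC4646_SIMPLIFIED.fullmatch(tag)
    return match.group("script") if match else None


if __name__ == "__main__":
    for tag in ("zh-Hant-TW", "zh-Hans", "en-US", "en--US"):
        print(tag, is_well_formed(tag), script_of(tag))
```

Both patterns accept "zh-Hant-TW", but only the second one identifies "Hant" as a script subtag, which is what the Chinese case needs.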

Kind regards,
Marcos

-- 
Marcos Caceres
http://datadriven.com.au

Received on Thursday, 15 May 2008 02:08:53 UTC