Re: rfc4646. Some analysis code

Grandfathered tags are those registered under RFC 1766 or RFC 3066 whose 
subtags are not all in the subtag registry.

There are two classes of grandfathered tags:

1. Tags which are "well-formed" but for which some subtags are not 
registered.

2. Tags which are not well-formed and which are only valid as 
grandfathered tags (which are called "irregular").

If your processor does validation, you need the complete list of 
grandfathered tags. If your processor only does well-formedness 
checking, you only need the list of irregular tags:

irregular     = "en-GB-oed" / "i-ami" / "i-bnn" / "i-default"
               / "i-enochian" / "i-hak" / "i-klingon" / "i-lux"
               / "i-mingo" / "i-navajo" / "i-pwn" / "i-tao"
               / "i-tay" / "i-tsu" / "sgn-BE-fr" / "sgn-BE-nl"
               / "sgn-CH-de"

Please note that this latter list is closed: it will never change.
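
As a rough illustration of how a well-formedness checker might use that list, here is a minimal Python sketch. It is not taken from the RFC or from any existing implementation. The names IRREGULAR, is_irregular and is_well_formed are made up for this example, and matches_langtag stands in for whatever grammar check your processor already has. Tags are lowercased before lookup because language tags are compared case-insensitively.

```python
# Irregular grandfathered tags, copied from the list above.
# Stored lowercased because language tags are case-insensitive.
IRREGULAR = frozenset(tag.lower() for tag in (
    "en-GB-oed", "i-ami", "i-bnn", "i-default",
    "i-enochian", "i-hak", "i-klingon", "i-lux",
    "i-mingo", "i-navajo", "i-pwn", "i-tao",
    "i-tay", "i-tsu", "sgn-BE-fr", "sgn-BE-nl",
    "sgn-CH-de",
))


def is_irregular(tag: str) -> bool:
    """Return True if tag is one of the irregular grandfathered tags."""
    return tag.lower() in IRREGULAR


def is_well_formed(tag: str, matches_langtag) -> bool:
    """Accept irregular tags outright, otherwise defer to the grammar check.

    matches_langtag is a hypothetical callable (str -> bool) supplied by
    the caller; it is assumed to implement the regular langtag grammar.
    """
    return is_irregular(tag) or matches_langtag(tag)
```

Since the list is closed, hard-coding it this way should be safe. A processor that does full validation would still need the complete grandfathered list and the registry data on top of this.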

There is also test data in some of the LTRU WG archives, which you 
would do well to try.

Best Regards,

Addison

Dave Pawson wrote:
> 
> After much struggling, I've posted a first attempt at
> parsing langtag from rfc4646. Seemingly there isn't any code
> around. There is now.
> http://www.dpawson.co.uk/java/rfc4646.html
> GPL.
> 
> I couldn't make much sense of grandfathered. I'll try and add it
> in if someone would be kind enough to explain it.
> 
> The regexes and tests cover all the sub-elements of the RFC.
> 
> Enjoy.
> Any problems, please let me know. I'll try and fix it.
> 
> 
> regards
> 

-- 
Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.

Received on Tuesday, 7 November 2006 20:02:01 UTC