
Re: rfc4646. Some analysis code

From: Addison Phillips <addison@yahoo-inc.com>
Date: Tue, 07 Nov 2006 12:01:31 -0800
Message-ID: <4550E61B.7020609@yahoo-inc.com>
To: Dave Pawson <dave.pawson@gmail.com>
CC: I18N <www-international@w3.org>

Grandfathered tags are those registered under RFC 1766 or RFC 3066 whose 
subtags are not all in the subtag registry.

There are two classes of grandfathered tags:

1. Tags which are "well-formed" but for which some subtags are not 
in the subtag registry.

2. Tags which are not well-formed and which are only valid as 
grandfathered tags (which are called "irregular").

If your processor does validation, you need the complete list of 
grandfathered tags. If your processor only does well-formedness 
checking, you only need the list of irregular tags:

irregular     = "en-GB-oed" / "i-ami" / "i-bnn" / "i-default"
               / "i-enochian" / "i-hak" / "i-klingon" / "i-lux"
               / "i-mingo" / "i-navajo" / "i-pwn" / "i-tao"
               / "i-tay" / "i-tsu" / "sgn-BE-fr" / "sgn-BE-nl"
               / "sgn-CH-de"

Please note that this latter list is closed: it will never change.
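
As a rough illustration of how a well-formedness checker might use the 
irregular list above, here is a minimal Python sketch. The names 
IRREGULAR, is_irregular, is_well_formed and matches_grammar are 
hypothetical and do not come from the RFC or from Dave's code; the check 
against the regular langtag grammar is assumed to be supplied by the 
caller (for example, the existing regexes).

```python
# The 17 irregular grandfathered tags, copied from the list above.
# Stored lowercased because language tags are compared case-insensitively.
IRREGULAR = frozenset(tag.lower() for tag in (
    "en-GB-oed", "i-ami", "i-bnn", "i-default",
    "i-enochian", "i-hak", "i-klingon", "i-lux",
    "i-mingo", "i-navajo", "i-pwn", "i-tao",
    "i-tay", "i-tsu", "sgn-BE-fr", "sgn-BE-nl",
    "sgn-CH-de",
))


def is_irregular(tag):
    """Return True if tag is one of the irregular grandfathered tags."""
    return tag.lower() in IRREGULAR


def is_well_formed(tag, matches_grammar):
    """Return True if tag is irregular or passes the regular grammar check.

    matches_grammar is an assumed, caller-supplied function that takes a
    tag string and returns True when it matches the regular grammar.
    """
    # Irregular tags are tested first because they would fail the
    # regular grammar check by definition.
    return is_irregular(tag) or matches_grammar(tag)
```

Because the list is closed, hard-coding it as a constant should be safe. 
A validating processor would still need the complete grandfathered list 
and the subtag registry, which this sketch does not cover.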

Please note that there is test data in some of the LTRU WG archives 
which you would do well to try.

Best Regards,


Dave Pawson wrote:
> After much struggling, I've posted a first attempt at
> parsing langtag from rfc4646. Seemingly there isn't any code
> around. There is now.
> http://www.dpawson.co.uk/java/rfc4646.html
> GPL.
> I couldn't make much sense of grandfathered tags. I'll try to add them
> in if someone would be kind enough to explain them.
> The regexes and tests cover all the sub-elements of the RFC.
> Enjoy.
> Any problems, please let me know. I'll try and fix it.
> regards

Addison Phillips
Globalization Architect -- Yahoo! Inc.

Internationalization is an architecture.
It is not a feature.
Received on Tuesday, 7 November 2006 20:02:01 UTC
