Re: rfc4646. Some analysis code

Dave Pawson scripsit:

> After much struggling, I've posted a first attempt at parsing langtag
> from rfc4646. Seemingly there isn't any code around. There is now.
> http://www.dpawson.co.uk/java/rfc4646.html GPL.

Excellent!

> I couldn't make much sense of grandfathered. I'll try and add it
> in if someone would be kind enough to explain it.

You should ignore the "grandfathered" production in the ABNF altogether.
It will be replaced in the next RFC (temporarily called "RFC 4646bis")
with the following production:

irregular     = "en-GB-oed" / "i-ami" / "i-bnn" / "i-default"
              / "i-enochian" / "i-hak" / "i-klingon" / "i-lux"
              / "i-mingo" / "i-navajo" / "i-pwn" / "i-tao"
              / "i-tay" / "i-tsu" / "sgn-BE-fr" / "sgn-BE-nl"
              / "sgn-CH-de"

Your code should simply check if a tag is case-insensitively equal to
any of those 17 strings, and if so, declare it well-formed without further
investigation.  This list is permanently fixed, so it is safe to freeze
it into code.
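
A minimal sketch of that check in Java, since that is what your parser is
written in. It assumes Java 9 or later (for Set.of); the class and method
names are placeholders of mine, not anything from the RFC or from your code.
The tags are stored lowercased so that a single lowercase conversion of the
input gives the case-insensitive comparison:

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper; the names are illustrative only.
final class IrregularTags {

    // The 17 irregular tags from the production above, stored lowercased.
    static final Set<String> IRREGULAR = Set.of(
        "en-gb-oed", "i-ami", "i-bnn", "i-default",
        "i-enochian", "i-hak", "i-klingon", "i-lux",
        "i-mingo", "i-navajo", "i-pwn", "i-tao",
        "i-tay", "i-tsu", "sgn-be-fr", "sgn-be-nl",
        "sgn-ch-de");

    private IrregularTags() {
        // Static utility class; not meant to be instantiated.
    }

    // True if the tag matches one of the irregular tags, ignoring case.
    // Locale.ROOT keeps the lowercasing independent of the default locale.
    static boolean isIrregular(String tag) {
        return tag != null && IRREGULAR.contains(tag.toLowerCase(Locale.ROOT));
    }
}
```

With this in place, IrregularTags.isIrregular("I-Klingon") is true and
IrregularTags.isIrregular("en-GB") is false, so a parser could call it
first and fall through to the regular parsing only on a false result.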

"Grandfathered" is a semantic concept (the meaning of the tag cannot be
deduced from its parts); "irregular" a syntactic one (the tag cannot be
parsed into parts using the regular parsing algorithms).  All irregular
tags are grandfathered, but not all grandfathered tags are irregular.
Unfortunately this distinction was not clarified until after RFC 4646 was
published.

-- 
At the end of the Metatarsal Age, the dinosaurs     John Cowan
abruptly vanished. The theory that a single         cowan@ccil.org
catastrophic event may have been responsible        http://www.ccil.org/~cowan
has been strengthened by the recent discovery of
a worldwide layer of whipped cream marking the
Creosote-Tutelary boundary.             --Science Made Stupid

Received on Tuesday, 7 November 2006 19:26:13 UTC