- From: John Cowan <cowan@ccil.org>
- Date: Tue, 7 Nov 2006 14:59:00 -0500
- To: Dave Pawson <dave.pawson@gmail.com>
- Cc: I18N <www-international@w3.org>
Dave Pawson scripsit:

> No problem.... but how does it fit in with the
> langtag = (language
>            ["-" script]
>            ["-" region]
>            *("-" variant)
>            *("-" extension)
>
> grouping? You make it sound like an alternate? is that right?

Exactly.  You can see the new draft ABNF at
http://inter-locale.com/ID/draft-ietf-ltru-4646bis-01.html
(when replaced, it'll become -02, -03, etc. etc.)

> >"Grandfathered" is a semantic concept (the meaning of the tag cannot be
> >deduced from its parts); "irregular" a syntactic one (the tag cannot be
> >parsed into parts using the regular parsing algorithms).  All irregular
> >tags are grandfathered, but not all grandfathered tags are irregular.
> >Unfortunately this distinction was not clarified until after 4646 was
> >published.
>
> Mmmm. I won't pretend to understand that! I guess the discussion
> was more than the resulting text :-)

Okay, let me unpack that a bit.

Most 4646 language tags follow the general pattern of
language-script-region-variant, with all but the first part optional.
The ABNF makes it possible to (a) recognize a well-formed tag and
(b) take it apart into the four components.  Then you can look in the
Language Subtag Registry at
http://www.iana.org/assignments/language-subtag-registry
to find out what the various subtags mean.

There are some exceptions, however, based on tags that were registered
before we adopted these rules.  For example, "sgn" means "sign languages"
and "US" means "in the United States", but "sgn-us" does not mean "any
sign language used in the United States", it means the specific sign
language called "American Sign Language".  A tag like this has the
regular form, but its meaning is grandfathered.  You can recognize a tag
like this using the ABNF, but if you try to understand its meaning piece
by piece, you get the wrong answer.  All such tags are listed in the
Language Subtag Registry.

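To make the two steps concrete, here is a rough Python sketch of "check
the whole tag first, then take it apart".  It is an illustration only,
not the ABNF: the regular expression is a simplified
language-script-region-variant pattern that leaves out extensions and
private use, the names REGULAR, WHOLE_TAG_MEANINGS and describe are made
up for this example, and the lookup table holds just the one tag
discussed above where a real program would load the list from the
registry.

```python
import re

# Simplified language-script-region-variant pattern (an assumption for
# illustration; extensions and private-use subtags are left out).
REGULAR = re.compile(
    r"^(?P<language>[a-z]{2,3})"
    r"(?:-(?P<script>[a-z]{4}))?"
    r"(?:-(?P<region>[a-z]{2}|[0-9]{3}))?"
    r"(?P<variants>(?:-(?:[a-z0-9]{5,8}|[0-9][a-z0-9]{3}))*)$",
    re.IGNORECASE,
)

# Hypothetical stand-in for the registry's whole-tag entries.  Only the
# example from this message is listed; the real list lives in the registry.
WHOLE_TAG_MEANINGS = {"sgn-us": "American Sign Language"}


def describe(tag):
    """Return the whole-tag meaning if the tag is listed, else its parts."""
    key = tag.lower()
    if key in WHOLE_TAG_MEANINGS:
        # Meaning is attached to the tag as a whole, not to its subtags.
        return {"grandfathered": WHOLE_TAG_MEANINGS[key]}
    match = REGULAR.match(tag)
    if match is None:
        # Does not fit the regular pattern, so it cannot be taken apart.
        return None
    variants = [v for v in match.group("variants").split("-") if v]
    return {
        "language": match.group("language"),
        "script": match.group("script"),
        "region": match.group("region"),
        "variants": variants,
    }
```

The lookup has to come before the parse: "sgn-us" matches the regular
pattern perfectly well, so parsing it first would hand back "sgn" plus
"us" and the piece-by-piece reading that gives the wrong answer.
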
Furthermore, some of the grandfathered tags are also irregular: they
don't match the language-script-region-variant pattern at all, and you
cannot take them apart.  "i-hak" is an example of this: it means "Hakka
Chinese".  These tags are also listed in the Language Subtag Registry.

> One more if I may.
>
> Same topic.
>
> grandfathered = 1*3ALPHA 1*2("-" (2*8alphanum))
>                 ; grandfathered registration
>                 ; Note: i is the only singleton
>                 ; that starts a grandfathered tag
>
> Why wasn't it
> i 1*2ALPHA ......
> Any particular reason?

That's not what the comment is telling you.  It says that various 2ALPHA
and 3ALPHA subtags can begin a grandfathered tag, but the only 1ALPHA
subtag ("singleton") that can begin one is "i".

Anyhow, that doesn't matter if you just use the list of 17 irregular
tags directly and don't worry about this definition.

-- 
Take two turkeys, one goose, four               John Cowan
cabbages, but no duck, and mix them             http://www.ccil.org/~cowan
together.  After one taste, you'll duck         cowan@ccil.org
soup the rest of your life.  --Groucho
Received on Tuesday, 7 November 2006 19:59:17 UTC