Seaborne, Andy wrote:
> Arjohn Kampman wrote:
>> - <LANGTAG> only allows language tags that consist of at most two
>> components. However, the following document also seems to use tags
>> with three or more components, like "zh-min-nan" and "en-GB-oed":
>> http://www.iana.org/assignments/language-tags
>
> I believe the grammar is already aligned with RFC 3066. The token A2Z
> is not 2 characters; it's "A to Z".
>
> <LANGTAG> ::= '@' <A2Z>+ ('-' (<A2ZN>)+)?
>
> and RFC 3066 has:
>
> Language-Tag = Primary-subtag *( "-" Subtag )
> Primary-subtag = 1*8ALPHA
> Subtag = 1*8(ALPHA / DIGIT)
>
> (I also note that the language-tags document has tags that are not
> covered by the production in 3066).

Please note that these two production rules are not equivalent: the
former allows 0 or 1 subtags but the latter allows 0 or more. The '*'
character before the subtag part denotes this (don't you just love these
alternative BNF notations... ;-)).

This should be fixed by replacing the '?' character with a '*' in the
former rule, i.e.:

replace:
  <LANGTAG> ::= '@' <A2Z>+ ('-' (<A2ZN>)+)?
with
  <LANGTAG> ::= '@' <A2Z>+ ('-' (<A2ZN>)+)*

--
Arjohn

Received on Tuesday, 29 March 2005 12:27:01 GMT
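
As a rough illustration of the difference between the '?' and '*' forms of the <LANGTAG> rule, here is a minimal Python sketch that transcribes both into regular expressions. It assumes that <A2Z> means one ASCII letter of either case and <A2ZN> one ASCII letter or digit; the names LANGTAG_CURRENT, LANGTAG_PROPOSED and accepts are made up for this example and do not come from any parser.

```python
import re

# Assumption: <A2Z> is an ASCII letter, <A2ZN> is an ASCII letter or digit.
A2Z = "[A-Za-z]"
A2ZN = "[A-Za-z0-9]"

# Current rule:  <LANGTAG> ::= '@' <A2Z>+ ('-' (<A2ZN>)+)?
LANGTAG_CURRENT = re.compile(rf"@{A2Z}+(?:-{A2ZN}+)?")

# Proposed rule: <LANGTAG> ::= '@' <A2Z>+ ('-' (<A2ZN>)+)*
LANGTAG_PROPOSED = re.compile(rf"@{A2Z}+(?:-{A2ZN}+)*")


def accepts(pattern, token):
    """Return True if the whole token matches the given rule."""
    return pattern.fullmatch(token) is not None


if __name__ == "__main__":
    # Print, for each token, whether the current and the proposed rule accept it.
    for token in ["@en", "@en-GB", "@zh-min-nan", "@en-GB-oed"]:
        print(token, accepts(LANGTAG_CURRENT, token), accepts(LANGTAG_PROPOSED, token))
```

With these assumptions, the '?' form accepts "@en" and "@en-GB" but rejects "@zh-min-nan" and "@en-GB-oed", while the '*' form accepts all four. Neither pattern enforces the 1*8 length limits from RFC 3066, since the <LANGTAG> rule does not state them either.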