- From: John Cowan <cowan@ccil.org>
- Date: Thu, 12 Apr 2007 13:00:13 -0400
- To: Mark Davis <mark.davis@icu-project.org>
- Cc: Richard Ishida <ishida@w3.org>, LTRU Working Group <ltru@ietf.org>, www-international@w3.org
Mark Davis scripsit:

> The summary looks good. This discussion raises 2 items for the LTRU
> group.
>
> Q1. What tag should be used where it is definitely a language, but there
> is no code available yet? (This is an area where ISO 15924 is ahead
> of ISO 639 (and 3166), since it has Zzzz: Code for uncoded script.)

In principle, every natural-language item (text, audio, video) can be
coded with some 639-2 code; if the language does not have a code of its
own, it will belong to one of the 639-2 collections.

For example, the language Tarifit (639-3 code 'rif') does not have a
639-2 code, but it is a Berber language; consequently, an item in Tarifit
may be validly tagged 'ber', which represents the collection of Berber
languages.

Similarly, the language Zumbun (639-3 code 'jmb') does not have a 639-2
code, nor does it belong to any of the smaller 639-2 collections, but it
does belong to the Afro-Asiatic language family; consequently, an item in
Zumbun may be validly tagged 'afa', which represents the collection of
Afro-Asiatic languages.

If all else fails, as for the language isolate Burushaski (639-3 code
'bsk'), the 639-2 collection code 'mis', representing the collection of
miscellaneous languages, may be applied. This is the ultimate fallback
code, indicating that the language is known but nothing useful can be
said about it using 639-2 codes.

All of this lore, which represents the practice of the Library of
Congress (the ultimate source of 639-2), can of course go away when RFC
4646bis goes into effect. If it is necessary to be more specific before
then, and if strict compliance with RFC 4646 is required, then
ber-x-tarifit, afa-x-zumbun, and mis-x-burushas may also be used.

> Q2. Clarify the wording around "und" vs "".

"" is not a well-formed language tag according to RFC 4646, so there is
nothing to say about it there. It is defined by the XML Recommendation
as an extension to the set of language tags, and has the same
significance as no language declaration at all.

-- 
Dream projects long deferred            John Cowan <cowan@ccil.org>
usually bite the wax tadpole.           http://www.ccil.org/~cowan
        --James Lileks
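
The 639-2 fallback chain described in the answer to Q1 can be sketched
roughly as follows. This is only an illustration: the function name
`fallback_tag` and the table `COLLECTION_639_2` are hypothetical, the
table holds just the two collection assignments mentioned in this
message, and a real implementation would need the full 639-2 collection
data. The only RFC 4646 rule relied on is that a private-use subtag is
1 to 8 ASCII alphanumeric characters.

```python
# Hypothetical table: 639-3 code -> 639-2 collection code.
# Only the examples from this message are listed (assumption: a real
# table would be built from the Library of Congress 639-2 data).
COLLECTION_639_2 = {
    "rif": "ber",  # Tarifit -> Berber languages
    "jmb": "afa",  # Zumbun  -> Afro-Asiatic languages
}


def fallback_tag(code_639_3, name=None):
    """Return an RFC 4646 tag for a language that has no 639-2 code.

    Falls back to the language's 639-2 collection, or to 'mis' when no
    collection applies. If a name is given, it is appended as a
    private-use subtag, reduced to at most 8 ASCII alphanumerics.
    """
    collection = COLLECTION_639_2.get(code_639_3, "mis")
    if name is None:
        return collection
    # Keep only ASCII letters and digits, lowercase, truncate to 8 chars.
    subtag = "".join(c for c in name.lower() if c.isascii() and c.isalnum())[:8]
    if not subtag:
        return collection
    return f"{collection}-x-{subtag}"


if __name__ == "__main__":
    for code, name in [("rif", "Tarifit"), ("jmb", "Zumbun"), ("bsk", "Burushaski")]:
        print(code, fallback_tag(code), fallback_tag(code, name))
```

Using 'mis' as the dictionary default mirrors the "ultimate fallback"
role given to that code above; it says the language is known, not that
it is undetermined.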
Received on Thursday, 12 April 2007 17:00:32 UTC