- From: Mark Davis <mark.davis@icu-project.org>
- Date: Thu, 12 Apr 2007 10:29:20 -0700
- To: "John Cowan" <cowan@ccil.org>
- Cc: "Richard Ishida" <ishida@w3.org>, "LTRU Working Group" <ltru@ietf.org>, www-international@w3.org
- Message-ID: <30b660a20704121029s2c04f403n4e3cca83ec05e12b@mail.gmail.com>
Q1. I had missed the choice of "mis". I agree with that suggestion; we
should incorporate that into 4646bis. The problem is ameliorated
considerably once we add 639-3, but it doesn't disappear completely, so
"mis" remains a good choice for dealing with that situation.

Q2. The issue *does* remain, since we talk about "und" vs the absence of
a language tag, which "" represents.

Mark

On 4/12/07, John Cowan <cowan@ccil.org> wrote:
>
> Mark Davis scripsit:
>
> > The summary looks good. This discussion raises 2 items for the LTRU
> > group.
> >
> > Q1. What tag should be used where it is definitely a language, but there
> > is no code available yet? (This is an area where ISO 15924 is ahead
> > of ISO 639 (and 3166), since it has Zzzz: Code for uncoded script.)
>
> In principle, every natural-language item (text, audio, video) can be
> coded with some 639-2 code; if the language does not have a code of its
> own, it will belong to one of the 639-2 collections.
>
> For example, the language Tarifit (639-3 code 'rif') does not have a 639-2
> code, but it is a Berber language; consequently, an item in Tarifit may be
> validly tagged 'ber', which represents the collection of Berber languages.
> Similarly, the language Zumbun (639-3 code 'jmb') does not have a 639-2
> code, nor does it belong to any of the smaller 639-2 collections, but it
> does belong to the Afro-Asiatic language family; consequently, an item
> in Zumbun may be validly tagged 'afa', which represents the collection
> of Afro-Asiatic languages.
>
> If all else fails, as for the language isolate Burushaski (639-3 code
> 'bsk'), the 639-2 collection code 'mis', representing the collection of
> miscellaneous languages, may be applied. This is the ultimate fallback
> code, indicating that the language is known but nothing useful can be
> said about it using 639-2 codes.
>
> All of this lore, which represents the practice of the Library of Congress
> (the ultimate source of 639-2), can of course go away when RFC 4646bis
> goes into effect. If it is necessary to be more specific before then,
> and if strict compliance to 4646 is required, then ber-x-tarifit,
> afa-x-zumbun, and mis-x-burushas may also be used.
>
> > Q2. Clarify the wording around "und" vs "".
>
> "" is not a well-formed language tag according to RFC 4646, so there is
> nothing to say about it there. It is defined by the XML Recommendation as
> an extension to the set of language tags, and having the same significance
> as no language declaration at all.
>
> --
> Dream projects long deferred            John Cowan <cowan@ccil.org>
> usually bite the wax tadpole.           http://www.ccil.org/~cowan
>         --James Lileks
>

--
Mark
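
For readers who want the fallback chain from the quoted message in concrete form, here is a minimal Python sketch. It is not taken from RFC 4646 or from any library: the names ISO639_2_SUBSET, COLLECTION_CHAIN, private_subtag and fallback_tag are hypothetical, and the tables hold only the three languages mentioned above rather than the real code lists. The truncation to 8 characters reflects the RFC 4646 length limit on private-use subtags.

```python
# Illustrative subset of 639-2 codes; a real table would hold the full list.
ISO639_2_SUBSET = {"ber", "afa", "mis"}

# Hypothetical map: 639-3 code -> 639-2 collections, narrowest first.
# Only the three examples from the message are filled in.
COLLECTION_CHAIN = {
    "rif": ["ber", "afa"],  # Tarifit: Berber, then Afro-Asiatic
    "jmb": ["afa"],         # Zumbun: Afro-Asiatic only
    "bsk": [],              # Burushaski: isolate, no collection applies
}


def private_subtag(name):
    """Lowercase the name, keep ASCII letters and digits, cut to 8 characters."""
    cleaned = "".join(ch for ch in name.lower() if ch.isascii() and ch.isalnum())
    return cleaned[:8]


def fallback_tag(code_639_3, name=None):
    """Pick a 639-2 based tag for a language known only by its 639-3 code.

    Uses the code itself if it is also a 639-2 code, otherwise the narrowest
    known collection, otherwise 'mis'. If a name is given, it is appended as
    a private-use subtag to be more specific.
    """
    if code_639_3 in ISO639_2_SUBSET:
        primary = code_639_3
    else:
        primary = "mis"  # ultimate fallback: language known, nothing better to say
        for collection in COLLECTION_CHAIN.get(code_639_3, []):
            if collection in ISO639_2_SUBSET:
                primary = collection
                break
    if name:
        subtag = private_subtag(name)
        if subtag:
            return f"{primary}-x-{subtag}"
    return primary


if __name__ == "__main__":
    for code, name in [("rif", "Tarifit"), ("jmb", "Zumbun"), ("bsk", "Burushaski")]:
        print(code, fallback_tag(code), fallback_tag(code, name))
```

Under these assumptions, fallback_tag("bsk", "Burushaski") gives mis-x-burushas, matching the truncated form used in the message. Listing collections narrowest first is a design choice so that the most informative tag is picked before falling back to 'mis'.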
Received on Thursday, 12 April 2007 17:29:32 UTC