- From: Mark Davis <mark.davis@jtcsv.com>
- Date: Wed, 15 Dec 2004 07:20:07 -0800
- To: "Tex Texin" <tex@xencraft.com>
- Cc: <www-international@w3.org>, <ietf-languages@alvestrand.no>
> if the experts don't agree on the codes to use,
> either because the codes are ambiguous or because the decision process is so
> complex

This is really not the point. What does one call a dog? I could refer to it at many levels of detail. Animalia: Chordata: Mammalia: Carnivora: Canidae: Canis: Familiaris: down to breeds: Working: Herding: Sheepdog: German Shepherd... (And there is some disagreement on whether it should be Canis familiaris or Canis lupus familiaris.) And I might want to distinguish a gray German Shepherd dog from one with some tan, whereas you might not. Which is chosen depends on whether I want to be more or less specific, and make distinctions that you may not want to make.

John indicated that the purpose of the list is "plausible" language tags. If that is the criterion, then without having to do extensive and fairly difficult research, I'd say that it is each 639 code alone, then for each tag add the combinations of scripts that are used with it. Then for each of those tags that have significant speaker populations in different regions, add the combinations.

Rather than have an unholy long list, this would be far easier both to use and to maintain if composed of two tables:

language subtag => scripts in use
language {-script} tag => regions where used

Mark

----- Original Message -----
From: "Tex Texin" <tex@xencraft.com>
Cc: <www-international@w3.org>; <ietf-languages@alvestrand.no>
Sent: Wednesday, December 15, 2004 03:38
Subject: Re: Language Identifier List up for comments

> I have made some updates to the page.
> http://www.i18nguy.com/unicode/language-identifiers.html
>
> The mail volume and the fact that I get 3 copies of each, means it is going to
> take me some time to sort thru.
> (Feel free to take my name off the mail, I am subscribed to both lists where
> the thread appears.)
>
> My thanks to those of you that sent me private suggestions for language codes
> that I can research as to whether they are different in different countries,
> but I won't have time to do research. (Nor the appropriate skills.)
>
> I noted conflicting advice on cy and whether Patagonian is different from the
> version in the UK.
> For now both entries are in the table, and some of you can debate which is
> correct.
> For that matter it is not clear to me whether some of the en entries aren't
> close enough to be the same.
>
> I began the table of one-level entries.
> At some point, every 639 entry should be in one or the other table.
>
> I am glad to see all the caveats being pointed out in the thread about
> dependencies on usage, context, and how significant a language difference needs
> to be. To my feeble mind, if the experts don't agree on the codes to use,
> either because the codes are ambiguous or because the decision process is so
> complex, then surely there is no hope for the majority of the community that is
> responsible for choosing language tags. Which was my point.
>
> I still conclude that simple instructions that don't require decisions based on
> information that is not generally available, is the more reliable model. It is
> better for users and better for application developers.
>
> For the applications that linguists use, where the distinctions are much more
> important, the current state of the art might be reasonable. (But I wouldn't
> bet on it.)
>
> Cheers,
>
> Tex
>
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages@alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>
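
The two-table idea in Mark's reply above (language subtag => scripts in use, language {-script} tag => regions where used) could be sketched roughly as follows. This is only an illustration under stated assumptions: the names SCRIPTS_IN_USE, REGIONS_WHERE_USED, and plausible_tags are hypothetical, and the sample entries are illustrative data, not taken from the message or from any registry.

```python
# Table 1 (hypothetical sample data): language subtag => scripts in use.
# An empty list means no script subtag is worth distinguishing here.
SCRIPTS_IN_USE = {
    "en": [],
    "cy": [],
    "sr": ["Cyrl", "Latn"],
    "zh": ["Hans", "Hant"],
}

# Table 2 (hypothetical sample data): language{-script} tag => regions where used.
REGIONS_WHERE_USED = {
    "en": ["GB", "US"],
    "cy": ["AR", "GB"],
    "zh-Hans": ["CN", "SG"],
    "zh-Hant": ["HK", "TW"],
}


def plausible_tags(language):
    """Expand one language subtag into tags using the two tables.

    The bare language code always comes first, followed by each
    language-script combination; every one of those is then followed
    by its regional variants, if the second table lists any.
    """
    bases = [language]
    bases += [f"{language}-{script}" for script in SCRIPTS_IN_USE.get(language, [])]

    tags = []
    for base in bases:
        tags.append(base)
        tags.extend(f"{base}-{region}" for region in REGIONS_WHERE_USED.get(base, []))
    return tags


if __name__ == "__main__":
    for lang in SCRIPTS_IN_USE:
        print(lang, "=>", ", ".join(plausible_tags(lang)))
```

With the sample data, plausible_tags("zh") yields zh, zh-Hans, zh-Hans-CN, zh-Hans-SG, zh-Hant, zh-Hant-HK, zh-Hant-TW. The long flat list is derived on demand, so only the two small tables need to be maintained.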
Received on Wednesday, 15 December 2004 15:20:21 UTC