- From: Mark Davis <mark.davis@jtcsv.com>
- Date: Fri, 17 Dec 2004 08:09:05 -0800
- To: "John Cowan" <jcowan@reutershealth.com>, "Tex Texin" <tex@xencraft.com>
- Cc: <www-international@w3.org>, <ietf-languages@alvestrand.no>
I agree that there has been some useful dialog about this topic; always helps to center it when people are faced with a real list. The language on the page is still extremely misleading, however. Here are my recommendations:

First, best to always use region instead of 'country'. Many of the regions are not countries, and some people get miffed about it.

Language identifiers as specified by RFC 3066, can have the form language, language-country, language-country-variant and some other specialized forms. The guidelines for choosing between language and language-country are ambiguous.

[The guidelines are clear; what is not clear is when there is a physical difference. Talk of "ambiguity" is very misleading. The tags aren't ambiguous; the most you can say is that the languages that they denote are not materially different, for some definition of "materially". Moreover, "language identifier" is used 'ambiguously' -- you have language identifier mean both a language tag, and a language tag formed from one lang subtag.]

=> Language identifiers (tags), as specified by RFC 3066, can have the form lang, lang-region, and some other specialized forms, where lang and region are subtags using ISO codes. (There is a proposed successor <http://www.inter-locale.com/ID/draft-phillips-langtags-08.html> to RFC 3066 that extends this further.) However, the RFC does not identify which lang-region identifiers do not distinguish a written form that is, for most localization purposes, materially different from that distinguished by the corresponding lang identifiers.

This table lists the languages which have no other significant variations, and therefore can be adequately represented by a language subtag alone, as opposed to a language subtag and country subtag. In this table, where the identifiers show the country tag, but it can be removed without causing ambiguity.
[This is way too definitive, and the last sentence is just plain wrong: the list owners can never say that there are no significant variations. Moreover, "the languages" makes it out to be a complete list, which it will never be. Even if the data were true and well-known, a complete list would be *anything* starting with af, am, as, ...]

=> This table lists some lang-region identifiers which, for most localization purposes, do not need to have the region subtag included.

Some languages are spoken in many countries, and the language is not distinctive in each country. I have started to accept suggestions as to which language-region codes do not represent a distinct language variation, and therefore are not recommended as tags, without good reason.

[looks like old, redundant text. nuke.]

The tags which are not recommended will look like this sentence.

[Whoa - not recommended? The one example given is way off the mark!

de‑AT, de‑CH, de‑DE, (de‑BE, de‑DK, de‑LI, de‑LU)

I use (...) for the 'not recommended' since the color distinction will not show here. de‑LI absolutely has a meaning. de-LI is certainly as different from de-DE as de-CH is! The recommendation by the text that de-LI should just be replaced by "de" is *way* off. The most you could do is say, for example, the following:

de‑AT, de‑CH (de‑LI), de‑DE (de‑BE, de‑DK, de‑LU)

and say that the identifiers in (...) are ones that do not materially differ in denotation from the one listed before them, for most localization purposes. Even that is pretty dicey.]

Table...
en‑AG, en‑AI, en‑AS, en‑AU, en‑IN, en‑BB, en‑BE, en‑BM, en‑BN, en‑BS, en‑BW, en‑BZ, en‑CA, en‑CK, en‑CM, en‑DM, en‑ER, en‑ET, en‑FJ, en‑FK, en‑FM, en‑GB, en‑GD, en‑GH, en‑GI, en‑GM, en‑GU, en‑GY, en‑HK, en‑IE, en‑IL, en‑IO, en‑JM, en‑KE, en‑KI, en‑KN, en‑KY, en‑LC, en‑LR, en‑LS, en‑MH, en‑MP, en‑MS, en‑MT, en‑MU, en‑MW, en‑NA, en‑NF, en‑NG, en‑NR, en‑NU, en‑NZ, en‑PG, en‑PH, en‑PK, en‑PN, en‑PR, en‑PW, en‑RW, en‑SB, en‑SC, en‑SG, en‑SH, en‑SL, en‑SO, en‑SZ, en‑TC, en‑TK, en‑TO, en‑TT, en‑UG, en‑UM, en‑US, en‑VC, en‑VG, en‑VI, en‑VU, en‑WS, en‑ZA, en‑ZM, en‑ZW    English

If you want feedback on the table from those who have not memorized country codes, and to make it more comprehensible to people, I suggest you include a more descriptive name. Even better would be to have an alternate table or column, but that might be more maintenance for you. I'd also suggest having the language on the left.

Included descriptive name

en (English)

en-AG (Antigua and Barbuda), en-AI (Anguilla), en-AS (American Samoa), en-AU (Australia), en-IN (India), en-BB (Barbados), en-BE (Belgium), en-BM (Bermuda), en-BN (Brunei), en-BS (Bahamas), en-BW (Botswana), en-BZ (Belize), en-CA (Canada), en-CK (Cook Islands), en-CM (Cameroon), en-DM (Dominica), en-ER (Eritrea), en-ET (Ethiopia), en-FJ (Fiji), en-FK (Falkland Islands), en-FM (Micronesia), en-GB (United Kingdom), en-GD (Grenada), en-GH (Ghana), en-GI (Gibraltar), en-GM (Gambia), en-GU (Guam), en-GY (Guyana), en-HK (Hong Kong S.A.R., China), en-IE (Ireland), en-IL (Israel), en-IO (British Indian Ocean Territory), en-JM (Jamaica), en-KE (Kenya), en-KI (Kiribati), en-KN (Saint Kitts and Nevis), en-KY (Cayman Islands), en-LC (Saint Lucia), en-LR (Liberia), en-LS (Lesotho), en-MH (Marshall Islands), en-MP (Northern Mariana Islands), en-MS (Montserrat), en-MT (Malta), en-MU (Mauritius), en-MW (Malawi), en-NA (Namibia), en-NF (Norfolk Island), en-NG (Nigeria), en-NR (Nauru), en-NU (Niue), en-NZ (New Zealand), en-PG (Papua New Guinea), en-PH (Philippines), en-PK (Pakistan), en-PN (Pitcairn), en-PR (Puerto Rico), en-PW (Palau), en-RW (Rwanda), en-SB (Solomon Islands), en-SC (Seychelles), en-SG (Singapore), en-SH (Saint Helena), en-SL (Sierra Leone), en-SO (Somalia), en-SZ (Swaziland), en-TC (Turks and Caicos Islands), en-TK (Tokelau), en-TO (Tonga), en-TT (Trinidad and Tobago), en-UG (Uganda), en-UM (United States Minor Outlying Islands), en-US (United States), en-VC (Saint Vincent and the Grenadines), en-VG (British Virgin Islands), en-VI (U.S. Virgin Islands), en-VU (Vanuatu), en-WS (Samoa), en-ZA (South Africa), en-ZM (Zambia), en-ZW (Zimbabwe)

Alternate Table/Column

English

Antigua and Barbuda, Anguilla, American Samoa, Australia, India, Barbados, Belgium, Bermuda, Brunei, Bahamas, Botswana, Belize, Canada, Cook Islands, Cameroon, Dominica, Eritrea, Ethiopia, Fiji, Falkland Islands, Micronesia, United Kingdom, Grenada, Ghana, Gibraltar, Gambia, Guam, Guyana, Hong Kong S.A.R., China, Ireland, Israel, British Indian Ocean Territory, Jamaica, Kenya, Kiribati, Saint Kitts and Nevis, Cayman Islands, Saint Lucia, Liberia, Lesotho, Marshall Islands, Northern Mariana Islands, Montserrat, Malta, Mauritius, Malawi, Namibia, Norfolk Island, Nigeria, Nauru, Niue, New Zealand, Papua New Guinea, Philippines, Pakistan, Pitcairn, Puerto Rico, Palau, Rwanda, Solomon Islands, Seychelles, Singapore, Saint Helena, Sierra Leone, Somalia, Swaziland, Turks and Caicos Islands, Tokelau, Tonga, Trinidad and Tobago, Uganda, United States Minor Outlying Islands, United States, Saint Vincent and the Grenadines, British Virgin Islands, U.S. Virgin Islands, Vanuatu, Samoa, South Africa, Zambia, Zimbabwe

And given such a list, some items stand out. It is unclear why you should have variants for English as in China or Israel, but not English as in Russia or Egypt, for example.
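
To make the "descriptive name" suggestion concrete, here is a rough Python sketch of how such a column could be generated rather than maintained by hand. Everything in it is an assumption for illustration: the function names (split_tag, describe, table_row) are made up, the name tables hold only a few sample entries taken from the list above, and only the simple lang and lang-region forms are handled, not the other specialized forms RFC 3066 allows.

```python
# Hypothetical sketch: the names and the tiny lookup tables below are
# illustrative only; a real table would cover every region code used.
REGION_NAMES = {
    "AG": "Antigua and Barbuda",
    "AI": "Anguilla",
    "GB": "United Kingdom",
    "US": "United States",
}
LANGUAGE_NAMES = {"en": "English"}


def split_tag(tag):
    """Split a simple lang or lang-region tag into (lang, region or None)."""
    # Accept the ASCII hyphen and the non-breaking hyphen (U+2011),
    # since both spellings appear in the lists above.
    parts = tag.replace("\u2011", "-").split("-")
    lang = parts[0].lower()
    region = parts[1].upper() if len(parts) > 1 else None
    return lang, region


def describe(tag):
    """Return the tag followed by a descriptive name in parentheses."""
    lang, region = split_tag(tag)
    if region is None:
        return f"{lang} ({LANGUAGE_NAMES.get(lang, '?')})"
    return f"{lang}-{region} ({REGION_NAMES.get(region, '?')})"


def table_row(lang, tags):
    """Build one row with the language on the left, as suggested above."""
    return f"{describe(lang)}: " + ", ".join(describe(t) for t in tags)


if __name__ == "__main__":
    print(table_row("en", ["en-AG", "en-AI", "en-GB", "en-US"]))
```

Unknown codes come out as "?" rather than raising an error, so gaps in the name table are visible in the rendered row.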
Mark

----- Original Message -----
From: "John Cowan" <jcowan@reutershealth.com>
To: "Martin Duerst" <duerst@w3.org>
Cc: "Tex Texin" <tex@xencraft.com>; <www-international@w3.org>; <ietf-languages@alvestrand.no>
Sent: Friday, December 17, 2004 04:55
Subject: Re: Language Identifier List up for comments

> Martin Duerst scripsit:
>
> > - I think there has been enough cross-posting. I suggest we all
> >   limit further posts to ietf-languages@alvestrand.no.
> >   Please direct followups only to that list.
>
> If anything, I think the interest and expertise exist mainly on
> www-international. From the point of view of ietf-languages, these tags
> are all valid, period; "best practice" is not as central a concern
> there. (I know this because my attempts to get the list reviewed by
> ietf-languages have always gone nowhere, whereas this attempt is getting
> lots of review.)
>
> > - "Proposed List of 1-level Language Identifiers": Why on earth
> >   are two-level codes given when it says that one-level codes
> >   are the right thing to use? Please, please, don't confuse
> >   the readers with such stuff, and remove the country codes
> >   from the identifiers as quickly as possible.
>
> I agree completely. In addition, I think the entire third list should
> be migrated to the first list. These are simply the codes for which
> regional variation on the national level is *not known* to exist (as
> opposed to codes for which r.v. on the n.l. is *known not* to exist).
>
> In pursuit of that, the introductions to the two lists should be changed
> from "languages which have no other significant variations" to "languages
> which are not known to vary significantly in different countries", and
> likewise "languages which differ by region" should be "languages which
> vary significantly in different countries".
>
> --
> Go, and never darken my towels again!    John Cowan
>         --Rufus T. Firefly               www.ccil.org/~cowan
> _______________________________________________
> Ietf-languages mailing list
> Ietf-languages@alvestrand.no
> http://www.alvestrand.no/mailman/listinfo/ietf-languages
>
Received on Friday, 17 December 2004 16:09:11 UTC