Re: Language Identifier List Comments, updated

Tex,

Some comments below, based on my personal knowledge and background...

- I am surprised to see hu-HU and hu-SI as the only Hungarian extra tags. The Hungarian 
minority in Slovenia is very small; the three biggest Hungarian minorities are in Romania, 
Serbia and Slovakia, and there are also minorities in Ukraine and Austria. Even the last two 
are (afaik) larger than the one in Slovenia. So either one has to add them all, or none... 
B.t.w., if we really want to get into niceties, there are also Slovak and Serbian 
minorities in Hungary, so, e.g., sk-HU might be an issue (I certainly heard people speaking 
Serbian around Budapest when I was a kid). Finally, there is also a German minority in 
Hungary, well alive and kicking (as an anecdote: the current German Foreign Minister, 
Joschka Fischer, comes from that community...)

- AFAIK, Catalan is also an official language of Andorra. If you list French there, then to 
be systematic you must add Catalan, too...

- I also think that for a casual reader it helps to say explicitly that zh-hant is Chinese 
in the Traditional script and zh-hans is Chinese in the Simplified one. There is no such 
explanation right now. Also, I would add zh-hant to Hong Kong and Macao, too.

- You list Frisian both for the Netherlands and Germany. I am not sure whether the two are 
identical (just as fr-BE and fr-FR are not considered identical in this list...)

- I learn something every day... is Walloon a genuinely different language? I thought fr-BE 
would cover it... just as nl-BE covers Flemish!

- Political problem: you list Yugoslavia *and* Serbia/Montenegro. Both refer to the same 
political entity, afaik, and in both cases the Albanian of Kosovo has been forgotten! ;-(

- There is a strong movement to revive Provençal in France. Whether it deserves a separate 
entry, I am not sure, but you might want to consider it.
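
As a rough illustration of the distinctions raised above (language alone, language plus 
region as in hu-HU or hu-SI, and language plus script as in zh-hant or zh-hans), here is a 
minimal Python sketch that splits a tag into its parts. The function name split_tag and the 
rule "four letters means script, two letters means region" are assumptions made for this 
example only; it is not how the list on the page was built, and it ignores any other kind 
of subtag.

```python
def split_tag(tag):
    """Split a simple language tag into (language, script, region).

    Illustrative assumption: the tag has at most the shape
    language[-script][-region]; any other subtag is ignored.
    """
    parts = tag.replace("_", "-").split("-")
    language = parts[0].lower()
    script = None
    region = None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            # Four letters: treat as a script subtag, e.g. "hant" -> "Hant"
            script = part.title()
        elif len(part) == 2 and part.isalpha():
            # Two letters: treat as a 3166 region code, e.g. "si" -> "SI"
            region = part.upper()
    return language, script, region


if __name__ == "__main__":
    for tag in ("hu", "hu-HU", "hu-SI", "sk-HU", "zh-hant", "zh-hans", "zh-hant-HK"):
        print(tag, split_tag(tag))
```

With such a split, hu-HU and hu-SI differ only in the region part, while zh-hant and 
zh-hans differ only in the script part, which is the distinction a casual reader may miss.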

I hope that helps.

Ivan

Tex Texin wrote:
> I have updated the page with a new format for the tables.
> There is now just one table which lists all of the 3166 region codes, and for
> each code (some of) the languages that are spoken in the region. As you know
> John Cowan provided the original data which was based on official languages.
> I have found a number of additions of official languages, and in some cases
> added languages that are not official but used in the region.
> 
> Unfortunately, I have had to rely on a few different sources and there isn't a
> consistent rule as to percentage of people speaking the language in a region to
> qualify it for listing. In some cases, the choices were based on the time I had
> available to invest in this effort.
> 
> The goal of the list is still to identify language codes that should be
> language tag alone, vs. language-region tag
> and/or a meaningful criteria for making the distinction.
> 
> Informed and constructive comments are welcomed.
> 
> The first row of the table represents languages that I have not yet identified
> as belonging to a region.
> 
> Some of the languages are "constructed" (interlingua, esperanto, ido) and do
> not belong to any region.
> I might move them to a separate row for "constructed" languages.
> 
> The 3166 codes may not be fully up to date. I noted the YU entry, which should
> be deprecated. The list needs checking for other errors or problems.
> 
> The page is:
> http://www.i18nguy.com/unicode/language-identifiers.html
> 
> tex
> 
> 
> 

-- 

Ivan Herman
W3C Communications Team, Head of Offices
C/o W3C Benelux Office at CWI, Kruislaan 413
1098SJ Amsterdam, The Netherlands
tel: +31-20-5924163; mobile: +31-641044153;
URL: http://www.w3.org/People/all?pictures=yes#ivan

Received on Monday, 27 December 2004 08:53:31 UTC