Re: Language Identifier List Comments, updated

From: Tex Texin <tex@xencraft.com>
Date: Mon, 27 Dec 2004 02:13:20 -0800
Message-ID: <41CFE040.7D8EE3BB@xencraft.com>
To: Ivan Herman <ivan@w3.org>
CC: John Cowan <jcowan@reutershealth.com>, WWW International <www-international@w3.org>, IETF Languages <ietf-languages@iana.org>

Thanks Ivan. I made most of the changes you and Andrew proposed.

1) I agree that the codes with han* ideally should indicate Simplified and
Traditional, and then I probably should also do the same for the other scripts
(latn, cyrl, arab, etc.).
I can do it, although at some point the table is going to get very large. I
wonder whether it would be better not to repeat that information in each
entry, and instead to provide a pointer to a table of script codes and names.
I'll come back to this later.
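
To make the "pointer to a table" idea concrete, here is a rough sketch in
Python. It assumes a separate lookup table keyed by the four-letter script
code, so each language entry carries only the code and the name is resolved on
demand. The table contents, the function name describe_tag, and the simplified
splitting rule (a 4-letter subtag is a script, a 2-letter subtag is a region)
are all assumptions made for illustration. It is not a full tag parser.

```python
# Hypothetical script table: one shared place for script codes and names,
# instead of repeating the name in every language entry. Only a few of the
# ISO 15924 codes are shown.
SCRIPT_NAMES = {
    "hans": "Han (Simplified variant)",
    "hant": "Han (Traditional variant)",
    "latn": "Latin",
    "cyrl": "Cyrillic",
    "arab": "Arabic",
}


def describe_tag(tag):
    """Split a simple language tag and resolve its script subtag, if any.

    Assumption: after the primary language subtag, a 4-letter alphabetic
    subtag is a script code and a 2-letter alphabetic subtag is a region.
    Anything else is ignored in this sketch.
    """
    subtags = tag.lower().split("-")
    result = {
        "language": subtags[0],
        "script": None,
        "script_name": None,
        "region": None,
    }
    for sub in subtags[1:]:
        if len(sub) == 4 and sub.isalpha():
            result["script"] = sub
            # Unknown script codes resolve to None rather than raising.
            result["script_name"] = SCRIPT_NAMES.get(sub)
        elif len(sub) == 2 and sub.isalpha():
            result["region"] = sub.upper()
    return result


if __name__ == "__main__":
    for example in ("zh-hant-HK", "zh-hans", "hu-SI"):
        print(example, describe_tag(example))
```

With this arrangement, adding a new script touches only the shared table, and
the per-region entries stay short.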

2) With respect to being "systematic" about adding languages, yes. But we need
criteria.
John Cowan's list was cleverly restricted to listing "official" languages. Most
countries have groups speaking minority languages, or even majority languages
that are not "official". I have been a little more lax about what is included.
But I can't go about listing every language spoken in every region and be done
with this in my lifetime. However, even for "official" languages the number of
speakers can be very low (Hausa had 500 speakers in Burkina Faso, where it is
an official language, unless that was a typo). So if we start adding
unofficial languages, what number or percentage of speakers should qualify a
language to be added to the list for a region?

In any event, the table is supposed to be about determining the language codes
to use.
It is not clear to me when I add these codes whether I should add them with a
regional subtag or not.
Are they speaking the same language or a variation?

3) I don't know the answer for Frisian.

4) Provencal is already listed; it goes by the code "oc" for Occitan. (If I
understand correctly.)
(Language names are a source of confusion. Sometimes what I would consider the
colloquial English term is the official French term, and often the names have
many spellings and variants.)

5) My (now oft-repeated) objective is to drive or derive criteria or a policy
for naming conventions with respect to the regional subtags. If a table of this
nature is of more general value (and I think it is), I am hopeful that some
linguistic body will take it on. I don't have the background or the time to
fully develop it.
SIL, Unicode, or some other organization would be much better at this, and a
table of recommended tags would be an asset to many aspects of web and software
development and content.

tex


Ivan Herman wrote:
> 
> Tex,
> 
> some comments below based on my personal knowledge and background...
> 
> - I am surprised to see hu-HU and hu-SI as the only Hungarian extra tags. The Hungarian
> minority in Slovenia is very small; the three biggest Hungarian minorities are in Romania,
> Serbia and Slovakia, with also a minority in Ukraine and Austria. Even the last two are
> (afaik) larger than the one in Slovenia. So either one has to add them all, or none...
> B.t.w., if we really want to come to niceties, there is also a Slovak and Serbian
> minority in Hungary, so, e.g., sk-HU might be an issue (I certainly heard people speaking
> Serbian around Budapest when I was a kid). Finally, there is also a German minority in
> Hungary, well alive and kicking (as an anecdote: the current German Foreign Minister,
> Joschka Fischer, comes from that community...)
> 
> - AFAIK, Catalan is also an official language of Andorra. If you pick French there, then
> one must be systematic and add Catalan, too...
> 
> - I also think that for a casual reader it helps to say explicitly that zh-hant is
> Chinese in the Traditional script and zh-hans is Chinese in the Simplified one. There is
> no distinction right now. Also, I would add zh-hant to Hong Kong and Macao, too.
> 
> - You list Frisian both for the Netherlands and Germany. I am not sure whether they are
> identical (just as fr-BE and fr-FR are not considered identical in this list...)
> 
> - I learn something every day... is Walloon a genuinely different language? I thought fr-BE
> would cover it... just as nl-BE covers Flemish!
> 
> - Political problem: you list Yugoslavia *and* Serbia/Montenegro. Both refer to the same
> political entity, afaik, and in both cases the Albanian of Kosovo has been forgotten! ;-(
> 
> - There is a strong movement to revive Provencal in France. Whether it deserves a separate
> entry, I am not sure, but you might want to consider it.
> 
> I hope that helps
> 
> Ivan
Received on Monday, 27 December 2004 10:13:28 GMT