- From: Leif Halvard Silli <lhs@malform.no>
- Date: Fri, 25 Apr 2008 06:57:38 +0200
- To: www-international@w3.org
I am not certain that this is the right forum, but it is the best place I can think of right now. I would like to outline some problems that I have experienced over the years with the language tags that BCP 47 defines for Norwegian.

Background: Norwegian (no) has two written standards: Nynorsk (nn) and Bokmål (nb). Norway has the region code NO. That makes 3 language codes and 1 region code for one and the same language! (The 'no' code covers both written standards.)

As a summary, we can say that

1. Authors do not understand the language tags for Norwegian.
2. As a result, they look at the tags used for English and "invent" similar tags for Norwegian.

The latest example I can offer is the Norwegian government web site, www.government.no, which is available in 4 languages. At the time of writing, 2 of those language versions use wrong language codes:

* NO-NY is used to denote Norwegian Nynorsk, although these codes mean "The language Norwegian in the region of NY" (wherever "NY" is - Ny York?)
* NO-SE is used to denote the Northern Sami language, although in reality NO-SE means "The language Norwegian in the region of Sweden".

If we forget the NO-SE thing and look at the NO-NY thing, then we can conclude that

1. NY (for NYnorsk) is used to denote a "variant" of NO (Norwegian). This is the same pattern that we see in en-GB, en-US, de-AT and so on. Thus, it is logical! And it is in line with how Norwegians perceive their two standards: both are politically equal variants of Norwegian.
2. NY is also a non-existing code. Why is it used? Well, what does 'nn' mean to Norwegians? What does 'nb' mean to Norwegians? The ISO codes for Nynorsk and Bokmål are not something we Norwegians can link to anything we use in our daily lives. Typical Norwegian abbreviations are, for Nynorsk: ny and nyn. For Bokmål: bm and bok. For that matter, in school pupils typically say "main tongue" and "side tongue".
Depending on the region they live in and so on, it is usually clear from context which standard each term refers to.

The current scheme, with 3 codes, has led to an arms race over the right to use the code "no", even where people do not invent their own codes. The outcome of this race often looks like this: "NO" is used for Bokmål, and "NN" is used for Nynorsk. Is this how it was meant? Hardly. When a web site or resource exists in both tongues, "nn" should be used for the Nynorsk version and "nb" for the Bokmål version. It is also a "rights" issue: both Nynorsk and Bokmål users want their tongue to be treated equally to the other tongue.

Another example: I come to a web site offering pages in Norwegian and hence expect to be served the page in Norwegian, of which I as a Norwegian know there are two possible variants. But because I have set my web browser to ask for Norwegian Nynorsk, I get English instead ...

The anomaly extends even into, for example, Mac OS X, which links NO and NB to Bokmål, while NN lives in no man's land, alone. If you install an application with a NO language resource, it registers as a Bokmål application. And if you install an application with an NN language resource, the user will not see it unless he has been very active and very conscious about enabling support for Nynorsk. Mac OS X thinks a Norwegian user would rather have English than Nynorsk. (Windows is better here, I think.) In my recent exchange with Apple I had to point them to Government.no to demonstrate that Norwegian users in fact expect Bokmål as fallback for Nynorsk and Nynorsk as fallback for Bokmål. However, the current language codes work *against* the understanding that both 'nb' and 'nn' are equal variants of 'no'.

So what could a sensible solution be? Oh, yes, I am afraid I shall have to fight many fights before I get people to realise this. But what we need is really simple and clear: BCP 47 needs 2 new regions: a Norwegian Bokmål region and a Norwegian Nynorsk region.
Both regions cover the whole of Norway. Having those regions, one could treat Norwegian the same as English, German etc.: you denote a variant of the language by adding another subcode. This could be done simply by saying that NN means "Norway in Nynorsk/Nynorsk Norway" and that NB means "Norway in Bokmål/Bokmål Norway". As a result we would get these codes, which are already in use anyway:

* no-NN instead of nn and nn-NO
* no-NB instead of nb and nb-NO

Both NN and NB are free to use as region names. nb could be made a redundant/grandfathered alias for no-NB, and nn could be made the same kind of alias for no-NN.

For the record: I think this is technically pure and clear. But to those who of course have to disagree: consider the user over technical purity - to quote the design principles of HTML 5.

It is also a fact that Norway is, to a degree, politically split into Nynorsk regions and Bokmål regions, which affects the administrative "tongue", the language in school, the traffic signs, phone catalogues and other things. At the same time these regions run straight through the country, because any Norwegian has the right to use as "main tongue" whichever standard he/she prefers.

PS: BCP 47 has two "grandfathered" codes which I actually like quite well: no-nyn and no-bok. These codes are much simpler to understand, and much more in line with what users/authors would expect - and with what I outline here - than the current, special, confusing 3-code solution for Norwegian. Unfortunately, their status as "grandfathered" means that they are "on the way out".

Best regards,
Leif Halvard Silli
Oslo, Norway
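
Illustration: the fallback that Norwegian users expect (Bokmål for Nynorsk and Nynorsk for Bokmål, rather than English) can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions: the table NORWEGIAN_FALLBACKS and the functions primary_language and pick_language are hypothetical names made up for this example, not part of BCP 47 or of any browser or operating system. The mapping of the grandfathered tags no-nyn and no-bok to nn and nb follows their preferred values in the IANA language subtag registry.

```python
# Hypothetical fallback order for the three Norwegian language codes.
# This table is an assumption for illustration, not defined by BCP 47.
NORWEGIAN_FALLBACKS = {
    "nn": ["nn", "no", "nb"],
    "nb": ["nb", "no", "nn"],
    "no": ["no", "nb", "nn"],
}

# Grandfathered tags and the language codes they stand for.
GRANDFATHERED = {"no-nyn": "nn", "no-bok": "nb"}


def primary_language(tag):
    """Return the lowercased language code of a tag, e.g. 'nn-NO' -> 'nn'."""
    normalized = tag.strip().lower().replace("_", "-")
    if normalized in GRANDFATHERED:
        return GRANDFATHERED[normalized]
    return normalized.split("-")[0]


def pick_language(requested, available, default="en"):
    """Pick an available tag for an ordered list of requested tags.

    Every Norwegian code falls back to the other Norwegian codes
    before the next requested language or the default is considered.
    """
    by_primary = {}
    for tag in available:
        # Keep the first available tag seen for each language code.
        by_primary.setdefault(primary_language(tag), tag)

    for tag in requested:
        primary = primary_language(tag)
        for candidate in NORWEGIAN_FALLBACKS.get(primary, [primary]):
            if candidate in by_primary:
                return by_primary[candidate]
    return default
```

With this sketch, a browser asking for Nynorsk on a site that only offers Bokmål and English, pick_language(["nn", "en"], ["nb", "en"]), is served the Bokmål page instead of the English one.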
Received on Friday, 25 April 2008 04:59:52 UTC