
2 many language tags for Norwegian

From: Leif Halvard Silli <lhs@malform.no>
Date: Fri, 25 Apr 2008 06:57:38 +0200
Message-ID: <481164C2.2050608@malform.no>
To: www-international@w3.org

I am not certain that this is the right forum, but at least it is the 
best place I can think of, right now.

I would like to outline some problems that I have experienced over the 
years with the language tags that BCP 47 has defined for Norwegian.

Background: Norwegian (no) has two written standards: Nynorsk (nn) and 
Bokmål (nb). Norway has the region code (NO). 3 language codes and 1 
region code, for one and the same language! (The 'no' code covers both 
written standards.)

As a summary, we can say that

   1. Authors do not understand the language tags for Norwegian
   2. As a result, they look at the tags used for English and "invent"
      similar tags for Norwegian.

The latest example I can offer is the Norwegian government web site, 
www.government.no, which is available in 4 languages. At the time of 
writing, 2 of those language versions use the wrong language codes:

    * NO-NY is used to denote Norwegian Nynorsk, although this tag
      means "The language Norwegian in the region of NY" (wherever "NY"
      is - Ny York?)
    * NO-SE is used to denote the Northern Sami language, although in
      reality NO-SE means "The language Norwegian in the region of Sweden".
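
To illustrate how software reads such tags, here is a minimal Python
sketch. The function name split_language_region is made up for this
illustration, and the parsing is deliberately simplified: it assumes the
first subtag is the language and that a two-letter second subtag is a
region, and it ignores script, variant and extension subtags.

```python
def split_language_region(tag):
    """Split a simple 'language-region' tag into (language, region).

    Simplified illustration only: a two-letter second subtag is taken
    to be a region; every other kind of subtag is ignored.
    """
    subtags = tag.split("-")
    language = subtags[0].lower()
    region = None
    if len(subtags) > 1 and len(subtags[1]) == 2 and subtags[1].isalpha():
        region = subtags[1].upper()
    return language, region


for tag in ("NO-NY", "NO-SE", "en-GB"):
    print(tag, "->", split_language_region(tag))
# NO-NY -> ('no', 'NY')
# NO-SE -> ('no', 'SE')
# en-GB -> ('en', 'GB')
```

Read this way, NO-NY and NO-SE have exactly the same shape as en-GB: a
language followed by a region, whatever the author intended.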

If we forget the NO-SE thing, and look at the NO-NY thing, then we can 
conclude that

   1. NY (for NYnorsk) is used to denote a "variant" of NO (Norwegian).
      This is the same pattern that we see when using en-GB, en-US, de-AT
      and so on. Thus, it is logical! And it is in line with how
      Norwegians perceive their two standards: Both are politically
      equal variants of Norwegian.
   2. NY is also a non-existing code. Why is it used? Well, what does 'nn'
      mean to Norwegians? What does 'nb' mean to Norwegians? The ISO codes
      for Nynorsk and Bokmål are not something we Norwegians can link
      to anything we use in our daily lives. Typical Norwegian
      abbreviations are, for Nynorsk: ny and nyn. For Bokmål: bm and bok.
      For that matter, in school, pupils typically say "main tongue" and
      "side tongue". Depending on the region they live in and so on, it
      is usually clear from context which language each term refers to.

The current scheme, with 3 codes, even if people do not invent their own 
codes, has led to an arms race over the right to use the code "no". 
The outcome of this race often looks like this: "NO" is used for Bokmål. 
"NN" is used for Nynorsk.

Is this how it was meant? Hardly. When a web site/resource exists in both 
tongues, then "nn" should be used for the Nynorsk version and "nb" for 
the Bokmål version. It is also a "rights" issue. Both Nynorsk and Bokmål 
users want their tongue to be treated equally with the other tongue.

Another example: I come to a web site offering pages in Norwegian and 
hence expect to be served the page in Norwegian, of which I, as a 
Norwegian, know there are two possible variants. But because I have set 
my web browser to ask for Norwegian Nynorsk, I get English instead ...
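
A minimal Python sketch of why this can happen, assuming the site tags
its Norwegian pages "no", defaults to "en", and matches in the style of
the RFC 4647 "lookup" scheme (truncating the requested tag from the
end). The function name and the site's tag list are assumptions made for
this illustration, and the truncation is simplified.

```python
def lookup(preferred_ranges, available_tags, default_tag):
    """Return the first available tag matching a preferred range.

    Each range is tried as given, then with trailing subtags removed
    one at a time. If nothing matches, the default is returned.
    """
    available = {tag.lower(): tag for tag in available_tags}
    for language_range in preferred_ranges:
        candidate = language_range.lower()
        while candidate:
            if candidate in available:
                return available[candidate]
            # Truncate: "nn-no" -> "nn" -> "" (which ends the loop).
            candidate = candidate.rpartition("-")[0]
    return default_tag


# The browser asks for Nynorsk; the site offers "no" and "en".
print(lookup(["nn"], ["no", "en"], "en"))     # en
# The same request against a site that tags its Nynorsk pages "nn".
print(lookup(["nn-NO"], ["nn", "en"], "en"))  # nn
```

Nothing in the matching itself connects "nn" to "no", so the request
falls through to the default language.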

The anomaly extends even into, for example, Mac OS X, which links NO and 
NB to Bokmål, while NN lives in no man's land, alone. If you install an 
application with a NO language resource, then it registers as a Bokmål 
application. And if you install an application with a NN language 
resource, the user will not see it, unless he has been very active and 
very conscious about enabling support for Nynorsk. The Mac OS X system 
thinks a Norwegian user would rather have English than Nynorsk. (Windows 
is better here, I think.)  In my recent exchange with Apple I had to point 
them to Government.no to demonstrate that Norwegian users in fact expect 
Bokmål as fallback for Nynorsk and Nynorsk as fallback for Bokmål.
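
The fallback that Norwegian users expect can be sketched in a few lines
of Python. The FALLBACKS table, its ordering, and the function name are
hypothetical, chosen only to illustrate the expectation described above;
they do not describe how any operating system actually behaves.

```python
# Hypothetical fallback table: for each requested tag, the other
# Norwegian tags to try, in order, before giving up on Norwegian.
FALLBACKS = {
    "nn": ["nb", "no"],
    "nb": ["nn", "no"],
    "no": ["nb", "nn"],
}


def pick_localization(requested, available, default="en"):
    """Pick the requested tag, a Norwegian fallback, or the default."""
    for tag in [requested] + FALLBACKS.get(requested, []):
        if tag in available:
            return tag
    return default


print(pick_localization("nn", {"nb", "en"}))  # nb
print(pick_localization("nb", {"nn", "en"}))  # nn
print(pick_localization("nn", {"en"}))        # en
```

English is chosen only when no Norwegian resource of any kind is
available.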

However, the current language codes work *against* the understanding 
that both 'nb' and 'nn' are equal variants of 'no'.

So what could a sensible solution be?

Oh, yes, I am afraid I shall have to fight many fights, before I get 
people to realise this. But what we need is really simple and clear:

BCP 47 needs 2 new regions: a Norwegian Bokmål region and a Norwegian 
Nynorsk region. Both regions cover the whole of Norway. Having those 
regions, one could treat Norwegian the same as English, German etc: You 
denote a variant of the language by adding another subcode.

This could be done simply by saying that NN means "Norway in 
Nynorsk/Nynorsk Norway" and that NB means "Norway in Bokmål/Bokmål 
Norway". As a result we would get these codes, which are already in 
use anyway:

    * no-NN instead of nn and nn-NO
    * no-NB instead of nb and nb-NO

Both NN and NB are free to use as region codes. nb could be made a 
redundant/grandfathered alias for no-NB, and nn could be made the same 
kind of alias for no-NN.

For the record: I think this is technically pure and clear. But to those 
who of course have to disagree: Consider the user over technical purity 
- to quote the design principles of HTML 5.

It is also a fact that Norway to a degree is, politically, split into 
Nynorsk regions and Bokmål regions, which affects the administrative 
"tongue", language in school, the traffic signs, phone catalogues and 
other things. At the same time these regions go straight through the 
country, because any Norwegian has the right to use as "main tongue" the 
one standard he/she prefers.

PS: BCP 47 has two "grandfathered" codes which I actually like quite 
well: no-nyn and no-bok. These codes are much simpler to understand, 
and much more in line with what users/authors would expect - and with 
what I outline here - than the current, special, confusing 3 code 
solution for Norwegian. Unfortunately, their status as "grandfathered" 
means that they are "on the way out".
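
For illustration, a minimal Python sketch of how software might replace
the two grandfathered tags with their current equivalents, assuming the
registry's preferred values are nb for no-bok and nn for no-nyn. The
table and function names are made up for this example.

```python
# Assumed preferred values for the two grandfathered Norwegian tags.
PREFERRED_VALUES = {
    "no-bok": "nb",
    "no-nyn": "nn",
}


def canonicalize(tag):
    """Replace a grandfathered tag with its preferred value, if any."""
    return PREFERRED_VALUES.get(tag.lower(), tag)


print(canonicalize("no-nyn"))  # nn
print(canonicalize("NO-BOK"))  # nb
print(canonicalize("en-GB"))   # en-GB
```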

Best regards,
Leif Halvard Silli
Oslo, Norway
Received on Friday, 25 April 2008 04:59:52 GMT
