- From: Leif Halvard Silli <lhs@malform.no>
- Date: Sat, 26 Apr 2008 02:47:08 +0200
- To: Frank Ellermann <hmdmhdfmhdjmzdtjmzdtzktdkztdjz@gmail.com>
- CC: www-international@w3.org
Frank Ellermann:
> Leif Halvard Silli wrote:
>
> > If I come to a web site which sends out 'en', with a web browser
> > asking for 'en-GB', won't I then receive 'en'? Yes I will.
>
> Hopefully, IIRC Google treated en-GB-oed as unknown (= not "en")
> when searching for "en" results. I can't check it at the moment.
>
I am uncertain what the oed (Oxford English Dictionary spelling) tag is
for.
But to exemplify what I talked about: On my Apache installation, when I
set Firefox to prefer 'en' and insert "AddLanguage en-GB .en-gb" in my
Apache configuration file, and if the only available English version is
index.html.en-gb, then Firefox will get and open that file.
This is the key thing. How can we get this to happen for Bokmål and
Nynorsk?
Is it possible to get the browser to a) send out that it prefers
'nn', while b) at the same time falling back to no or nb if nn
isn't available?
It should be simple. When I select 'de', I get de-AT if nothing
else in German is available. When I select 'en', en-GB gets served
if nothing else is available in English.
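As a rough illustration of why 'en' and 'de' behave this way while 'nn' does not, here is a minimal Python sketch of prefix matching as I understand it from the HTTP/1.1 description of language ranges. The function names `range_matches` and `pick` are made up for this example, and this is not Apache's actual negotiation code.

```python
def range_matches(language_range, tag):
    """Prefix rule: 'en' matches 'en' and 'en-GB', but not 'eng'."""
    language_range = language_range.lower()
    tag = tag.lower()
    if language_range == "*":
        return True
    return tag == language_range or tag.startswith(language_range + "-")


def pick(preferred, available):
    """Return the first available tag matched by the preferred ranges, in order."""
    for language_range in preferred:
        for tag in available:
            if range_matches(language_range, tag):
                return tag
    return None


if __name__ == "__main__":
    print(pick(["en"], ["en-GB"]))     # 'en' is a prefix of 'en-GB'
    print(pick(["de"], ["de-AT"]))     # 'de' is a prefix of 'de-AT'
    print(pick(["nn"], ["no", "nb"]))  # 'nn' is a prefix of neither tag
```

Under this rule 'nn', 'nb' and 'no' are three unrelated tags, so nothing ties Nynorsk to the other two the way 'en' is tied to 'en-GB'.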
> When some taggers abuse "nn" for unknown it's certainly bad for
> real "nn" applications.
I did not encounter such a use of nn, myself.
> FWIW, en-UK also is not what many folks
> would think, it's not that "no" is the only language subtag with
> such issues. Some cases of de / gsw / lb / nds / ... can be also
> "less than obvious" (putting it as mildly as possible, nothing is
> wrong with these subtags).
>
The point I made about 'nb' and 'nn' not being "obvious" to Norwegians
was mostly a side point. In fact, I proposed to reuse 'NB' and 'NN' as
region names - even if more obvious (for Norwegians) names exist.
> But you can set your browser to permit nb, nn, and no. How some
> taggers abuse nn is a matter of education, as for s/UK/GB/g.
>
I can set *my* browser to permit nb, nn and no. But not any browser. Not
on OS X, at least. On OS X, the browsers (Safari/WebKit and those that
interact with the system - Camino/Opera) send out only one language in
the Accept-Language header. Thus, if I want to *prefer* Nynorsk, I must
place it as the first preference in the language list of OS X. Then,
AFAIK, those browsers will ask for Nynorsk, and only Nynorsk. (And unless
I place Nynorsk on top, I cannot have OS X prefer the Nynorsk interfaces
of applications.)
When I set *my* browser to prefer nn,no,nb - in that order - and visit a
web site running Apache, offering Norwegian, then it happens that I get
the page in English. This is not strange, because when I look inside
Apache 1.3 on my Mac, it has two AddLanguage options "ready": 'no'
for Bokmål and 'nn' for Nynorsk.
I can't set 'no' on top in my browser either, because then I will not be
served 'nn' whenever a page exists as both 'nn' and 'no'.
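To make the ordering problem concrete, here is a minimal Python sketch. It assumes a server that walks the Accept-Language list in order of q-value and serves the first variant whose tag matches exactly. The names `parse_accept_language` and `choose`, and the header strings, are made up for illustration; real servers (Apache included) do more than this.

```python
def parse_accept_language(header):
    """Split an Accept-Language value into lowercase tags, highest q first."""
    entries = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        q = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0
        if q > 0:
            # Negative q sorts high values first; position keeps header order on ties.
            entries.append((-q, position, tag.strip().lower()))
    return [tag for _, _, tag in sorted(entries)]


def choose(header, available):
    """Return the first requested tag that is available (exact match only)."""
    for tag in parse_accept_language(header):
        if tag in available:
            return tag
    return None


if __name__ == "__main__":
    site = {"no", "nn"}  # hypothetical site: 'no' holds Bokmål, 'nn' Nynorsk
    print(choose("nn,no;q=0.8,nb;q=0.6", site))  # Nynorsk first
    print(choose("no,nn;q=0.8", site))           # 'no' on top: Bokmål wins
    print(choose("nn", {"no", "en"}))            # only 'nn' requested: no match
```

The last call is the single-language case: when only 'nn' is requested and the site tags its Norwegian as 'no', nothing matches and the server is left to pick its default.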
> > Norway are "two different places" when it comes to this
> > particular issue.
>
> With the new RFC 4646 rules those "places" are called "variants",
> as you have written nn vs. nb are not really geographic regions.
>
It seems to me that the variants are mostly meant for language variants
used in sub-regions.
Nynorsk and Bokmål represent two different approaches to standardising
the Norwegian language. As such, their differences could be somewhat
compared to the fight between the two modern Greek norms. Both cover
all of Greece.
[1] http://www.w3.org/International/articles/language-tags/#issues
> But after a decision that something is a language you are out of
> luck again, a language is no "variant".
The "decision that something is a language" did not need to mean that we
should have all three of nn, nb and no. We could have had only no.
Politically though, in Norway, nb and nn represent forms of the same
language. Linguistically they are perhaps different languages, worthy of
an nb and an nn. Regardless, the focus is on "equal rights" and on upholding
the Norwegian language law. Thus the political dimension is most important.
So what we need is something that works with the political understanding
of what Nynorsk and Bokmål are.
> The grandfathered tags
> no-bok and no-nyn were early RFC 3066 attempts in this direction. [.. snip ...]
>
An attempt in the "variant" direction? Seems more like an early attempt in
the macrolanguage/language extension direction.
Adding more tags would be bad, you said. I wish they had had the wisdom
to say so when they proposed nb and nn, as we already had no-nyn and
no-bok.
[...]
> > what shall we do then?
>
> Educate folks how it works. There are various participants from
> Norway in the relevant ISO committees, and it is simply not okay
> if the government site gets it wrong, this is not rocket science.
>
I have looked into this several times. And the rules confuse even me.
(My first thought, when looking at the government web site, was to advise
them to use 'no-nn' ...)
> > perhaps Norwegian should be considered a "Macro language", and
> > extended-language tags be taken into use to denote each variant?
>
> Another wormhole, inactive at the moment, hopefully it will never
> go live.
From what others say, it sounds as if this is a vain wish.
> What you would get is redundant info, "nb" is shorter
> than "no-nb". Three bytes, not the end of the world even if it
> affects billions of pages loaded billions of times for a some TB
> of unnecessary traffic.
>
I could imagine that it was thoughts like this that caused the "simple"
solution we now have.
> But it can be a disaster if the relation has to be modified later,
> because nobody is going to update the billions of pages. That is
> irrelevant for "no", but for an erroneous language subtag "zh" it
> is a major headache ("zh" roughly means China and is no language),
> so again "not making it worse" is the best solution - if dumping
> the complete scheme as failure is ruled out as option, YMMV.
>
Why would it have to be modified later? 'no' is always right - whether I
write Nynorsk or Bokmål. The hypothetical no-nb would not be more right
or wrong than the current 'nb' already is. So the hypothetical need for
re-tagging would be the same.
> > I would like to propose that 'no-nyn' and 'no-bok' was made the
> > preferred codes.
>
> Not possible under the current rules, you'd have to convince the
> IETF LTRU WG to permit this in a successor of RFC 4646, and then
> go to the languages list to request it.
>
> In theory, in practice it won't fly for obvious reasons: no-nyn
> and no-bok are no ISO 639 language codes, it is better to use
> codes working also outside of RFC 4646 tags, i.e. "nb" and "nn".
>
> If you nevertheless want a fight you could still try to persuade
> ISO 639 that "nb" and "nn" are no languages but "dialects", and
> that would solve the issue (= folks in Norway would kill you ;-)
>
It is the opposite of what you say that is the problem: Bokmål is using
'no', which is also the most compatible tag, while Nynorsk is left out in
the cold as a 'dialect' with 'nn'. (By the way, if we wanted to be as
compatible as possible, then I think we could assign 'nb' to Bokmål and
'no' to Nynorsk, because it seems 'nb' is usually preferred over 'no' if
both are available.)
ISO 639-3 already says that 'no' is a macrolanguage, and that the
concrete manifestations of 'no' are 'nno' and 'nob'. And no one has been
killed because of this. We must also assume that ISO 639-3 and ISO 639-1
do not contradict each other.
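To sketch what a macrolanguage-aware fallback could look like, here is a small Python example. The table `NORWEGIAN` and the function `fallback_chain` are hypothetical, written only for this illustration; the ISO 639-3 codes (nor, nob, nno) are the registered ones, but no server or browser is claimed to behave this way.

```python
# Hypothetical table: ISO 639-1 code -> (ISO 639-3 code, macrolanguage or None).
NORWEGIAN = {
    "no": ("nor", None),  # the macrolanguage itself
    "nb": ("nob", "no"),  # Bokmål
    "nn": ("nno", "no"),  # Nynorsk
}


def fallback_chain(tag):
    """Tags to try in order: the tag itself, its macrolanguage, then its siblings."""
    tag = tag.lower()
    chain = [tag]
    entry = NORWEGIAN.get(tag)
    if entry is None:
        return chain  # unknown to the table: no fallback
    macro = entry[1] or tag
    if macro not in chain:
        chain.append(macro)
    for code, (_, parent) in NORWEGIAN.items():
        if parent == macro and code not in chain:
            chain.append(code)
    return chain


if __name__ == "__main__":
    print(fallback_chain("nn"))
    print(fallback_chain("nb"))
    print(fallback_chain("en"))
```

With such a chain a request for 'nn' would try Nynorsk first, then anything tagged plain 'no', then Bokmål, before giving up on Norwegian altogether.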
What is important is that users get the best user experience, and that
the tags can be used to implement the official language policy for the
Norwegian language. The tags should not promote the "ousting" of the one
or the other language from being considered Norwegian.
In that regard, guess which language tag the Norwegian government website
uses for Bokmål?
Yeah, right. It uses 'no'.
--
leif halvard silli
Received on Saturday, 26 April 2008 00:47:47 UTC