
Re: 2 many language tags for Norwegian

From: Andrew Cunningham <andrewc@vicnet.net.au>
Date: Sat, 26 Apr 2008 14:09:40 +1000 (EST)
Message-ID: <1627.>
To: "Leif Halvard Silli" <lhs@malform.no>
Cc: "John Cowan" <cowan@ccil.org>, www-international@w3.org

As far as I can see, it seems to keep coming back to education and
awareness raising issues.


On Sat, April 26, 2008 1:58 pm, Leif Halvard Silli wrote:
> John Cowan:
>> Leif Halvard Silli scripsit: ...
>> > But else, it would be very valuable to know if pages are in Nynorsk or
>> > Bokmål. (Google appears, in the user interface, to be able to search
>> > only Nynorsk pages or only Bokmål pages. But it doesn't in reality make
>> > any distinction. It could very easily do so though. It is just a
>> > matter of knowing which words and forms mark out Nynorsk vs. Bokmål.)
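
The "words and forms" idea quoted above can be sketched roughly as follows. This is only an illustration, not a description of what Google or anyone else does: the function name and the tiny marker lists are assumptions made up for the example, and the input is assumed to be already known to be Norwegian.

```python
import re

# Hypothetical, deliberately tiny marker lists: a few common words whose
# spelling differs between the two written standards.
NYNORSK_MARKERS = {"ikkje", "eg", "kva", "ein", "frå"}
BOKMAAL_MARKERS = {"ikke", "jeg", "hva", "en", "fra"}


def guess_norwegian_variant(text: str) -> str:
    """Return 'nn', 'nb', or 'no' (undecided) for Norwegian text."""
    # Split into runs of letters; the pattern is Unicode-aware, so "frå" survives.
    words = re.findall(r"[^\W\d_]+", text.lower())
    nn_hits = sum(word in NYNORSK_MARKERS for word in words)
    nb_hits = sum(word in BOKMAAL_MARKERS for word in words)
    if nn_hits > nb_hits:
        return "nn"
    if nb_hits > nn_hits:
        return "nb"
    # No evidence either way: fall back to the macrolanguage tag.
    return "no"


print(guess_norwegian_variant("Eg veit ikkje kva det er"))
print(guess_norwegian_variant("Jeg vet ikke hva det er"))
```

A usable detector would need much longer lists and some weighting by frequency; the sketch only shows the shape of the approach.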
>> I don't think it reveals any material nonpublic facts to say that:
>>
>> [...] 3) Only certain existing language tags are useful in this process
>> (for example, "en" is worth nothing,
> 'not worth nothing', I guess you [...]
>> because a huge fraction of non-English
>> content is mechanically tagged "en" by broken HTML composers, HTTP
>> servers, etc.);
>> I don't know what criteria Google uses to decide which languages are
>> cost-effective to detect.
> One important criterion is certainly AdWords. If Google had offered
> AdWords in Nynorsk, then a) it would have been good for Nynorsk. b)
> [...] would have tagged pages as Nynorsk.
>> > So when the W3C i18n article [1] said "[...] an [...]
>> > subtag. This new subtag will go immediately after the language subtag
>> > and before any script tag", then this was not accurate information -
>> > at least not accurate as of today?
>> It was for many years the plan, but compelling arguments induced the
> Such [...]
>> WG to abandon the plan and treat all languages as syntactically equal:
>> each language and macrolanguage is represented directly by a 2-letter or
>> 3-letter language subtag, and extended-language subtags will not be
>> used.
>> However, if there is a 2-letter subtag for a language or macrolanguage,
>> it will be used in preference to the 3-letter form.  So 'nno', 'nbo',
>> and 'nor' will never be valid BCP 47 language subtags.
> Gotcha.
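
The "2-letter form wins" rule quoted above could be sketched as a simple lookup. The function name and the three-entry table are assumptions for illustration. The table uses 'nob', the ISO 639-2 code for Bokmål, where the thread writes 'nbo'. The real source of truth is the IANA Language Subtag Registry, not a hand-written dictionary.

```python
# Three-letter ISO 639-2 codes that have a two-letter ISO 639-1 equivalent.
# Only the Norwegian entries from this thread are listed (illustrative).
TWO_LETTER_EQUIVALENT = {
    "nor": "no",  # Norwegian (macrolanguage)
    "nob": "nb",  # Norwegian Bokmål
    "nno": "nn",  # Norwegian Nynorsk
}


def preferred_language_subtag(code: str) -> str:
    """Return the two-letter subtag when one exists, else the code itself."""
    code = code.lower()
    return TWO_LETTER_EQUIVALENT.get(code, code)


print(preferred_language_subtag("nno"))
print(preferred_language_subtag("nor"))
```

Codes with no two-letter equivalent pass through unchanged, which matches the description that a language is represented directly by its 3-letter subtag when no 2-letter one exists.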
>> > Having read this, I first thought there is no benefit for my cause in
>> > the new extended-language subtags. But then, having thought about it, I
>> > realised that by using 'nbo', then I say that I use a sublanguage of the
>> > macrolanguage 'no'/'nor'. And ditto if I use 'nno'.
>> And you say the same thing (only conformantly) if you use 'nb' and 'nn'.
>> > As a consequence, when using e.g. 'nno', then a web browser asking for
>> > 'no', should get 'nno' if 'no' is unavailable. This is the exact
>> > behaviour I am after. Likewise, by telling my browser to look for 'nor',
>> > it should give me both nno and nbo - and perhaps ask me to choose, if
>> > both are available.
>> Changing to different (and invalid)
> What do you mean by 'invalid'? Not 'no-nyn' and 'no-bok', I suppose? (I
> have not advocated use of tags not part of BCP
>> tags doesn't change the story.
>> If you want nn and nb in that order of preference, set your browser
>> to ask for nn, no, and nb in that order.
> Somewhere the relationship between nn, no, nb must be better specified.
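
The "ask for nn, no, and nb in that order" advice quoted above can be sketched as follows. Both function names are made up for the example. The matching is plain exact-tag comparison in preference order, a simplification and not the full RFC 4647 lookup algorithm. Nothing in it knows that nn and nb are related to no, which is the gap the last quoted remark points at: the ordering in the user's list carries all the information.

```python
def accept_language(preferences):
    """Build an Accept-Language value with descending q-values."""
    parts = []
    for index, tag in enumerate(preferences):
        if index == 0:
            parts.append(tag)  # q defaults to 1 when omitted
        else:
            # Step down by 0.1 per entry, never below 0.1 (assumes a short list).
            quality = max(1.0 - 0.1 * index, 0.1)
            parts.append(f"{tag};q={quality:.1f}")
    return ", ".join(parts)


def pick_language(preferences, available):
    """Return the first preferred tag the server can supply, else None."""
    for tag in preferences:
        if tag in available:
            return tag
    return None


print(accept_language(["nn", "no", "nb"]))
print(pick_language(["nn", "no", "nb"], {"nb", "en"}))
```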
>> > And in Quebec, Canada, then French would be the fallback for English, I
>> > suppose.
>> It all depends. Anyhow, I was trying to use examples that aren't
>> politically [...]
> So did I. I thought I offered an uncontroversial example. The government
> of Quebec uses French, I believe. And thus it uses French as
> administration language in that state. So far, no controversy, right?
> If a citizen reads the English version of government documents and there
> suddenly isn't an English version of the next document, then I [...]
> that citizen would be glad to be offered the French version.
> Whether he will - or is able - to read them, is another issue. Which
> doesn't affect the status of French as fallback in Quebec.
> Though, still being in Quebec, reading info from the central government, as
> a French speaker, you would of course be happy to receive English if a
> certain document was unavailable in French. (But I guess this is
> controversial as well?)
> But of course, it all depends. One can always configure the browser or
> act actively against the current status. And it often seems very
> "controversial" if a majority of persons in some context suddenly is a
> minority.
>> > The good news for, for example, Arbereshe Albanian is that it is very
>> > simple to configure a fallback mechanism there. I suppose there isn't
>> > even a need to tell that you want *Arbereshe* Albanian. Following the
>> > rule of thumb to make the language tag as short as possible, it should
>> > be enough to set the browsers/server to accept/send out Italian and
>> > [...]
>> Well, no.  The idea in that case is that if you know Arbereshe Albanian,
>> you probably can't understand standard Albanian at all, or only very
>> poorly. However, you are almost certainly fully bilingual in Italian.
> Well, yes, I'd say, again. Where will you find web sites which offer
> Albanian and Italian in parallel? OK, I forgot the obvious: Google, etc.
> True, it would not work for those sites. So, yes, there you are right.
> Though it depends. I know persons who prefer Swedish over Bokmål.
> leif halvard silli

Andrew Cunningham
Research and Development
State Library of Victoria

Received on Saturday, 26 April 2008 04:10:18 UTC
