- From: Andrew Cunningham <andrewc@vicnet.net.au>
- Date: Sat, 26 Apr 2008 14:09:40 +1000 (EST)
- To: "Leif Halvard Silli" <lhs@malform.no>
- Cc: "John Cowan" <cowan@ccil.org>, www-international@w3.org
- Message-ID: <1627.116.240.174.1.1209182980.squirrel@newmail.vicnet.net.au>
As far as I can see, it seems to keep coming back to education and
awareness raising issues.

Andrew

On Sat, April 26, 2008 1:58 pm, Leif Halvard Silli wrote:
>
> John Cowan:
>> Leif Halvard Silli scripsit: ...
>> > But else, it would be very valuable to know if pages are in nynorsk or
>> > bokmål. (Google appears, in the user interface, to be able to search
>> > only Nynorsk pages or only Bokmål pages. But it doesn't in reality make
>> > any distinction. It could very easily do so though. It is just a matter
>> > of knowing which words and forms mark out Nynorsk vs. Bokmål.)
>>
>> I don't think it reveals any material nonpublic facts to say that:
>>
>
>> [...] 3) Only certain existing language tags are useful in this process
>> (for example, "en" is worth nothing,
>
> 'not worth nothing', I guess you meant.
>
>> because a huge fraction of non-English
>> content is mechanically tagged "en" by broken HTML composers, HTTP
>> servers, etc.);
>>
>> I don't know what criteria Google uses to decide which languages are
>> cost-effective to detect.
>>
>
> One important criterion is certainly AdWords. If Google had offered
> AdWords in Nynorsk, then a) it would have been good for Nynorsk. b) They
> would have tagged pages as Nynorsk.
>
>> > So when the W3C i18n article [1] said "[...] an extended-language
>> > subtag. This new subtag will go immediately after the language subtag
>> > and before any script tag", then this was not accurate information - or,
>> > at least not accurate as of today?
>>
>> It was for many years the plan, but compelling arguments induced the LTRU
>>
>
> Such as?
>
>> WG to abandon the plan and treat all languages as syntactically equal:
>> each language and macrolanguage is represented directly by a 2-letter or
>> 3-letter language subtag, and extended-language subtags will not be used.
>>
>> However, if there is a 2-letter subtag for a language or macrolanguage,
>> it will be used in preference to the 3-letter form. So 'nno', 'nob',
>> and 'nor' will never be valid BCP 47 language subtags.
>>
>
> Gotcha.
>
>> > Having read this, I first thought there is no benefit for my cause in
>> > the new extended-language subtags. But then, having thought about it, I
>> > realised that by using 'nob', I say that I use a sublanguage of the
>> > macrolanguage 'no'/'nor'. And ditto if I use 'nno'.
>>
>> And you say the same thing (only conformantly) if you use 'nb' and 'nn'.
>>
>> > As a consequence, when using e.g. 'nno', a web browser asking for
>> > 'no' should get 'nno' if 'no' is unavailable. This is the exact
>> > behaviour I am after. Likewise, by telling my browser to look for 'nor',
>> > it should give me both nno and nob - and perhaps ask me to choose, if
>> > both are available.
>>
>> Changing to different (and invalid)
>
> What do you mean by 'invalid'? Not 'no-nyn' and 'no-bok', I suppose? (I
> have not advocated use of tags not part of BCP 47.)
>
>> tags doesn't change the story.
>> If you want nn and nb in that order of preference, set your browser
>> to ask for nn, no, and nb in that order.
>>
>
> Somewhere the relationship between nn, no, nb must be better specified.
>
>> > And in Quebec, Canada, then French would be the fallback for English, I
>> > suppose.
>>
>> It all depends. Anyhow, I was trying to use examples that aren't
>> politically controversial.
>>
>
> So was I. I thought I offered an uncontroversial example. The government
> of Quebec uses French, I believe. And thus it uses French as the
> language of administration in that state. So far, no controversy, right?
>
> If a citizen reads the English versions of government documents and there
> suddenly isn't an English version of the next document, then I think
> that citizen would be glad to be offered the French version instead.
>
> Whether he will read them, or is able to, is another issue. Which
> doesn't affect the status of French as fallback in Quebec.
>
> Though, still being in Quebec, reading info from the central government,
> as a French speaker, you would of course be happy to receive English if a
> certain document was unavailable in French. (But I guess this is
> controversial as well?)
>
> But of course, it all depends. One can always configure the browser or
> act actively against the current status. And it often seems very
> "controversial" if a majority of persons in some context suddenly is the
> minority.
>
>> > The good news for, for example, Arbereshe Albanian is that it is very
>> > simple to configure a fallback mechanism there. I suppose there isn't
>> > even a need to tell that you want *Arbereshe* Albanian. Following the
>> > rule of thumb to make the language tag as short as possible, it should
>> > be enough to set the browsers/server to accept/send out Italian and
>> > Albanian.
>>
>> Well, no. The idea in that case is that if you know Arbereshe Albanian,
>> you probably can't understand standard Albanian at all, or only very
>> poorly. However, you are almost certainly fully bilingual in Italian.
>>
>
> Well, yes, I'd say, again. Where will you find web sites which offer
> Albanian and Italian in parallel? OK, I forgot the obvious: Google, etc.
> True, it would not work for those sites. So, yes, there you are right.
> Though it depends. I know persons who prefer Swedish over Bokmål.
> --
> leif halvard silli
>
>

--
Andrew Cunningham
Research and Development Coordinator
Vicnet
State Library of Victoria
Australia

andrewc@vicnet.net.au
Received on Saturday, 26 April 2008 04:10:18 UTC
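
The advice quoted in the thread (set the browser to ask for nn, no, and nb in that order) amounts to a first-match lookup over an ordered preference list. What follows is a minimal sketch of that idea only, not a description of how any particular browser or server negotiates languages. The function name `pick_language` and the sample lists of available languages are hypothetical, and exact, case-insensitive matching is a simplifying assumption.

```python
def pick_language(preferences, available):
    """Return the first preferred language tag that is on offer, else None.

    Matching is exact and case-insensitive. No implicit relationship
    between 'nn', 'no' and 'nb' is assumed, which is why the reader
    has to list all three explicitly.
    """
    # Map lower-cased tags back to the spelling the server uses.
    offered = {tag.lower(): tag for tag in available}
    for tag in preferences:
        match = offered.get(tag.lower())
        if match is not None:
            return match
    return None


# Hypothetical reader: Nynorsk first, then generic Norwegian, then Bokmål.
prefs = ["nn", "no", "nb"]

print(pick_language(prefs, ["nb", "en"]))   # nb
print(pick_language(prefs, ["no", "nb"]))   # no
print(pick_language(prefs, ["en", "fr"]))   # None
```

With this approach the fallback order lives entirely in the reader's preference list, so a page tagged only "nb" is still found even though nothing tells the matcher that "nb" and "nn" are related.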