
Re: hreflang

From: Laurens Holst <lholst@students.cs.uu.nl>
Date: Sun, 05 Feb 2006 15:40:21 +0100
Message-ID: <43E60E55.8050902@students.cs.uu.nl>
To: W3C HTML List <www-html@w3.org>

Jukka K. Korpela wrote:
>>>>> Accept-Language would be one of the most important features of an 
>>>>> user agent *if* it contained real information. 
>>>>
>>>> It does.
>>>
>>> Pardon? We know that it horrendously often contains _wrong_ 
>>> information. (Or did you mean to write "It is", saying that 
>>> Accept-Language is etc.?
>>> I see no reason to take away the conditionality.)
>>
>> I don’t see how it contains wrong information.
>
> Sorry, but you didn't answer my question.

I meant to say what I said: it does contain real information.

> Anyway, we know that the
> Accept-Language header very often contains wrong information, since it 
> tends to reflect browser defaults, not user choices. The defaults 
> might be completely absurd, like en-US as the only language set, 
> therefore informing that US English is the _only_ language that the 
> user understands at all.

But the browser defaults aren’t absurd, they’re based on information 
that is available without explicitly prompting the user. The information 
may not be complete, and deducing that English is the only language that 
the user understands is not something that you should do. The only thing 
Accept-Language does, imho, is indicate that English is a language that 
is understood and is thus preferred over all the other languages it 
doesn’t list.
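
As a minimal sketch of this reading (Python; the helper name
listed_languages is hypothetical and not taken from any real site), a
server could turn the header into an ordered list of languages the user
is known to understand, and infer nothing about languages it does not
list:

```python
def listed_languages(header):
    """Return language tags from an Accept-Language value, best first.

    Hypothetical helper for illustration. Tags with q=0 are dropped
    because q=0 explicitly marks a language as not acceptable.
    """
    entries = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # unparsable weight: treat as unusable
        if quality > 0:
            entries.append((-quality, position, tag))
    # Sort by descending weight, keeping header order for ties.
    return [tag for _, _, tag in sorted(entries)]
```

An empty result then means only "nothing is known", not "the user
understands nothing".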

>> Ok, Google was a bad example, you’re the fourth telling me that now.
>
> It was your _best_ example in the sense that Google is used more than 
> the other examples combined, and Google _could_ in fact make a lot of 
> use about language preferences: in its own interface (which exists in 
> many languages, though partly in very poor translations), in 
> restricting searches to certain languages, in recommendation automatic 
> translations, etc. A typical "multilingual" site is just a collection 
> of pages in two or a few languages, and content negotiation would just 
> speed things up.

Ok. Well, good to know then that Google does use the Accept-Language 
header after all.

I think Hotmail is a good second, by the way, in terms of importance.

>> I’ve just set my accept-language to Japanese and browsed around a few 
>> sites, and found that www.nero.com also uses the information.
>
> Really? I just visited http://www.nero.com on a browser with some 
> language preferences that do not include English at all, and yet the 
> site presents itself to me in English, with no indication of any kind 
> of an error.

It seems common sense to me to fall back to English if the language 
isn’t supported, and not throw an error. The fact that Accept-Language 
doesn’t list English doesn’t mean that the user doesn’t understand it.

> It's surely better than the usual "Not Accepted" error message, which 
> adds insult to injury, but neither is it the way language negotiation 
> is supposed to work. Adding Japanese to the preferences does not help 
> - unless I make it the _first_ in the list. No, this is _not_ how HTTP 
> language negotiation is supposed to be used!

Obviously if you have English before Japanese in the list, it’s going to 
choose English.

Additionally, there may be errors in Nero’s language detection method 
(e.g. if you put Vietnamese, which they probably do not support, before 
Japanese, and list no English), but that doesn’t discount the fact 
that they *are* using the information, albeit not 100% correctly in 
some less common scenarios.
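
For comparison, here is a sketch of a selection routine that handles
both cases discussed above: an unsupported language listed first, and no
supported language listed at all. It is only an illustration, not how
Nero or any other site actually does it. The function name
choose_language and the fallback parameter are assumptions, and q
weights are ignored for brevity (header order is taken as preference
order).

```python
def choose_language(header, available, fallback="en"):
    """Pick the first listed language the site offers, else the fallback.

    Simplification: q weights are ignored and header order is taken as
    preference order. A listed tag such as 'ja-JP' also matches 'ja'.
    """
    for item in header.split(","):
        tag = item.split(";")[0].strip().lower()
        primary = tag.split("-")[0]
        for candidate in (tag, primary):
            if candidate in available:
                return candidate
    # Nothing listed is offered: serve the fallback instead of an error.
    return fallback
```

With this, "vi, ja" on a site offering English, Japanese and German
yields Japanese, and a header that lists only unsupported languages
yields English rather than an error page.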

>> The thing is, because there are a few important sites among them, 
>> that ensures that the language information isn’t entirely *wrong*.
>
> Configuring your browser to send your language preferences isn't 
> wrong, of course (even if you don't configure your full preferences - 
> but saying that you only know English when you in fact know other 
> languages as well might well be worse than not saying anything).

I disagree, take this practical example:

Assume: I’m French and do not understand English.
1. My browser does not send an Accept-Language header at all. I am 
presented with a page in English, because it is the most common language 
out there, along with some language navigation mechanism that is 
hopefully obvious to someone who doesn’t understand English. E.g. 
mozilla.com lists these languages at the bottom, so you have to scroll 
down to find out that the page is also offered in other languages.
2. My browser is automatically configured to send an Accept-Language: fr 
header, and I am presented with a page in French.

Another example,
Assume: I’m Dutch and do not understand French, but do understand 
English. I am visiting a page that is available in French, English and 
Dutch, with French speakers as its main audience. This is e.g. a likely 
scenario for a Belgian company.
1. My browser does not send an Accept-Language header at all. I am 
presented with a page in French, because French speakers are the main 
audience. 
Hopefully I won’t immediately disregard the page as ‘I can’t read this’ 
but look for a translation link.
2. My browser is automatically configured to either send an 
Accept-Language: en or Accept-Language: nl header, depending on my 
OS/browser, and I am presented with a page in either English or Dutch, 
both of which I understand.

I think you shouldn’t consider Accept-Language as saying anything about 
the languages it doesn’t mention.

> Sending wrong preferences without even asking the user is entirely wrong.

It’s not wrong. The level of detail isn’t very high, if that is what you 
mean.

>> Saying that there are sites which do not use the language negotiation 
>> capabilities doesn’t really matter. But maybe someone ought to inform 
>> them of the possibility.
>
> If they understood the issue, they would probably answer something 
> like the following: Yes, we could make the EU site send different 
> language versions according to Accept-Language. But then millions of 
> people who speak English very poorly, or who speak French or German or 
> Slovak much better than English, would always get the English version.

Why do you assume that the auto-configuration doesn’t work? Why think 
that millions of French or German or Slovak people speak English poorly, 
but do not have their OS or their browser in their native language?

> Yes, we could include links to different versions, but people might 
> not notice them.

Why would they not notice them? It is just a matter of placement. E.g. 
in the current EU site, the language selection is available in a very 
obvious and visible location: see the top banner at 
http://europa.eu.int/index_en.htm. So I don’t see how this case is very 
different from the current situation.

> They'd just consume the English versions if they know any English at all.
> (This isn't quite what _I_ would respond, but it's a reasonable 
> position.)

Or they could put the languages that Accept-Language mentions on top, with 
stronger emphasis, and list the other available languages below. That 
would make selecting a language much more convenient. Just now, it took 
me over 5 seconds to find the Dutch version’s link among all those 
languages, which is quite a bad user-experience if you ask me, and it 
could be a lot better had they used the Accept-Language header.
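
A minimal sketch of that idea, assuming a hypothetical function
order_language_links and a site that knows its own list of language
codes (this is not how the EU site works): the languages the browser
lists go first, in header order, and the rest follow alphabetically.

```python
def order_language_links(header, available):
    """Split site languages into (listed by the browser, all others)."""
    listed = []
    for item in header.split(","):
        # Reduce e.g. 'nl-BE;q=0.8' to its primary language code 'nl'.
        primary = item.split(";")[0].strip().lower().split("-")[0]
        if primary in available and primary not in listed:
            listed.append(primary)
    others = sorted(lang for lang in available if lang not in listed)
    return listed, others
```

The first list would be rendered on top with stronger emphasis, the
second below it, so no language becomes unreachable even when the
header is incomplete.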

>> What you are looking for is a higher level of detail. What are *all* 
>> the languages that the user knows, and to what extent.
>
> _That_ is what HTTP language negotiation is about, at the user end.
> If you didn't see this, you have missed the point. It's not a "higher
> level of detail" but the essence.

In an ideal world, that would be the case. However it isn’t like that, 
for obvious reasons, and it doesn’t mean that the current level of 
detail is unusable. You are looking for details like ‘what language is 
best for the user’, while the current level of detail often only says 
‘this language is understood by the user’.

It can be used right now without problems, and if some browser 
configuration policy decision makes the information it provides more 
detailed, all the better: there will be fewer cases where the user is 
given a site in a language that he understands but that is not the 
language he understands best. Like Dutch people getting the Hotmail UI 
in English.

For example, a Japanese company could give the user a Japanese version 
when the Accept-Language contains ‘ja’, and otherwise give an English 
website, or at least add a big link saying ‘click here for the English 
version of this site’ (note in this example that it’s not all or 
nothing, it could just mean giving the language changing mechanism more 
visibility, and give a meaningful default choice).

>> There are millions visiting and using that service on a daily basis.
>
> Are you saying that a million flies can't be wrong?

Why do you think the use case of Hotmail is insignificant? If a system 
is used on a website that is the no. 1 webmail provider in many 
countries including a lot of non-English speaking countries, and used by 
all those people on a daily basis (like Google), then yes I do think 
that says something meaningful about its reliability.

In fact, the ease with which you disregard this important use case 
strikes me as a bit strange.

Anyways, to put an end to this, I think that the Accept-Language header 
conveys real information, albeit at a lower level of detail than would 
be possible, because there are limits to what the auto-configuration can 
do. I am all for having the browsers present the user with a language 
configuration option in a non-obtrusive manner, in order to improve the 
quality of this information. (E.g. on the initial page after an install, 
which they can either choose to fill in or ignore and accept the 
defaults, or with an ‘advanced configuration’ choice somewhere during 
the installation process.)

Nevertheless, this lower detail of information does not make it useless, 
and the fact that there are convincing use cases such as Hotmail and to 
a lesser extent Google and Nero out there shows that you can use it in a 
reliable manner right now.

As a final example, I created a website for a Dutch home computer fair, 
which is bilingual, in English and Dutch. Everyone who does not have 
‘nl’ in their Accept-Language header is presented with the English 
website, and if the header indicates that Dutch is the preferred 
language then the site is sent in Dutch. This works really well: the 
Dutch people get a Dutch version, and the others get an English one. Of 
course there are Dutch people getting an English version because they 
chose not to use a Dutch OS or browser (due to the technical nature of 
the subject, this share is higher than average), but in those cases 
there is no harm either, because those people obviously understand 
English, and might even prefer it for some oddball reason. What matters 
is that the people who *don’t* understand English at all but do 
understand Dutch get the Dutch version.
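
The rule described above could be sketched as follows. This is not the
actual code of that website: the function name fair_site_language is
made up for illustration, and the description is read here as "any
Dutch entry in the header selects Dutch", ignoring q weights.

```python
def fair_site_language(header):
    """Return 'nl' if Dutch appears anywhere in the header, else 'en'."""
    tags = (item.split(";")[0].strip().lower() for item in header.split(","))
    # Match plain 'nl' as well as regional variants such as 'nl-BE'.
    if any(tag == "nl" or tag.startswith("nl-") for tag in tags):
        return "nl"
    return "en"
```

A missing or empty header falls through to English, which matches the
default for everyone who does not list Dutch.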


~Grauw

-- 
Ushiko-san! Kimi wa doushite, Ushiko-san!!
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Laurens Holst, student, university of Utrecht, the Netherlands.
Website: www.grauw.nl. Backbase employee; www.backbase.com.
Received on Sunday, 5 February 2006 14:42:27 GMT
