
Re: hreflang

From: Laurens Holst <lholst@students.cs.uu.nl>
Date: Sat, 04 Feb 2006 23:39:37 +0100
Message-ID: <43E52D29.1000407@students.cs.uu.nl>
To: www-html@w3.org

Jukka K. Korpela wrote:
>>> Accept-Language would be one of the most important features of an 
>>> user agent *if* it contained real information. 
>> It does.
> Pardon? We know that it horrendously often contains _wrong_ 
> information. (Or did you mean to write "It is", saying that 
> Accept-Language is etc.?
> I see no reason to take away the conditionality.)

I don’t see how it contains wrong information.

>> It is frequently used by websites such as hotmail, Google, 
>> mozilla-europe.org and other large sites to present the user with a 
>> localised version of their website.
> Google, for one, is a _bad_ example, since it applies a confused 
> mixture of heuristics in deciding on the customization. Many people 
> claim that they cannot get to google.com but get thrown to google.de, 
> google.fi, etc., according to some guesswork.

OK, Google was a bad example; you’re the fourth person to tell me that
now. I’m just writing an email here; I can’t guess exactly how Google
works internally, nor can I do extensive research.

> Anyway, the sites that utilize language negotiation are a small 
> minority, though they contain a few important sites.

Hotmail, yes, not quite an insignificant site. I’ve just set my 
accept-language to Japanese and browsed around a few sites, and found 
that www.nero.com also uses the information. Actually there probably 
aren’t that many sites I know about which would have a Japanese 
localisation, so maybe it would be better if I e.g. had set it to German 
(just read that Lachlan tried that, and found that the information is 
used by Google after all, albeit not as the only mechanism).

The thing is, the fact that there are a few important sites among them
ensures that the language information isn’t entirely *wrong*. E.g. a
Dutch user who is presented with Hotmail in Turkish will be quick to find
out how to change his language settings. Similarly, a Dutch person who
doesn’t understand English well, and so prefers Dutch, and who knows that
a friend uses Hotmail in Dutch will also start looking.

If Accept-Language were an obscure HTTP feature, then yes, the 
information it conveys would be totally unreliable. But it’s not, it’s 
being used, and because of that (in addition to the quite good 
auto-configuration) the information stays reasonably reliable.

Finally, one observation: if you look at the number of sites as a whole, 
there aren’t that many multilingual sites anyway. And those that are
multilingual are very often not really language-dependent but
location-dependent! E.g. eBay uses a .nl domain because eBay.nl is
operating on the Dutch market. Similarly, the sites of hardware 
manufacturers which are often presented in different languages really 
are about location, because their products and terms are different for 
different markets, not languages.

> Consider, for example, the fact that there is nothing resembling 
> language negotiation on the European Union site, despite the obvious needs

Saying that there are sites which do not use the language negotiation 
capabilities doesn’t really matter. But maybe someone ought to inform 
them of the possibility. On the other hand, the fact that there *are* a
number of very popular sites which do use the information says a lot
about its reliability! If it weren’t reliable enough,
it would be in their own interest to use a different method, e.g. a
GeoIP database like Google apparently does. But it seems it is robust
enough to be used in business-critical applications.

> - it's often not just a matter of serving content in the user's native 
> language but also choosing between available versions according to 
> users' _other_ preferences (e.g., between French, English, and German, 
> if these are the only options).
> However, as Mikko mentioned, the real problem is that user agents do 
> not send adequate information. The _only_ way to achieve that is to 
> make user agents prompt for language information, i.e. to ask users to 
> specify their language preferences, in an easy way, _and_ to make the 
> preferences easily changeable (which is important especially on public 
> and classroom computers when user ids are not used).

No-one said that making the preferences easily changeable wasn’t 
important. Aside from the fact that the option isn’t exactly hard to 
change in the browser itself, I’ve seen it mentioned in several 
documents and accessibility guidelines that it is recommended to offer 
the ability to change the language on the website that employs it as 
well. I may even have read it in some standard, but if so I do not 
recall which. It is also a practical requirement, which you will 
naturally come to think of when implementing a site using 
accept-language, and if you don’t offer such a thing there are likely 
users who will complain.

At the least, it’s a very good way to make a first educated guess when
you have a multilingual site, instead of just serving it to the user in
English and offering a language-selection dropdown (which is totally
meaningless if the user doesn’t understand English). Alternatively, you
can list all languages on your front page, as Wikipedia can afford to do.
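
To make the ‘first educated guess’ concrete, here is a minimal sketch in
Python of how a site could pick a language. It is only an illustration,
not how Hotmail or any other site actually does it: the function names,
the explicit_choice parameter (standing in for a language the user
selected on the site itself, e.g. stored in a cookie) and the fallback to
English are all my own assumptions. Only the header syntax (language tags
with optional q weights) comes from HTTP.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language value, best first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        piece = part.strip()
        if not piece:
            continue
        tag, _, params = piece.partition(";")
        tag = tag.strip().lower()
        params = params.strip()
        quality = 1.0  # a tag without a q parameter has weight 1
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat a malformed weight as "not acceptable"
        if tag and quality > 0:
            # Sort by descending weight, then by position in the header.
            entries.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(entries)]


def choose_language(header, available, explicit_choice=None, default="en"):
    """Pick a site language (hypothetical helper, for illustration only)."""
    # A language the user picked on the site itself always wins.
    if explicit_choice in available:
        return explicit_choice
    for tag in parse_accept_language(header or ""):
        if tag == "*":
            continue  # wildcard: no specific preference to act on
        # Try the full tag first ("en-us"), then its primary part ("en").
        for candidate in (tag, tag.split("-")[0]):
            if candidate in available:
                return candidate
    return default


print(choose_language("nl,en;q=0.5", ["en", "de", "nl"]))     # nl
print(choose_language("ja,en-US;q=0.8", ["en", "de"]))        # en
print(choose_language("tr", ["en", "nl"], explicit_choice="nl"))  # nl
```

The explicit choice is checked first on purpose: the header is only the
initial guess, and whatever the user selects on the site overrides it.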

>> Its default value is depending on the browsers locale,
> That's a big part of the problem.

I really don’t see how. In fact, it is one of the reasons why the 
information is quite reliable. If I have an English Firefox installed, 
it will tell the server ‘well, my user understands English’. If it’s 
Dutch, it will say ‘my user understands Dutch’. Based on that the server 
can make an informed decision.

Whether the user is natively Dutch or not isn’t really important in most 
scenarios. He chose English to be the preferred language for his browser 
instead of Dutch, so it is really likely that he also prefers English in 
general. And if that’s not the case, he can change it.

>> which is pretty accurate as when the browsers locale doesnt match the 
>> users, it will be difficult for him to use the browser.
> That's not the point. Many people use English-language browsers and 
> systems for several reasons - for example, because a browser of their 
> choice exists only in an English version, or because its localization
> is awful (wrong translations, etc.).

So? If they don’t like being presented with English localisations of 
certain international websites, they can change their preferences in 
that regard. But obviously if they prefer the English localisation over 
the Dutch one (which, in the case of Firefox, happens to be really 
really awesomely great), they won’t have any problems with reading 
websites in English.

> Besides, even if the language of your browser happens to be your 
> native language, what about all the _other_ languages you might know? 
> In the WWW context, even languages you know just a little are 
> important in the preferences.

What you are looking for is a higher level of detail: what are *all* the
languages that the user knows, and to what extent? Although interesting,
this is more difficult to configure ‘automatically’ and thus less likely
to happen. But it’s not as if it’s hard for the user to specify; the
setting is easily found in Internet Explorer, Firefox and Opera alike.
If even more sites used it, I’m sure the accuracy would also increase.
But there aren’t that many multilingual sites around.
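
For what it’s worth, the header can already carry that level of detail:
each language may have a q weight between 0 and 1. Below is a small
Python sketch of how such a value is put together. The function name and
the example weights are made up for illustration; browsers build this
value from their own language settings.

```python
def build_accept_language(preferences):
    """Build an Accept-Language value from (tag, quality) pairs."""
    parts = []
    # sorted() is stable, so equally weighted languages keep their order.
    for tag, quality in sorted(preferences, key=lambda pair: -pair[1]):
        if not 0.001 <= quality <= 1:
            raise ValueError(f"quality for {tag!r} must be in 0.001..1")
        if quality == 1:
            parts.append(tag)  # weight 1 is the default and can be omitted
        else:
            # HTTP allows at most three decimals; drop trailing zeros.
            weight = f"{quality:.3f}".rstrip("0").rstrip(".")
            parts.append(f"{tag};q={weight}")
    return ",".join(parts)


# A native Dutch speaker with good English and a little German:
print(build_accept_language([("nl", 1.0), ("en", 0.8), ("de", 0.3)]))
# nl,en;q=0.8,de;q=0.3
```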

Anyway, wanting a higher level of detail is fine and all, but it
doesn’t mean the current mechanism is flawed.

If you say that, in absolute numbers, there are still a lot of Dutch
people using Hotmail with the English locale while they would prefer
Dutch, I believe that is true. In relative terms, however, I think that
number is fairly low, and I don’t think those people have problems using
Hotmail, as users will not have an OS or browser installed in a language
that they do not understand.

I say that it is perfectly possible to determine a site’s locale based 
on this header that is sent by user agents, and that if the default 
setting was wrong initially, there is a big chance that the user has 
already changed it for the better, and if not you can tell him how to do so.

Manuel Strehl wrote:
> I think it's kind of a vicious circle. Users see no reason for using 
> the Accept-Language capabilities of their browsers (well, actually I 
> think more than 80% don't even know, that they can interact with 
> websites this way...) as long as web designers don't make use out of 
> this header, while the web designers don't feel like having to develop 
> content negotiation if there's noone out there to appreciate it. 

But they do! Hotmail, the no. 1 free web-based mail provider, does. There
are millions visiting and using that service on a daily basis.


Ushiko-san! Kimi wa doushite, Ushiko-san!!
Laurens Holst, student, university of Utrecht, the Netherlands.
Website: www.grauw.nl. Backbase employee; www.backbase.com.
Received on Saturday, 4 February 2006 22:41:45 UTC
