Another opinion (was Re: Defining the language of a document)

If you do choose to use content negotiation, I'd recommend serving a
document in English (with status 200) when no language match is found,
rather than giving a 406 error message.

Otherwise, some users who can potentially read the web site will not be able
to, as explained below.

Some rather popular browsers (e.g. Microsoft Internet Explorer)
set the Accept-Language header without asking the user,
using only the language of the user interface.

In addition, a subset of these popular browsers (again Microsoft Internet
Explorer) use so-called "friendly HTTP error messages". These are
browser-generated, uninformative error messages that replace the server's
messages. It should be noted that these two habits of popular browsers
contradict the HTTP specification.

Example result:
1. The Italian web site has versions in Italian, English and Spanish, and
   uses content negotiation.
2. An Israeli user who can, in addition to Hebrew, read both English and
   Spanish tries to access the web site.
3. The Israeli user is using a Hebrew-localized Windows with Internet
   Explorer.
4. The user gets a "friendly" error message saying "Page cannot be
   displayed" without any explanation whatsoever. The reason is that his
   Accept-Language header was set to "he" (rather than the correct
   "he, en, es"), and that the browser replaced a possibly informative
   server error message (406), giving the available choices, with its own
   "friendly" one.
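
To make the recommendation concrete, here is a minimal sketch in Python of
negotiation with an English fallback instead of a 406. It is not how Apache
or any particular server implements it; the function names
(parse_accept_language, choose_language) and the default of "en" are my own
assumptions for illustration, and matching is simplified to exact tags and
primary subtags.

```python
def parse_accept_language(header):
    """Return a list of (language_tag, q) pairs, highest preference first."""
    prefs = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        tag, _, params = part.partition(";")
        q = 1.0  # a tag without an explicit q-value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q-value as "not acceptable"
        prefs.append((tag.strip().lower(), q))
    # sorted() is stable, so tags with equal q keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def choose_language(header, available, default="en"):
    """Pick a language to serve; fall back to `default` rather than fail."""
    for tag, q in parse_accept_language(header):
        if q <= 0:
            continue
        primary = tag.split("-")[0]  # e.g. "it-IT" -> "it"
        for candidate in (tag, primary):
            if candidate in available:
                return candidate
    # No match: serve the default version (with status 200), not a 406.
    return default


site = ["it", "en", "es"]
print(choose_language("he", site))
print(choose_language("he, es;q=0.8, en;q=0.5", site))
```

With the header "he" from the example above, the sketch falls back to the
English version; with a header that lists Spanish ahead of English, it
picks Spanish.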

Regards,
Nir Dagan

----- Original Message -----
From: "Karl Ove Hufthammer" <huftis@bigfoot.com>
To: <www-international@w3.org>
Cc: <g.bartol@comune.prato.it>
Sent: Thursday, December 27, 2001 4:39 PM
Subject: Re: Defining the language of a document


> 2001-12-27 14:04:38, Gabriele Bartolini
> <g.bartol@comune.prato.it>:
>
> > 1 - Using content negotiation
>
> Please do.
>
> > I could use the content negotiation, given by Apache and
> > *trust* the HTTP_ACCEPT_LANGUAGE directive sent by the user
> > agent.
>
> See <URL: http://ppewww.ph.gla.ac.uk/~flavell/www/lang-neg.html >
> and <URL: http://www.cs.tut.fi/~jkorpela/multi/ >.
>
> > By doing this, I should organize the site as Yergeau
> > and Durst propose in their article about multilingual Web,
> > by subjects and topics rather than language, naming files
> > by putting the ISO language code between file name and
> > extension.
>
> Yes.
>
> > For instance, index.it.html and index.en.html .
>
> 'index.html.it' and 'index.html.en' is slightly better.
>
> > Can you please tell me some pros and cons?
>
> See the links above.
>
> > for instance if I am in 'index.it.html' given
> > automatically back to the user by the server, and I want to
> > put a link to the english index, how could I implement it?
>
> <URL: http://www.cs.tut.fi/~jkorpela/multi/2.html >
> <URL: http://www.cs.tut.fi/~jkorpela/multi/3.html >
> <URL: http://www.cs.tut.fi/~jkorpela/multi/4.html >
>
> > Like: <a href="index.en.html">
>
> Yes.
>
> > Which one do you suggest?
>
> Content negotiation.
>
> > 3 - How to set the main language of an HTML document
> >
> > I also have another question regarding the setting of the
> > language of an HTML document. How can I set it and through
> > which tags? Should I use the SGML doctype declaration
> > somehow?
>
> No, that's fixed to 'EN' (which is the language of the HTML
> specification).
>
> > Or should I use a generic tag with the lang
> > attribute properly set? Do you think that :
> >
> > <html lang="it">
> > [ here goes the document ]
> > </html>
> >
> > works?
>
> Yes. (AFAIK, it doesn't have any effect in current browsers,
> but it's the correct way to specify the language, in addition
> to the HTTP 'Content-Language' header, of course.)
>
> > I don't think it is 'a good way' of doing it.
>
> It is a good way. In *addition*, I believe Apache automatically
> sends the correct 'Content-Language' header when you use
> content negotiation.
>
> --
> Karl Ove Hufthammer
>
>

Received on Thursday, 27 December 2001 10:17:36 UTC