
RE: Another opinion (was Re: Defining the language of a document)

From: Jay Allen <jayallen@microsoft.com>
Date: Thu, 27 Dec 2001 09:49:55 -0800
Message-ID: <1917990039F03747BE329AAFC8222B57038A480A@red-msg-04.redmond.corp.microsoft.com>
To: "Nir Dagan" <nir@nirdagan.com>, "Karl Ove Hufthammer" <huftis@bigfoot.com>, <www-international@w3.org>
Cc: <g.bartol@comune.prato.it>
Where does the HTTP spec forbid selecting a default Accept-Language? The only relevant paragraph I found is the following from section 14.4, which only asks that a facility be made available to change it (which IE does provide):
 

   As intelligibility is highly dependent on the individual user, it is
   recommended that client applications make the choice of linguistic
   preference available to the user. If the choice is not made
   available, then the Accept-Language header field MUST NOT be given in
   the request.
 
The spec does not say that the user needs to be queried interactively at the time of document retrieval (how would that work, anyway?). 
 
If the header is absent, the spec says the server should assume that all languages are equally acceptable. Picking a default based on the system language seems better than a random guess on the server's part.
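
To make the absent-header case concrete, here is a minimal sketch in Python of how a server might read the header. The function name `parse_accept_language` is hypothetical, and treating a malformed q value as "not acceptable" is my own assumption, not something the spec text quoted above says.

```python
from typing import List, Optional


def parse_accept_language(header: Optional[str]) -> Optional[List[str]]:
    """Return language ranges ordered by preference, or None if the header is absent.

    None means "no preference stated": the server should then assume
    that all languages are equally acceptable.
    """
    if header is None:
        return None
    weighted = []
    for position, item in enumerate(header.split(",")):
        parts = [p.strip() for p in item.split(";")]
        tag = parts[0].lower()
        if not tag:
            continue
        quality = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # assumption: a malformed q means "not acceptable"
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]
```

A browser that sends only its UI language yields a one-element list (`["he"]`), while a missing header yields `None`, which leaves the choice entirely to the server.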

-J-

	-----Original Message----- 
	From: Nir Dagan [mailto:nir@nirdagan.com] 
	Sent: Thu 12/27/2001 7:16 AM 
	To: Karl Ove Hufthammer; www-international@w3.org 
	Cc: g.bartol@comune.prato.it 
	Subject: Another opinion (was Re: Defining the language of a document)
	
	

	If you do choose to use content negotiation, I'd recommend serving a
	document in English (with status 200) if no language match is found,
	rather than giving a 406 error message.
	
	Otherwise, some users who can potentially read the web site will not be able
	to, as explained below.
	
	Some rather popular browsers (e.g. Microsoft Internet Explorer)
	configure the Accept-Language header without asking the user,
	and set it to the language of the user interface.
	
	In addition, a subset of these popular browsers (again Microsoft Internet
	Explorer) use so-called "friendly HTTP error messages". These are
	browser-generated, uninformative error messages that replace the server's
	messages. It should be noted that these two habits of popular browsers
	contradict the HTTP specification.
	
	Example result:
	1. The Italian web site has versions in Italian, English and Spanish, and
	   uses content negotiation.
	2. An Israeli user who can read English and Spanish in addition to Hebrew
	   tries to access the web site.
	3. The Israeli user is using a Hebrew-localized Windows with Internet
	   Explorer.
	4. The user gets a "friendly" error message saying "Page cannot be
	   displayed" without any explanation whatsoever. The reason is that his
	   Accept-Language header was set to "he" (rather than the correct
	   "he, en, es"), and that the browser replaced a possibly informative
	   server error message (406), giving the available choices, with its own
	   "friendly" one.
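
The scenario above can be sketched as a small Python function. This is only an illustration under stated assumptions: `negotiate`, its parameters, and the simple prefix-matching rule are hypothetical and do not describe how Apache or any particular server implements negotiation.

```python
from typing import List, Optional, Tuple


def negotiate(accepted: List[str], available: List[str],
              fallback: Optional[str] = None) -> Tuple[int, Optional[str]]:
    """Return (status, language) for a request.

    accepted  -- lowercase language tags from Accept-Language, most preferred first
    available -- lowercase language versions the site offers
    fallback  -- language to serve when nothing matches (None means answer 406)
    """
    for wanted in accepted:
        for lang in available:
            # A range matches an identical tag or a tag it prefixes ("en" matches "en-gb").
            if wanted == "*" or lang == wanted or lang.startswith(wanted + "-"):
                return 200, lang
    if fallback is not None:
        return 200, fallback
    return 406, None


site = ["it", "en", "es"]
print(negotiate(["he"], site))                 # (406, None)
print(negotiate(["he"], site, fallback="en"))  # (200, 'en')
print(negotiate(["he", "en", "es"], site))     # (200, 'en')
```

The first call is the failure described in step 4, the second is the recommended English fallback, and the third is what happens when the header lists every language the user can read.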
	
	Regards,
	Nir Dagan
	
	----- Original Message -----
	From: "Karl Ove Hufthammer" <huftis@bigfoot.com>
	To: <www-international@w3.org>
	Cc: <g.bartol@comune.prato.it>
	Sent: Thursday, December 27, 2001 4:39 PM
	Subject: Re: Defining the language of a document
	
	
	> 2001-12-27 14:04:38, Gabriele Bartolini
	> <g.bartol@comune.prato.it>:
	>
	> > 1 - Using content negotiation
	>
	> Please do.
	>
	> > I could use the content negotiation, given by Apache and
	> > *trust* the HTTP_ACCEPT_LANGUAGE directive sent by the user
	> > agent.
	>
	> See <URL: http://ppewww.ph.gla.ac.uk/~flavell/www/lang-neg.html >
	> and <URL: http://www.cs.tut.fi/~jkorpela/multi/ >.
	>
	> > By doing this, I should organize the site as Yergeau
	> > and Durst propose in their article about multilingual Web,
	> > by subjects and topics rather than language, naming files
	> > by putting the ISO language code between file name and
	> > extension.
	>
	> Yes.
	>
	> > For instance, index.it.html and index.en.html .
	>
	> 'index.html.it' and 'index.html.en' are slightly better.
	>
	> > Can you please tell me some pros and cons?
	>
	> See the links above.
	>
	> > For instance, if I am in 'index.it.html', served
	> > automatically to the user by the server, and I want to
	> > put a link to the English index, how could I implement it?
	>
	> <URL: http://www.cs.tut.fi/~jkorpela/multi/2.html >
	> <URL: http://www.cs.tut.fi/~jkorpela/multi/3.html >
	> <URL: http://www.cs.tut.fi/~jkorpela/multi/4.html >
	>
	> > Like: <a href="index.en.html">
	>
	> Yes.
	>
	> > Which one do you suggest?
	>
	> Content negotiation.
	>
	> > 3 - How to set the main language of an HTML document
	> >
	> > I also have another question regarding the setting of the
	> > language of an HTML document. How can I set it and through
	> > which tags? Should I use the SGML doctype declaration
	> > somehow?
	>
	> No, that's fixed to 'EN' (which is the language of the HTML
	> specification).
	>
	> > Or should I use a generic tag with the lang
	> > attribute properly set? Do you think that :
	> >
	> > <html lang="it">
	> > [ here goes the document ]
	> > </html>
	> >
	> > works?
	>
	> Yes. (AFAIK, it doesn't have any effect in current browsers,
	> but it's the correct way to specify the language, in addition
	> to the HTTP 'Content-Language' header, of course.)
	>
	> > I don't think it is 'a good way' of doing it.
	>
	> It is a good way. In *addition*, I believe Apache automatically
	> sends the correct 'Content-Language' header when you use
	> content negotiation.
	>
	> --
	> Karl Ove Hufthammer
	>
	>
	
	
Received on Thursday, 27 December 2001 12:50:28 GMT