
Re: Accept-Charset support

From: Judah Eckenberg <jeckenbe@darkwing.uoregon.edu>
Date: Thu, 5 Dec 1996 13:35:24 -0800 (PST)
To: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
cc: Erik van der Poel <erik@netscape.com>, Alan Barrett/DUB/Lotus <Alan_Barrett/DUB/Lotus.LOTUSINT@crd.lotus.com>, www-international <www-international@w3.org>, bobj <bobj@netscape.com>, wjs <wjs@netscape.com>, Chris Lilley <Chris.Lilley@sophia.inria.fr>, Ed Batutis/CAM /Lotus <Ed_Batutis/CAM/Lotus@crd.lotus.com>
Message-ID: <Pine.SOL.3.91.961205132400.28004B-100000@darkwing.uoregon.edu>
On Thu, 5 Dec 1996, Martin J. Duerst wrote:

> On Wed, 4 Dec 1996, Erik van der Poel wrote:
> 
> > > Browser vendors are not keen to send a very long list of character sets
> > > accepted due to the overhead.
> > 
> > Right. This is one concern that keeps coming up over here at Netscape.
> 
> > > What do people think about this suggestion? Will it work for servers? I am
> > > really keen to give servers a chance to return UTF-8. How do servers today
> > > return UTF-8 when Accept-Charset is not generally being sent to them?
> > 
> > Servers cannot send UTF-8 to clients unless they know that the client is
> > capable of decoding it or there is a large critical mass of browsers in
> > the installed base that is known to be capable of decoding UTF-8.

[snipped]

> The structure, as I see it, has three levels:
> 
> (1) UTF-8 as an encoding that covers pretty much everything, and that
> 	we want to help gain acceptance. This group might include
> 	some other encodings of Unicode/ISO 10646, but not too many.
> 
> (2) A list of well used and widely accepted encodings, ideally one for
> 	each "region" of the world. For Western Europe, this is
> 	iso-8859-1. We want servers to send this, and not something
> 	from the next category.
> 
> (3) All the special variants, alternative designations, and garbage
> 	"charset" parameters.

[snipped]
> So in practice, I could see the following solutions for
> Accept-Charset:
> 
> - Send UTF-8 if you can accept it, and nothing else.
> 
> - Send UTF-8 and/or a careful selection of class (2)
> 	"charset"s.
> 
	What about having the browser combine the Accept-Language and
Accept-Charset headers?  The browser could look at the Accept-Language
preferences set by the user and derive from them a list of acceptable
character sets.  If the user listed only Japanese in Accept-Language,
the browser could send ISO-2022-JP along with UTF-8 (if capable) in
Accept-Charset.  This would help guide the server on what to send and
would also keep down the number of values sent in Accept-Charset.  I,
as a user, probably don't want to see a page in a language I don't
understand, even if my browser can display it.  This could be a problem
for multilingual documents, but that might be dealt with in the
Accept-Language header.
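
	To make the idea concrete, here is a rough sketch in Python of how a
browser might derive the Accept-Charset value from the user's
Accept-Language list.  The table LANG_TO_CHARSETS and the function name
accept_charset_for are made up for illustration, and the charsets listed
per language are only examples, not a recommended mapping.

```python
# Hypothetical table: primary language subtag -> charsets worth advertising.
# The entries are illustrative assumptions, not an agreed-upon list.
LANG_TO_CHARSETS = {
    "ja": ["iso-2022-jp"],
    "en": ["iso-8859-1"],
    "fr": ["iso-8859-1"],
    "ru": ["koi8-r"],
}


def accept_charset_for(languages, utf8_capable=True):
    """Build an Accept-Charset value from the user's language preferences.

    `languages` is the ordered list of language tags the user configured,
    e.g. ["ja"] or ["en-US", "fr"].
    """
    charsets = []
    for tag in languages:
        # Only the primary subtag ("en" in "en-US") is used for the lookup.
        primary = tag.split("-")[0].lower()
        for charset in LANG_TO_CHARSETS.get(primary, []):
            if charset not in charsets:  # keep order, drop duplicates
                charsets.append(charset)
    # Advertise UTF-8 only if the browser can actually decode it.
    if utf8_capable and "utf-8" not in charsets:
        charsets.append("utf-8")
    return ", ".join(charsets)


if __name__ == "__main__":
    print("Accept-Charset:", accept_charset_for(["ja"]))
```

	A user who lists only Japanese ends up advertising two charsets
instead of everything the browser can decode, which is the point of the
suggestion.  A language missing from the table contributes nothing, so a
real browser would need some fallback for that case.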

	Also, does anyone have a list of the HTTPD servers that can do 
automatic translation?

	Thanks,
		Judah
_____________________________________________________________________________

	Judah Eckenberg			
	Web Master			http://babel.uoregon.edu/yamada.html
	Yamada Language Center
	University of Oregon
Received on Thursday, 5 December 1996 16:39:59 GMT
