Re: Accept-Charset support from Klaus Weide on 1996-12-17 (www-international@w3.org from October to December 1996)

From: Klaus Weide <kweide@tezcat.com>
Date: Mon, 16 Dec 1996 23:20:19 -0600 (CST)
To: Larry Masinter <masinter@parc.xerox.com>
cc: www-international@w3.org, http-wg@cuckoo.hpl.hp.com
Message-ID: <Pine.SUN.3.95.961216215418.29478C-100000@huitzilo.tezcat.com>

On Sat, 14 Dec 1996, Larry Masinter wrote:
> 
> I'll ask again: could you demonstrate two equivalent documents that
> you might want to use content negotiation to distinguish between,
> where accept-charset (as an expression of capability) and
> accept-language (as an expression of language) are inadequate to
> distinguish between them?
> 
> If no one has any realistic examples, perhaps the issue is moot?

Currently accept-charset is de facto used as an expression of _two_
capabilities: (1) to decode a character encoding, (2) to be able to
display (or take responsibility for) a certain character repertoire.
The second aspect is not new; it has been around as long as MIME has
had a charset parameter.  It will be lost when/if everything moves to
UTF-8 (and starts using accept-charset: utf-8 / charset=utf-8).

Example of a site where documents are provided in several charsets
(all for the same language):
see <URL: http://www.fee.vutbr.cz/htbin/codepage>.

It is of course not desirable that a server should have to deal with
such a variety of codepages/charsets; it is rather the result of an
unfortunate state of nonstandardization, which seems to be a big
problem in that part of the world, and no doubt also in other places,
and it may last for a while.  Anyway you asked for an example where
accept-charset and accept-language together are inadequate to
distinguish between versions a server is willing to provide.
Currently they are adequate for this site; they would not be if the
site chose to always use UTF-8 while still supporting the current set
of clients with their varying codepages.
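
To make the point concrete, here is a rough sketch (in Python) of how
a server might pick among charset variants from an Accept-Charset
value.  The function names and the list of available charsets are my
own assumptions for illustration, not what the site above actually
runs, and the sketch ignores the finer points of the HTTP drafts (such
as any implicit default charset).

```python
def parse_accept_charset(header):
    """Return a list of (charset, q) pairs from an Accept-Charset value."""
    prefs = []
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a charset listed without a q value counts as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # treat a malformed q value as "not acceptable"
        prefs.append((name.strip().lower(), q))
    return prefs


def choose_charset(header, available):
    """Pick the available charset the client rates highest, or None."""
    prefs = dict(parse_accept_charset(header))
    best, best_q = None, 0.0
    for charset in available:
        # Fall back to the "*" wildcard if the charset is not listed.
        q = prefs.get(charset.lower(), prefs.get("*", 0.0))
        if q > best_q:
            best, best_q = charset, q
    return best


# Hypothetical set of variants a server might hold for one document.
variants = ["windows-1250", "iso-8859-2"]
print(choose_charset("iso-8859-2, windows-1250;q=0.8", variants))
print(choose_charset("utf-8", variants))
```

With the first header the sketch selects iso-8859-2; with a bare
"utf-8" it finds nothing to select on, since that value says nothing
about which repertoire the client can actually display.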

It is certainly much easier to make Web clients able to decode UTF-8
to locally available character sets than to upgrade all client
machines so that they have fonts available to display all of the 10646
characters.  I assume the former will be done much sooner, and that
use of UTF-8 should be encouraged before all the fonts (or knowledge
to choose culturally correct replacement representations) are
available to everyone.
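
As a minimal sketch of what "decode UTF-8 to a locally available
character set" could look like on the client side (Python again; the
function name and the choice of ISO-8859-2 as the local charset are
assumptions for illustration only):

```python
def utf8_to_local(data, local_charset):
    """Decode UTF-8 bytes and re-encode them in the local charset.

    Characters outside the local repertoire are replaced with "?",
    which is the crudest possible replacement representation.
    """
    text = data.decode("utf-8")
    return text.encode(local_charset, errors="replace")


czech = "Příliš žluťoučký kůň".encode("utf-8")
local = utf8_to_local(czech, "iso8859_2")
# Every character of this Czech phrase exists in ISO-8859-2,
# so the conversion round-trips without loss.
print(local.decode("iso8859_2"))

# A Greek letter is not in ISO-8859-2 and degrades to "?".
print(utf8_to_local("α".encode("utf-8"), "iso8859_2"))
```

The decoding step is cheap; the hard part is what to do with the
characters that fall outside the local repertoire, which is exactly
the information a bare "accept-charset: utf-8" no longer conveys.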

  Klaus

Received on Tuesday, 17 December 1996 00:20:28 UTC