Re: Accept-Charset support

On Sat, 14 Dec 1996, Larry Masinter wrote:
> I'll ask again: could you demonstrate two equivalent documents that
> you might want to use content negotiation to distinguish between,
> where accept-charset (as an expression of capability) and
> accept-language (as an expression of language) are inadequate to
> distinguish between them?
> If no one has any realistic examples, perhaps the issue is moot?

Currently accept-charset is de facto used as an expression of _two_
capabilities: (1) to decode a character encoding, (2) to be able to
display (or take responsibility for) a certain character repertoire.
The second aspect is not new; it has been around as long as MIME has had
a charset parameter.  It will be lost when/if everything moves to UTF-8
(and starts using accept-charset: utf-8 / charset=utf-8).

Example of a site where documents are provided in several charsets
(all for the same language):
see <URL:>.

It is of course not desirable that a server should have to deal with
such a variety of codepages/charsets; it is rather the result of an
unfortunate state of nonstandardization, which seems to be a big
problem in that part of the world, and no doubt also in other places,
and it may last for a while.  Anyway, you asked for an example where
accept-charset and accept-language together are inadequate to
distinguish between versions a server is willing to provide.
Currently they are adequate for this site; they would not be if the site
chose to always use UTF-8 while still supporting the current set of
clients with their varying codepages.
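
To make the point concrete, here is a rough sketch of how a server
might pick a variant from accept-charset today.  The function names and
the charset names (koi8-r, windows-1251) are only illustrative
assumptions, not taken from the site above, and the q-value handling is
simplified compared to what the HTTP drafts specify.

```python
def parse_accept_charset(header):
    """Parse an Accept-Charset value into {charset: q} (simplified)."""
    prefs = {}
    for item in header.split(","):
        item = item.strip()
        if not item:
            continue
        name, _, params = item.partition(";")
        q = 1.0  # a missing q parameter is treated as q=1
        params = params.strip()
        if params.startswith("q="):
            try:
                q = float(params[2:])
            except ValueError:
                q = 0.0  # unparsable q: treat as not acceptable
        prefs[name.strip().lower()] = q
    return prefs


def choose_variant(header, available):
    """Return the available charset with the highest q, or None."""
    prefs = parse_accept_charset(header)
    best, best_q = None, 0.0
    for charset in available:
        q = prefs.get(charset.lower(), prefs.get("*", 0.0))
        if q > best_q:
            best, best_q = charset, q
    return best


# Hypothetical variants: the same document in two legacy codepages.
legacy = ["windows-1251", "koi8-r"]
print(choose_variant("koi8-r, windows-1251;q=0.5", legacy))

# If every variant is UTF-8, every client sends the same header and
# the server learns nothing about which repertoire can be displayed.
print(choose_variant("utf-8", ["utf-8"]))
```

With legacy charsets the header implicitly tells the server which
repertoire the client can show; in the second call that information is
simply gone.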

It is certainly much easier to make Web clients able to decode UTF-8
to locally available character sets, than to upgrade all client
machines so that they have fonts available to display all of the 10646
characters.  I assume the former will be done much sooner, and that
use of UTF-8 should be encouraged before all the fonts (or knowledge
to choose culturally correct replacement representations) are
available to everyone.
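
As a minimal sketch of the client side, assuming the client knows its
local codepage (koi8-r here is just an example, and utf8_to_local is a
hypothetical helper name): decode the UTF-8 body and re-encode it into
the local charset, substituting a replacement for anything outside the
local repertoire.  A real client would want smarter, culturally
appropriate replacements than a plain "?".

```python
def utf8_to_local(data, local_charset):
    """Transcode UTF-8 bytes to a local charset.

    Characters that the local charset cannot represent are replaced
    with "?" (Python's built-in "replace" error handler).
    """
    text = data.decode("utf-8")
    return text.encode(local_charset, errors="replace")


# Cyrillic text survives the round trip into a Cyrillic codepage.
body = "Привет".encode("utf-8")
print(utf8_to_local(body, "koi8-r") == "Привет".encode("koi8-r"))

# A character outside the local repertoire is replaced, not displayed.
print(utf8_to_local("a€".encode("utf-8"), "koi8-r"))
```

The replacement step is exactly where the repertoire information
matters: the server cannot know in advance which characters will be
lost unless the client tells it somehow.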


Received on Monday, 16 December 1996 21:22:33 UTC