Re: charset issues from Larry Masinter on 1996-12-06 (www-international@w3.org from October to December 1996)

From: Larry Masinter <masinter@parc.xerox.com>
Date: Fri, 6 Dec 1996 09:27:22 PST
To: Albert-Lunde@nwu.edu
CC: www-international@w3.org
Message-Id: <96Dec6.102722pdt."135"@palimpsest.parc.xerox.com>

We're not really in bad shape if everyone plays by these rules for
those documents that cannot be represented in latin 1:

a) EVERY client accepts UTF8. Any client may also accept other
charsets, or even _all_ charsets. There is no advantage, though, in
accepting only 80% of them.

If a client knows all charsets, it just leaves out "accept-charset"
and takes what it gets. Otherwise, the client sends a very short header:

	Accept-charset: utf8, charset1, charset2
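
As a rough sketch of rule (a), a client could build that header as
follows. The function name and the `knows_all` flag are hypothetical,
and the "utf8" token simply follows the spelling used in this message:

```python
def accept_charset_header(other_charsets, knows_all=False):
    """Build the client's Accept-charset line under rule (a).

    Returns None when the client knows all charsets, meaning the
    header is left out entirely. (Hypothetical helper, for illustration.)
    """
    if knows_all:
        return None
    # UTF8 always comes first; skip a duplicate if the caller listed it too.
    names = ["utf8"] + [c for c in other_charsets if c.lower() != "utf8"]
    return "Accept-charset: " + ", ".join(names)


print(accept_charset_header(["charset1", "charset2"]))
```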

b) EVERY server knows how to send UTF8. The server may send whatever
the native encoding is, though, if either the client didn't set
accept-charset, or if the client included the native encoding in the
accept-charset.

If the document can be represented in latin 1, the 'accept-charset' is
just ignored, and the document is sent.
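
Putting the server side of these rules together, the choice might look
something like the sketch below. This is only an illustration: the
function and parameter names are made up, "iso-8859-1" is assumed as
the label for latin 1, and the sketch always prefers the native
encoding whenever rule (b) permits it, although the rule only says the
server "may" do so.

```python
def choose_charset(accept_charset, native, latin1_ok):
    """Pick the charset a server sends under the rules above.

    accept_charset: the header value as a string, or None if the
                    client sent no accept-charset at all.
    native:         the document's native encoding name.
    latin1_ok:      True if the document can be represented in latin 1.
    """
    # Latin 1 documents: accept-charset is ignored, the document is sent.
    if latin1_ok:
        return "iso-8859-1"  # assumed label for latin 1
    # No accept-charset: the client takes what it gets.
    if accept_charset is None:
        return native
    accepted = {
        name.strip().lower()
        for name in accept_charset.split(",")
        if name.strip()
    }
    # The native encoding is allowed if the client listed it.
    if native.lower() in accepted:
        return native
    # Otherwise fall back to UTF8, which every client accepts.
    return "utf8"
```

For example, `choose_charset("utf8, charset1", "charset2", False)`
falls through to "utf8", because the client did not list the native
encoding.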

How can a client 'accept all charsets'? Well, let's make sure that
'all charsets' isn't an infinite set. We should limit charsets to
those that are registered with IANA, and we should make sure there's
some kind of well-known transliteration service/table/applet that can
be dynamically downloaded for charset-to-UTF8 or charset-to-font for
new ones.

Larry

Received on Friday, 6 December 1996 13:27:28 UTC