Re: Accept-Charset support

At 00:28 08/12/96 +0100, Koen Holtman wrote:
>But skimming the UTF-8 specification, I gather that UTF-8 is an encoding
>mechanism, not a character set.  HTTP offers the
>Accept-Encoding/Content-encoding headers to negotiate on this.  Or does
>using Accept-Encoding only shift the problem to negotiating which part
>of UCS you can render?

UTF-8 in the charset should be taken to mean UCS in UTF-8 form. The prefix
UNICODE-1-1 was removed to avoid a version dependency, which was felt to be
unnecessary.
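
To make the labelling concrete, here is a minimal sketch in Python of how a
receiver might read such a label. The header value, the helper name
charset_of and the ISO-8859-1 fallback are my own assumptions for
illustration, not something taken from the drafts.

```python
# Illustration only: a made-up Content-Type value, not from a real server.
content_type = "text/html; charset=UTF-8"


def charset_of(header_value, default="ISO-8859-1"):
    """Return the charset parameter of a Content-Type value, lowercased.

    The default is an assumption used when no charset parameter is present.
    """
    for param in header_value.split(";")[1:]:
        name, _, value = param.partition("=")
        if name.strip().lower() == "charset":
            return value.strip().strip('"').lower()
    return default.lower()


# Two octets on the wire. Under the label above they are read as one UCS
# character in UTF-8 form (HEBREW LETTER ALEF, U+05D0).
body = b"\xd7\x90"
text = body.decode(charset_of(content_type))
code_points = [ord(ch) for ch in text]
```

The label names both the repertoire (UCS) and its byte form (UTF-8), so the
receiver needs nothing else to get from octets to characters.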

As previously mentioned, negotiation is useful only when the server offers
a choice, and that choice is more likely to be language based than encoding
based. Even today there are many pages on the web which have buttons for
alternative languages, and we can expect this selection to be automated
when the server offers it.

I don't really expect a Japanese server to provide a Hebrew version of its
pages, however cleverly I encode my desires in Accept- parameters. It may
offer an English version, and I would rather get that than the Japanese. But
if it does happen to have a Hebrew version it would be nice if I got it, or,
if I were French speaking, a French version.
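
The fallback described above could be sketched roughly as follows, in
Python. The function names, the q-value handling and the example header are
assumptions of mine, meant only to show the idea of walking the client's
preferences in order and falling back to the server's own language.

```python
def parse_accept(header_value):
    """Parse 'a;q=0.8, b' into a list of (value, q), highest q first."""
    items = []
    for part in header_value.split(","):
        fields = part.strip().split(";")
        value = fields[0].strip().lower()
        if not value:
            continue
        q = 1.0  # a missing q parameter counts as full preference
        for field in fields[1:]:
            name, _, raw = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(raw)
                except ValueError:
                    q = 0.0  # unreadable q: treat the entry as refused
        items.append((value, q))
    # sorted() is stable, so ties keep the order the client wrote them in
    return sorted(items, key=lambda item: item[1], reverse=True)


def choose_language(accept_language, available, default):
    """Pick the best language the server really has, else its default."""
    for lang, q in parse_accept(accept_language):
        if q > 0 and lang in available:
            return lang
    return default


# A Hebrew reader who would settle for French, then English.
wishes = "he, fr;q=0.9, en;q=0.5"
from_japanese_site = choose_language(wishes, {"ja", "en"}, "ja")
```

With these wishes a site holding only Japanese and English sends English, a
site that happens to hold Hebrew sends Hebrew, and a Japanese-only site
sends Japanese whatever I ask for.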

The coding negotiation is necessary for the transition period, until
everybody supports UTF-8. Its prime objective should be to avoid sending
UTF-8 to the unaware. If what is sent is what is available today, we would
rather have it correctly labelled as 8859-2 or 1252 or whatever it actually
is, so that the better browsers can handle it.
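
As a rough sketch of that transition rule, again in Python: send UTF-8 only
to a client that names it in Accept-Charset, and otherwise send the legacy
coding with its true label. The function names, the sample text and the
choice to treat a missing header as "unaware" are my assumptions.

```python
def accepts_utf8(accept_charset):
    """True only if the client explicitly lists UTF-8 with a nonzero q."""
    if not accept_charset:
        return False  # no header: assume the client is unaware of UTF-8
    for part in accept_charset.split(","):
        fields = [f.strip() for f in part.split(";")]
        if fields[0].lower() != "utf-8":
            continue
        q = 1.0
        for field in fields[1:]:
            name, _, raw = field.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(raw)
                except ValueError:
                    q = 0.0
        return q > 0
    return False


def render(text, accept_charset, legacy_charset):
    """Return (Content-Type value, body octets) for one response."""
    charset = "utf-8" if accepts_utf8(accept_charset) else legacy_charset
    body = text.encode(charset)
    # The label always names the coding that was really used for the body.
    return "text/html; charset=" + charset, body


page = "Łódź"  # sample text that fits in ISO 8859-2
old_type, old_body = render(page, None, "iso-8859-2")
new_type, new_body = render(page, "iso-8859-2, utf-8;q=0.8", "iso-8859-2")
```

The point of the design is that the label and the octets can never disagree:
an unaware client gets 8859-2 labelled as 8859-2, and only a client that
asked for UTF-8 receives it.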

--

Jonathan Rosenne
JR Consulting
P O Box 33641, Tel Aviv, Israel
Phone: +972 50 246 522 Fax: +972 9 956 7353
http://ourworld.compuserve.com/homepages/Jonathan_Rosenne/

Received on Sunday, 8 December 1996 01:00:07 UTC