Re: http charset labelling

Keld J|rn Simonsen (keld@dkuug.dk)
Thu, 1 Feb 1996 17:50:26 +0100


Message-Id: <199602011650.RAA22955@dkuug.dk>
From: keld@dkuug.dk (Keld J|rn Simonsen)
Date: Thu, 1 Feb 1996 17:50:26 +0100
In-Reply-To: Larry Masinter <masinter@parc.xerox.com>
To: Larry Masinter <masinter@parc.xerox.com>
Subject: Re: http charset labelling
Cc: uri@bunyip.com

Larry Masinter writes:

> It would not require changing any existing specification for someone
> to create a web server that interpreted
> 
>    http://host.dom/encoding/selector
> 
> to mean that 'encoding' was a particular encoding of the given
> selector. The web server could even return a 'Location:' header in the
> results that would give a canonical encoding, just so as not to
> confuse caches.
> 
> 'encoding' could even be UTF7, for example.

So an example could be:

     http://host.dom/UTF7/index.html   ?

The server should then recognize the first part of the
locator as the charset, and then translate the following locator
into the charset of the server. This should only be done when
the first part is one of a set of recognized charsets.

The notation should not appear on business cards etc.; I think we
all agree on this. Nor should it appear in URLs in HTML documents;
I think we all agree on that too.

Would there not then be a problem when the charset is automatically
inserted by the browser? The browser would not know which
servers understand the new convention, so a browser enhanced in
this way would create a lot of havoc.

I think there is a clear migration path stipulated in the HTTP spec,
namely the major and minor numbers in the HTTP/M.N version
notation and the rules laid down there, which say that
within the same major version a server should just respond
as normal, ignoring the headers that it does not understand.
That's why I advocate a header-based solution.
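
To illustrate why a header degrades gracefully, here is a rough
Python sketch. The header name "URI-Charset" is purely hypothetical
(no such header is defined in any spec I know of), as are the
function names; the point is only that an old server sees the same
request line and carries on as before.

```python
def parse_headers(raw_lines):
    """Parse 'Name: value' lines into a dict keyed by lowercase name."""
    headers = {}
    for line in raw_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers


def old_server_selector(path_bytes, headers):
    # An old server does not know the new header and simply ignores it,
    # so the selector bytes are used exactly as received.
    return path_bytes


def new_server_selector(path_bytes, headers, server_charset="iso-8859-1"):
    # "uri-charset" is a hypothetical header naming the charset in which
    # the client encoded the selector.
    charset = headers.get("uri-charset")
    if charset is None:
        # Request from an old browser: behave exactly like an old server.
        return path_bytes
    return path_bytes.decode(charset).encode(server_charset)


if __name__ == "__main__":
    request_headers = parse_headers(["Accept: text/html", "URI-Charset: utf-8"])
    selector = "blåbær.html".encode("utf-8")
    print(old_server_selector(selector, request_headers))
    print(new_server_selector(selector, request_headers))
```

With this arrangement a new browser can always send the header: new
servers translate the selector, and old servers respond as normal.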

keld