
Re: what should the charset be in the response to the server

From: Shigemichi Yazawa <yazawa@globalsight.com>
Date: Fri, 25 Jul 2003 10:41:52 -0600
Message-ID: <5e8yqmftin.wl@flatiron.globalsight.com>
To: www-international@w3.org

At Wed, 23 Jul 2003 08:49:41 +0200,
Michael Jansson wrote:
> Anyone who says it is the encoding of the page is correct but
> misleading, as the browser's user can manually decide what that encoding
> is (changing whatever was declared in the transmitted page), so a web
> server can have no certainty about the encoding used in the %hh escapes
> in a GET, which is how non-ASCII is sent.

This is a very good point. However...

> http://jetty.mortbay.com/jetty/doc/international.html
> My advice: never use GET for sending a form containing international
> characters, unless it's absolutely unavoidable.
> When using PUT, use the header to find out what encoding was used.

The web page above suggests that the Content-Type header of the HTTP
request carries the character encoding of the escaped characters
(i.e. %HH). But as far as I know, the Content-Type value of a
submitted form is always application/x-www-form-urlencoded, with no
charset parameter. I confirmed this with Mozilla 1.0 and IE 6.0.
Does anyone know of a browser that adds character encoding
information to the Content-Type header?
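
To make the check concrete, here is a minimal Python sketch of how a
server could look for a charset parameter in the Content-Type value of
a submitted form. The function name form_charset is hypothetical and
only for illustration; the parsing uses the standard library's
email.message module. With the header value observed above, it finds
nothing.

```python
from email.message import Message


def form_charset(content_type):
    """Return the charset parameter of a Content-Type value, or None.

    Hypothetical helper: it only inspects the header value, it does not
    guess an encoding when the parameter is absent.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset parameter,
    # or None when the header carries no charset parameter.
    return msg.get_content_charset()


# What the browsers mentioned above send: no charset parameter.
print(form_charset("application/x-www-form-urlencoded"))
# What the header would look like if a browser did add one.
print(form_charset("application/x-www-form-urlencoded; charset=UTF-8"))
```

A server that gets None back has to fall back on some other convention,
since the header itself does not say which encoding the %HH escapes use.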

My understanding is that the character encoding in a POST request is,
unfortunately, as ambiguous as in a GET request.
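
The ambiguity can be illustrated with a short Python sketch, assuming
only the standard urllib.parse module. The helper names escape and
unescape are hypothetical. The same character yields different %HH
sequences under different encodings, and a server that guesses the
wrong encoding silently decodes the wrong text.

```python
from urllib.parse import quote, unquote


def escape(text, charset):
    """Percent-escape text as a browser using the given charset might."""
    return quote(text, encoding=charset)


def unescape(escaped, charset):
    """Decode %HH escapes, assuming the given charset on the server side."""
    return unquote(escaped, encoding=charset)


as_utf8 = escape("é", "utf-8")      # two bytes, two escapes
as_latin1 = escape("é", "latin-1")  # one byte, one escape
print(as_utf8, as_latin1)

# The server guesses correctly: the original character comes back.
print(unescape(as_utf8, "utf-8"))
# The server guesses wrongly: no error is raised, but the text is wrong.
print(unescape(as_utf8, "latin-1"))
```

Nothing in the escaped string itself tells the server which of the two
decodings is the intended one.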

Shigemichi Yazawa
Received on Friday, 25 July 2003 12:42:29 UTC
