
Re: Charsets revisited

From: Gavin Nicol <gtn@ebt.com>
Date: Thu, 25 Jan 1996 09:32:53 -0500
Message-Id: <199601251432.JAA00560@ebt-inc.ebt.com>
To: masinter@parc.xerox.com
Cc: glenn@stonehand.com, frystyk@w3.org, nms@nns.ru, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com

>This specification calls for the _characters_ of the form results to
>be encoded in a URL. However, the URL encoding (specified in section
>2.2 of RFC 1738 (URL)) is a way of encoding octets, not a way of
>encoding characters.
> 
>It is this disconnect that leaves the ambiguity that we're worried
>about here: when a user fills out a form and the values in that form
>are transmitted, what is the character set used in the transmission.
> 
>As such, I think this issue must be addressed in the HTML working
>group as a technical review issue for RFC 1866. As we've discussed in
>numerous other venues, there is no easy solution to the problem in
>general, although RFC 1867 (file-upload) gives some relief in many
>instances.

Given that the syntax I posted earlier is still valid, it seems to me
that the best thing the HTML working group could do would be to
recommend that *all* form data be sent as a message body. This solves
all the problems *except* that of URIs pointing to resources whose
names are in something other than ISO-8859-1 (e.g. a file called
"insatsu.html" on a Japanese Windows NT machine). I have seen such
URLs, though I have not recorded them. Many people in Japan think
that it's a rather silly thing to do, but they all also acknowledge
that it will become increasingly common.
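
To make the octet-versus-character point concrete, here is a rough
sketch in Python. It is only an illustration: the two-character value
(the Japanese word "insatsu") and the list of candidate charsets are
my own assumptions, not anything taken from RFC 1738 or RFC 1866. It
shows that the same characters give different %XX strings depending
on the charset chosen before escaping, and that the escaped string
does not say which charset that was.

```python
from urllib.parse import quote, unquote_to_bytes

# Assumed example value: the Japanese word "insatsu" (printing),
# written with Unicode escapes so the source stays plain ASCII.
value = "\u5370\u5237"

# %XX escaping works on octets, so a charset has to be picked first.
# These three are illustrative choices only.
charsets = ["utf-8", "shift_jis", "euc_jp"]
encoded = {cs: quote(value.encode(cs), safe="") for cs in charsets}
for cs, text in encoded.items():
    print(f"{cs:10} {text}")

# ISO-8859-1 cannot represent these characters at all.
try:
    value.encode("iso-8859-1")
    latin1_ok = True
except UnicodeEncodeError:
    latin1_ok = False
print("representable in ISO-8859-1:", latin1_ok)

# The receiver gets octets back and has to guess the charset.
octets = unquote_to_bytes(encoded["shift_jis"])
right_guess = octets.decode("shift_jis")
wrong_guess = octets.decode("euc_jp", errors="replace")
print(right_guess == value, wrong_guess == value)
```

Nothing in the escaped string itself says which of the three
encodings produced it, which is why the charset has to be carried
somewhere outside the URL.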
Received on Thursday, 25 January 1996 06:36:14 EST
