Re: Charsets revisited

>This specification calls for the _characters_ of the form results to
>be encoded in a URL. However, the URL encoding (specified in section
>2.2 of RFC 1738 (URL)) is a way of encoding octets, not a way of
>encoding characters.
>It is this disconnect that leaves the ambiguity that we're worried
>about here: when a user fills out a form and the values in that form
>are transmitted, what is the character set used in the transmission?
>As such, I think this issue must be addressed in the HTML working
>group as a technical review issue for RFC 1866. As we've discussed in
>numerous other venues, there is no easy solution to the problem in
>general, although RFC 1867 (file-upload) gives some relief in many

Given the syntax I posted earlier is still valid, it seems to me that
the best thing the HTML working group could do would be to recommend
that *all* form data be sent as a message body. This solves all the
problems *except* the problem of URIs pointing to resources that are
named in something other than ISO-8859-1 (e.g. a file called
"insatsu.html" on a Japanese Windows NT machine). I have seen such
URLs, though I have not recorded them. Many people in Japan think
that it's a rather silly thing to do, but they all also acknowledge
that it will become increasingly common.
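
To make the ambiguity concrete, here is a rough sketch in Python (my
choice of language for illustration, not anything from the specs). It
assumes the file name is "insatsu" written in kanji, which is a guess
at what such a name would look like, and the helper encode_name is
hypothetical. The point is only that the same characters produce
different %XX octet sequences depending on the charset chosen, and
that nothing in the resulting URL says which charset was used.

```python
from urllib.parse import quote, unquote_to_bytes

# Assumed example name: "insatsu" (printing) written in kanji.
NAME = "\u5370\u5237.html"

def encode_name(name, charset):
    """Hypothetical helper: turn characters into octets using the given
    charset, then percent-encode those octets as RFC 1738 describes."""
    return quote(name, safe="", encoding=charset, errors="strict")

# The same characters, three different octet sequences.
encoded = {cs: encode_name(NAME, cs) for cs in ("shift_jis", "euc_jp", "utf-8")}

for charset, url_part in encoded.items():
    print(f"{charset:10} {url_part}")

# Decoding only recovers the name if the receiver already knows the
# charset; the URL itself carries no label.
octets = unquote_to_bytes(encoded["shift_jis"])
recovered = octets.decode("shift_jis")
```

A receiver that guesses the wrong charset gets either garbage or a
decoding error, which is exactly the problem with unlabeled form data
and file names in URLs.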

Received on Thursday, 25 January 1996 06:36:14 UTC