- From: Albert Lunde <Albert-Lunde@northwestern.edu>
- Date: Tue, 15 May 2001 22:46:02 CDT
- To: www-international@w3.org
> Some recent proposals suggest that to encode a character as Unicode, first
> convert to UTF-8 and then format each octet as %HH and send it out. My
> experience with query strings, cookies, and form data is that user agents do
> not encode first in UTF-8 before formatting octets as %HH. Rather I have
> found that the %HH format is context sensitive and is an agreement between
> the sender and the receiver. Only when a page is specifically sent down to
> a user agent in UTF-8 will the user agent return data in the %HH format in
> UTF-8. Since most HTML pages are still in character sets other than UTF-8,
> this means that the usage of the %HH format to mean UTF-8 is quite rare.
[...]
> Rather it seems to me that what is needed is a new HTTP encoding that
> explicitly indicates a Unicode codepoint analogous to the &#xHHHH; format
> that was invented for this very purpose for HTML. In my investigations, I
Are you talking about the encoding of a URL on the method line
of an HTTP request, the encoding of a request body, or the encoding
of a response body? These aren't always the same thing in theory
or practice. It _sounds_ like you are talking about the encoding
of URLs.
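
To make the ambiguity concrete, here is a minimal sketch (not taken from
any spec or from the quoted message) of how one character yields different
%HH sequences depending on the charset the sender and receiver agree on.
The sample character and the variable names are my own assumptions; the
only library used is Python's standard urllib.parse.

```python
from urllib.parse import quote, unquote

text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE, chosen only as an example

# The same character produces different %HH forms under different charsets.
as_utf8 = quote(text, encoding="utf-8")         # two octets: C3 A9
as_latin1 = quote(text, encoding="iso-8859-1")  # one octet: E9

print(as_utf8)    # %C3%A9
print(as_latin1)  # %E9

# A receiver that assumes the wrong charset decodes a different string.
misread = unquote(as_utf8, encoding="iso-8859-1")

# For comparison, the HTML numeric character reference names the code point
# itself, independent of any charset.
ncr = f"&#x{ord(text):X};"
print(ncr)        # &#xE9;
```

Nothing in the %HH sequence itself says which charset was used, so the
receiver has to know it from context, whichever part of the request or
response it appears in.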
--
Albert Lunde Albert-Lunde@northwestern.edu (new address)
Albert-Lunde@nwu.edu (old address)
Received on Tuesday, 15 May 2001 23:46:37 UTC