- From: Albert Lunde <Albert-Lunde@northwestern.edu>
- Date: Tue, 15 May 2001 22:46:02 CDT
- To: www-international@w3.org
> Some recent proposals suggest that to encode a character as Unicode, first
> convert to UTF-8 and then format each octet as %HH and send it out. My
> experience with query strings, cookies, and form data is that user agents do
> not encode first in UTF-8 before formatting octets as %HH. Rather I have
> found that the %HH format is context sensitive and is an agreement between
> the sender and the receiver. Only when a page is specifically sent down to
> a user agent in UTF-8, will the user agent return data in the %HH format in
> UTF-8. Since most html pages are still in character sets other than UTF-8,
> this means that the usage of the %HH format to mean UTF-8 is quite rare.

[...]

> Rather it seems to me that what is needed is a new HTTP encoding that
> explicitly indicates a Unicode codepoint analogous to the &#xHHHH; format
> that was invented for this very purpose for HTML. In my investigations, I

Are you talking about the encoding of a URL on the method line of an HTTP
request, the encoding of a request body, or the encoding of a response body?
These aren't always the same thing in theory or practice.

It _sounds_ like you are talking about the encoding of URLs.

-- 
Albert Lunde
Albert-Lunde@northwestern.edu (new address)
Albert-Lunde@nwu.edu (old address)
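
The ambiguity described in the quoted text can be sketched in a few lines of Python. This is only an illustration, not anything taken from the thread: the sample character and the choice of ISO-8859-1 as the "other" page charset are assumptions made for the example. It shows that the same character yields different %HH sequences depending on the charset the sender used, and that a receiver who assumes the wrong charset gets the wrong text back.

```python
from urllib.parse import quote, unquote

# Sample character chosen for illustration (an assumption, not from the thread).
text = "é"

# The sender percent-encodes the octets of whatever charset it happens to use.
utf8_form = quote(text, encoding="utf-8")          # two octets in UTF-8
latin1_form = quote(text, encoding="iso-8859-1")   # one octet in ISO-8859-1

print(utf8_form)    # %C3%A9
print(latin1_form)  # %E9

# The receiver recovers the character only if it assumes the same charset.
print(unquote(latin1_form, encoding="iso-8859-1"))  # é

# A mismatched assumption gives mojibake or a replacement character.
mojibake = unquote(utf8_form, encoding="iso-8859-1")   # decoded as two Latin-1 chars
replaced = unquote(latin1_form, encoding="utf-8")      # invalid UTF-8 -> U+FFFD
print(repr(mojibake), repr(replaced))
```

Nothing in the %HH sequence itself says which charset produced it, which is why the answer may differ for a URL on the request line, a request body, and a response body.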
Received on Tuesday, 15 May 2001 23:46:37 UTC