- From: Joris Dobbelsteen <joris.dobbelsteen@mail.com>
- Date: Mon, 11 Jun 2001 00:21:01 +0200
- To: "'Peter W'" <peterw@usa.net>
- Cc: "WWW WG (E-mail)" <http-wg@cuckoo.hpl.hp.com>
They are still used, I think (e.g. by IE). The specification expects ASCII
characters, but if UTF-8 is compatible with ASCII and the spec, it can be used.

>-----Original Message-----
>From: Peter W [mailto:peterw@usa.net]
>Sent: Friday, 08 June 2001 1:22
>To: http-wg@cuckoo.hpl.hp.com
>Subject: %NN encoding, request/response headers, UTF-8 ?
>
>
>I'm curious
> - whether Unicode characters with values 0x100 and greater are allowed in
>   request headers (especially the request line)

No, not according to the spec. It is written in terms of ASCII characters and
single octets, so values range from 0x00 to 0xFF at most. Depending on where
the value appears, there may be further limitations (see spec).

> - if so, if UTF-8 encoding is allowed

I don't know about UTF-8. I thought IE could use it to encode the URL.

> - how a client indicates to the server that it's using UTF-8
> - how an HTTP server application decides how to interpret
>   hex-encoded information, e.g. is %C3%B1 encoding two characters,
>   or the UTF-8 encoding for the single character "ñ"
> - how/if a server might use UTF-8 in its response headers
>
>It looks like any content that is sent with MIME headers (e.g., an object
>sent by the HTTP server) could be announced with a charset value indicating
>UTF-8 encoding, but that headers (request or response) are only expected to
>contain characters 0x00 -> 0xFF. Yet I don't see this clearly stated.
>
>It seems fairly clear, though, that double-byte character sets (e.g., 16
>bits for each character regardless of its value) should not be used in
>either request or response headers. Right?

Seems like it. It might, however, be possible to encode them so that they
comply with the spec. Still, I'm not sure about this.

>
>I appreciate any light you may be able to shed on this topic.
>
>-Peter
>http://www.tux.org/~peterw/
>

- Joris
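
To make the %C3%B1 question concrete, here is a minimal sketch in Python. It
assumes nothing about what any particular server does; the helper name
decode_escapes is made up for illustration. It only shows that the same two
escaped octets read as one character or as two, depending on which charset the
receiver assumes, and that the escapes themselves do not say which is meant.

```python
from urllib.parse import unquote_to_bytes


def decode_escapes(escaped: str, charset: str) -> str:
    """Undo %NN escaping, then interpret the resulting octets in `charset`.

    Hypothetical helper for illustration; not taken from any HTTP server.
    """
    raw = unquote_to_bytes(escaped)  # "%C3%B1" becomes the octets 0xC3 0xB1
    return raw.decode(charset)


# The same escaped octets, read under two different assumptions.
as_utf8 = decode_escapes("%C3%B1", "utf-8")          # a single character
as_latin1 = decode_escapes("%C3%B1", "iso-8859-1")   # two characters

print(len(as_utf8), repr(as_utf8))
print(len(as_latin1), repr(as_latin1))
```

Without some out-of-band agreement on the charset, the receiver has to guess
between these two readings.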
Received on Sunday, 10 June 2001 23:20:22 UTC