- From: Stefan Eissing <stefan.eissing@greenbytes.de>
- Date: Wed, 5 Feb 2003 16:28:19 +0100
- To: Martin Duerst <duerst@w3.org>
- Cc: www-tag@w3.org
Martin,

without being bothered in the least by my shallow understanding of things:

On Tuesday, 04.02.03, at 23:52 (Europe/Berlin), Martin Duerst wrote:
>
>> To come back to the one character or three question... '%7e' might be
>> viewed
>> as 3 "URI Characters"; one "octet"; and one "original character" '~'
>> (maybe).
>
> Yes, exactly. The 'maybe' for '~' is quite appropriate.
> If somebody ran an http server on a computer where people
> still used e.g. the German version of ISO 646
> (see http://www.itscj.ipsj.or.jp/ISO-IR/021.pdf), then
> the original character would be a sharp-s.
>

But if the "%7e" is part of the query, then:

http://www.w3.org/TR/html4/interact/forms.html#idx-form-8

says that it is encoded as US-ASCII. So, http URIs can be encoded from an
arbitrary charset, apart from the query part?

While HTML4 is not normative for RFC 2396, it certainly reflects a way
of thinking about http URI encoding which is quite, uh, widespread
nowadays (in heads and implementations).

If this way of thinking is broken, then I would be interested to know
how an HTTP Server/CGI Util Package/Servlet Container is supposed to
translate a GET on

http://example.org/search?q=a%3d%2561

IMHO, "undefined" is not an acceptable answer.

//Stefan
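To make the question concrete, here is a minimal sketch of how one common toolkit (Python's standard `urllib.parse`, chosen only for illustration) treats the example URI, and of how the octet behind "%7e" maps to different characters under different charsets. The `GERMAN_ISO646` table is a hypothetical one-entry stand-in for the German ISO 646 variant mentioned above, not a real codec; everything else is standard library.

```python
from urllib.parse import parse_qs, unquote, unquote_to_bytes, urlsplit

url = "http://example.org/search?q=a%3d%2561"

# Step 1: split off the query component without decoding anything.
query = urlsplit(url).query            # 'q=a%3d%2561'

# Step 2: decode exactly once, as a form-handling library typically does.
decoded_once = parse_qs(query)         # {'q': ['a=%61']}

# Decoding a second time changes the value again; which of the two
# results is "right" is the open question in the message above.
decoded_twice = unquote(decoded_once["q"][0])   # 'a=a'

# "%7e" is three URI characters but a single octet, 0x7E.
octet = unquote_to_bytes("%7e")        # b'~'

# Read as US-ASCII, the octet is a tilde.
as_ascii = octet.decode("ascii")

# Hypothetical partial table (an assumption for illustration): in the
# German ISO 646 variant, 0x7E is the sharp-s.
GERMAN_ISO646 = {0x7E: "\u00df"}
as_german = "".join(GERMAN_ISO646.get(b, chr(b)) for b in octet)

print(decoded_once, decoded_twice, as_ascii, as_german)
```

With a single decoding pass the server sees the parameter `q` with the value `a=%61`; a component that decodes again further down the chain would see `a=a` instead.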
Received on Wednesday, 5 February 2003 10:28:43 UTC