
Re: URIEquivalence-15: characters in RFC 2396 (was: Re: [Minutes] 27 Jan 2003 TAG teleconf (..., IRIEverywhere-27, ...))

From: Stefan Eissing <stefan.eissing@greenbytes.de>
Date: Wed, 5 Feb 2003 16:28:19 +0100
Cc: www-tag@w3.org
To: Martin Duerst <duerst@w3.org>
Message-Id: <748D4D08-391E-11D7-A23E-00039384827E@greenbytes.de>


Without worrying in the least about my shallow understanding of things:

On Tuesday, 04.02.03, at 23:52 (Europe/Berlin), Martin wrote:

>> To come back to the one character or three question... '%7e' might be
>> viewed as 3 "URI Characters"; one "octet"; and one "original character"
>> '~' (maybe).
> Yes, exactly. The 'maybe' for '~' is quite appropriate.
> If somebody ran an http server on a computer where people
> still used e.g. the German version of ISO 646
> (see http://www.itscj.ipsj.or.jp/ISO-IR/021.pdf), then
> the original character would be a sharp-s.
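
To make the three views concrete, here is a minimal sketch. It assumes the
server simply maps each octet through a 7-bit character table; the names
ISO646_DE and decode_iso646_de are made up for illustration, and the table
lists only the positions where the German ISO 646 variant differs from
US-ASCII.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical partial table for the German ISO 646 variant:
# only the code points that differ from US-ASCII are listed.
ISO646_DE = {
    0x40: "§", 0x5B: "Ä", 0x5C: "Ö", 0x5D: "Ü",
    0x7B: "ä", 0x7C: "ö", 0x7D: "ü", 0x7E: "ß",
}


def decode_iso646_de(octets: bytes) -> str:
    """Map 7-bit octets to characters, falling back to US-ASCII."""
    return "".join(ISO646_DE.get(b, chr(b)) for b in octets)


escaped = "%7e"
octets = unquote_to_bytes(escaped)

uri_chars = len(escaped)              # three "URI characters"
n_octets = len(octets)                # one octet, 0x7E
as_ascii = octets.decode("ascii")     # original character under US-ASCII
as_german = decode_iso646_de(octets)  # original character under ISO 646 (German)
```

The escape and the octet are the same in both cases; only the last step,
octet to character, depends on the charset the server assumes.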

But if the "%7e" is part of the query, then:


says that it is encoded US-ASCII.

So, http URIs can be encoded from an arbitrary charset, apart from
the query part?

While HTML4 is not normative for RFC 2396, it certainly reflects a way of
thinking about HTTP URI encoding which is quite, uh, widespread nowadays
(in heads and implementations).

If this way of thinking is broken, then I would be interested to know
how an HTTP server/CGI utility package/servlet container is supposed to
translate a GET on


IMHO, "undefined" is not an acceptable answer.
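
A small sketch of why this matters to such a package. The escaped value
"%C3%A4" is a hypothetical example, not taken from any real request; the
point is only that the same octets give different strings depending on
the charset the server assumes.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical escaped query value; stands in for any non-ASCII input.
query_value = "%C3%A4"

# Un-escaping is unambiguous: it always yields the same two octets.
octets = unquote_to_bytes(query_value)

# Turning octets into characters is where the server has to pick a charset.
as_utf8 = octets.decode("utf-8")         # one character
as_latin1 = octets.decode("iso-8859-1")  # two characters
```

Both decodings succeed without error, so the server cannot tell from the
octets alone which one the client meant.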


Received on Wednesday, 5 February 2003 10:28:43 UTC
