Re: [URIEquivalence-15] %hh escapes (was: [Minutes] 16 Dec 2002 TAG teleconf)

Cross-posting to the uri mailing list, where this seems to belong.

On Wednesday, 18.12.02, at 08:10 (Europe/Berlin), Bjoern Hoehrmann
wrote:

>
> * Ian B. Jacobs wrote:
>>   2.2 URIEquivalence-15
>
>>    <Ian> TB: The hard problem this reveals is that a
>>    close reading of 2396 makes it clear that you can't
>>    tell whether %2a means the same thing as %2a in a
>>    different encoding. Will this be fixed, RF?
>>
>>    <Ian> RF: I'll try to clarify what it means. I have
>>    some comments on uri-comp-2. I'll send those in today.
>>
>>    <Ian> DC: You compare URIs with strcmp. It doesn't
>>    matter what the URI is. Server gets to choose what the
>>    URI string is. Only the server knows what %61 means.
>
> So section 2.4.2. of RFC 2396
>
> [...]
>    Because the percent "%" character always has the reserved purpose of
>    being the escape indicator, it must be escaped as "%25" in order to
>    be used as data within a URI.
> [...]
>
> is (among other sections) in error, since %25 could be interpreted as,
> say, being EBCDIC-encoded and thus mean U+000A instead of U+0025?
>

Maybe the authors of 2396 can supply a clarification on this.

I agree that 2396 could be read in such a way that %25 stems from a
local character set and does not need to indicate the escaped char '%'.

However, I read 2396 in another way, namely, that the *intention* of
the authors was that local characters with equivalents in US-ASCII need
to be %-encoded using their US-ASCII octets. That would mean that % is
always encoded as %25 and that no other character can use %25 as its
encoding.

How can a server safely decode URIs otherwise (for example, the query
parameters in http URIs)?
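
To make the ambiguity concrete, here is a minimal sketch in Python. It
assumes Python's cp037 codec as a stand-in for "EBCDIC" (EBCDIC has many
variants), and escape_to_char is a hypothetical helper of mine, not
anything defined by 2396. It turns a single %hh escape into one octet
and then reads that octet in a given character set:

```python
def escape_to_char(escape: str, charset: str) -> str:
    """Read one %hh escape as a single octet interpreted in `charset`.

    Hypothetical helper for illustration only; real URI decoders
    work on whole strings and usually assume one fixed charset.
    """
    if len(escape) != 3 or escape[0] != "%":
        raise ValueError("expected an escape of the form %hh")
    octet = bytes([int(escape[1:], 16)])
    return octet.decode(charset)


# The same escape yields different characters depending on the
# character set the receiver assumes.
for esc in ("%25", "%61"):
    as_ascii = escape_to_char(esc, "ascii")
    as_ebcdic = escape_to_char(esc, "cp037")  # cp037: one EBCDIC variant
    print(esc, repr(as_ascii), repr(as_ebcdic))

# Under the "local character set" reading, a cp037 host would escape
# the percent sign itself from this octet rather than from 0x25.
print("%{:02X}".format("%".encode("cp037")[0]))
```

With the US-ASCII reading, %25 is always '%' and %61 is always 'a'. With
the cp037 reading, %25 comes out as U+000A and %61 as '/', so a server
that cannot know which reading the sender used cannot reliably split or
decode query parameters.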

//Stefan

Received on Wednesday, 18 December 2002 04:49:26 UTC