Re: Draft 2 of "How to Compare URIs"

On Friday, 13.12.02, at 09:08 (Europe/Berlin), Tim Bray wrote:

> For your pleasure at - 
> incorporating all the really excellent editorial feedback and with the 
> benefit of a transcontinental airline flight to stamp out 
> infelicities of language and thought.

Excellent read.

> By the way, I'm really looking for input on what I'll call the %61 
> issue; having trouble believing that RFC2396 really is saying what I 
> think it's saying.  I bounced it off John Cowan at the XML conference 
> and he seemed to think I might be right, which is upsetting. -Tim

RFC 2396, Section 2.1:

"In the simplest case, the original character sequence contains only 
characters that are defined in US-ASCII, and the two levels of mapping 
are simple and easily invertible: each 'original character' is 
represented as the octet for the US-ASCII code for it, which is, in 
turn, represented as either the US-ASCII character, or else the "%" 
escape sequence for that octet."

I think that pretty much establishes that 'a' is written either as 'a' or
as '%61', no matter which charset is applied elsewhere. Corollary: on
EBCDIC systems, '/' would have to appear as '/' or '%2f' in URIs.
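
To make the two spellings concrete, here is a minimal Python sketch of my
reading of that paragraph. The helper names uri_spellings and same_octets
are made up for illustration; only urllib.parse.unquote_to_bytes comes from
the standard library. It compares octets only and says nothing about
whether escaping a reserved character such as '/' changes a URI's meaning.

```python
from typing import Tuple
from urllib.parse import unquote_to_bytes


def uri_spellings(ch: str) -> Tuple[str, str]:
    """Return the literal and the %-escaped spelling of one US-ASCII character.

    Hypothetical helper: the octet is always the US-ASCII code,
    regardless of the local charset.
    """
    octet = ch.encode("ascii")[0]  # raises UnicodeEncodeError for non-ASCII input
    return ch, "%{:02X}".format(octet)


def same_octets(a: str, b: str) -> bool:
    """True if two URI fragments denote the same octet sequence once unescaped."""
    return unquote_to_bytes(a) == unquote_to_bytes(b)


print(uri_spellings("a"))         # ('a', '%61')
print(uri_spellings("/"))         # ('/', '%2F')
print(same_octets("a", "%61"))    # True
print(same_octets("/", "%2f"))    # True
```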

Taking this further: when using a character encoding 'X' in URIs, one has
to make sure that the octets with mappings defined in US-ASCII (<= 0x7f)
denote the same characters as they do in US-ASCII. Otherwise, more than one
character would share the same %-encoding in URIs. That would rule out
EBCDIC, among others, as the base for URI character encoding.
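
The condition above can be checked mechanically. The following is a rough
Python sketch, not a definitive test: ascii_compatible is a hypothetical
helper, and I am assuming Python's cp037 codec as a stand-in for "EBCDIC"
(there are several EBCDIC code pages).

```python
def ascii_compatible(encoding: str) -> bool:
    """True if every octet 0x00..0x7F decodes to the same character as in US-ASCII."""
    for octet in range(0x80):
        raw = bytes([octet])
        try:
            if raw.decode(encoding) != raw.decode("ascii"):
                return False
        except UnicodeDecodeError:
            # A lone octet is not decodable in this encoding at all.
            return False
    return True


print(ascii_compatible("utf-8"))    # True
print(ascii_compatible("latin-1"))  # True
print(ascii_compatible("cp037"))    # False

# The clash in one line: octet 0x61 is 'a' in US-ASCII but '/' in cp037,
# so '%61' would name two different characters.
print(b"\x61".decode("ascii"), b"\x61".decode("cp037"))  # a /
```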

Best Regards, Stefan

Received on Friday, 13 December 2002 04:12:22 UTC