
Re: Draft 2 of "How to Compare URIs"

From: Stefan Eissing <stefan.eissing@greenbytes.de>
Date: Fri, 13 Dec 2002 10:11:53 +0100
Cc: WWW-Tag <www-tag@w3.org>
To: Tim Bray <tbray@textuality.com>
Message-Id: <EBE41D02-0E7A-11D7-BAFF-00039384827E@greenbytes.de>

On Friday, 13.12.02, at 09:08 (Europe/Berlin), Tim Bray wrote:

>
> For your pleasure at http://www.textuality.com/tag/uri-comp-2.html - 
> incorporating all the really excellent editorial feedback and with the 
> benefit of a transcontinental airline flight to stamp out 
> infelicities of language and thought.

Excellent read.

> By the way, I'm really looking for input on what I'll call the %61 
> issue; having trouble believing that RFC2396 really is saying what I 
> think it's saying.  I bounced it off John Cowan at the XML conference 
> and he seemed to think I might be right, which is upsetting. -Tim
>
RFC 2396, Section 2.1:

"In the simplest case, the original character sequence contains only 
characters that are defined in US-ASCII, and the two levels of mapping 
are simple and easily invertible: each 'original character' is 
represented as the octet for the US-ASCII code for it, which is, in 
turn, represented as either the US-ASCII character, or else the "%" 
escape sequence for that octet."

I think that pretty much establishes that 'a' is written either as 'a'
or as '%61', no matter which charset is applied elsewhere. Corollary: on
EBCDIC systems, '/' would have to appear as '/' or '%2f' in URIs.
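
As a minimal sketch of that reading in Python (an illustration only, not
anything RFC 2396 prescribes; the helper name escape_us_ascii is
hypothetical), assume the escape is always built from the US-ASCII octet,
whatever charset the local system uses:

```python
from urllib.parse import unquote


def escape_us_ascii(ch: str) -> str:
    """Return the "%" escape for one US-ASCII character (hypothetical helper)."""
    # encode() raises UnicodeEncodeError for characters outside US-ASCII
    octet = ch.encode("ascii")[0]
    return "%{:02x}".format(octet)


# Each character has exactly two spellings: itself, or its US-ASCII escape.
for ch in ("a", "/"):
    escaped = escape_us_ascii(ch)
    print(ch, escaped, unquote(escaped) == ch)
```

Under this assumption the escape for '/' is '%2f' even on a machine whose
native code for '/' is something else.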

Taking this further: when using a character encoding 'X' in URIs, one
has to make sure that octets with mappings defined in US-ASCII (<= 0x7f)
denote the same characters as they do in US-ASCII. Otherwise more than
one character would share the same %-encoding in URIs. That would forbid
the use of EBCDIC, among others, as the base for URI character encoding.
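
The condition can be checked mechanically. The sketch below is only an
illustration: the function name agrees_with_us_ascii is hypothetical, and
Python's "cp037" codec is assumed as a stand-in for EBCDIC (it is one
EBCDIC code page among several).

```python
def agrees_with_us_ascii(encoding: str) -> bool:
    """True if every octet <= 0x7f decodes to the same character as in US-ASCII."""
    for octet in range(0x80):
        raw = bytes([octet])
        try:
            if raw.decode(encoding) != raw.decode("ascii"):
                return False
        except UnicodeDecodeError:
            # The octet has no single-byte meaning in this encoding.
            return False
    return True


for name in ("utf-8", "latin-1", "cp037"):
    print(name, agrees_with_us_ascii(name))

# In cp037 the octet 0x61 is '/', while in US-ASCII it is 'a', so "%61"
# would be ambiguous if cp037 were the base encoding.
print(repr(bytes([0x61]).decode("cp037")))
```

UTF-8 and ISO-8859-1 pass the check; the EBCDIC code page does not, which
is exactly the collision described above.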

Best Regards, Stefan
Received on Friday, 13 December 2002 04:12:22 GMT
