- From: Stefan Eissing <stefan.eissing@greenbytes.de>
- Date: Fri, 13 Dec 2002 10:11:53 +0100
- To: Tim Bray <tbray@textuality.com>
- Cc: WWW-Tag <www-tag@w3.org>
On Friday, 13.12.02, at 09:08 (Europe/Berlin), Tim Bray wrote:

> For your pleasure at http://www.textuality.com/tag/uri-comp-2.html -
> incorporating all the really excellent editorial feedback and with the
> benefit of a transcontinental airline flight to stamp out
> infelicities of language and thought.

Excellent read.

> By the way, I'm really looking for input on what I'll call the %61
> issue; having trouble believing that RFC2396 really is saying what I
> think it's saying. I bounced it off John Cowan at the XML conference
> and he seemed to think I might be right, which is upsetting.
> -Tim

RFC 2396 Ch. 2.1

" In the simplest case, the original character sequence contains only
characters that are defined in US-ASCII, and the two levels of mapping
are simple and easily invertible: each 'original character' is
represented as the octet for the US-ASCII code for it, which is, in
turn, represented as either the US-ASCII character, or else the "%"
escape sequence for that octet."

I think that pretty much defines that 'a' is either 'a' or '%61', no
matter which charset is applied elsewhere. Corollary: on EBCDIC systems,
'/' would have to be written as '/' or '%2f' in URIs.

Taking this further: when using character encoding 'X' in URIs, one has
to make sure that octets with mappings defined in US-ASCII (<= 0x7f)
denote the same characters as US-ASCII does. Otherwise there would be
more than one character with the same %-encoding in URIs. That would
forbid the use of EBCDIC, among others, as a base for URI character
encoding.

Best Regards,

Stefan
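
The collision described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `naive_escape` is a hypothetical helper that %-escapes the raw octets of a string in a given charset, and `cp037` (Python's codec for one common EBCDIC code page) stands in for "EBCDIC" in general.

```python
from urllib.parse import unquote


def naive_escape(text: str, charset: str) -> str:
    """Hypothetical helper: %-escape every octet of `text` as encoded in `charset`."""
    return "".join(f"%{octet:02X}" for octet in text.encode(charset))


# US-ASCII: 'a' is octet 0x61 and '/' is octet 0x2F.
ascii_a = naive_escape("a", "ascii")
ascii_slash = naive_escape("/", "ascii")

# EBCDIC (cp037): '/' is octet 0x61, the same octet US-ASCII uses for 'a'.
ebcdic_slash = naive_escape("/", "cp037")

print(ascii_a, ascii_slash, ebcdic_slash)  # %61 %2F %61

# A decoder that follows the US-ASCII reading sees 'a' for %61 and '/' for %2f.
print(unquote("%61"), unquote("%2f"))  # a /
```

If octets were escaped straight from an EBCDIC encoding, '%61' would stand for '/' on one system and 'a' on another, which is the ambiguity the US-ASCII rule avoids.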
Received on Friday, 13 December 2002 04:12:22 UTC