- From: Tim Bray <tbray@textuality.com>
- Date: Fri, 13 Dec 2002 07:28:15 -0800
- To: Stefan Eissing <stefan.eissing@greenbytes.de>
- Cc: WWW-Tag <www-tag@w3.org>
Stefan Eissing wrote:
> RFC 2396 Ch. 2.1
>
> " In the simplest case, the original character sequence contains only
> characters that are defined in US-ASCII, and the two levels of mapping
> are simple and easily invertible: each 'original character' is
> represented as the octet for the US-ASCII code for it, which is, in
> turn, represented as either the US-ASCII character, or else the "%"
> escape sequence for that octet."

You're saying you read this as "all characters in the ASCII range must use the ASCII codepoints for character->octet"? I guess that's plausible, but I had read 2.1 to say "there are many character->octet mappings, one of the simplest being that for ASCII characters".

And assuming you're right, it still seems like there's a window open here: if you're operating in a non-ASCII environment, then the char->octet mapping is left 100% undefined, so you can't know whether %xx == %xx for all %xx > 0x7f.

-Tim
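
To make the open window concrete, here is a minimal Python sketch. It assumes the standard library's urllib.parse and two arbitrarily chosen char->octet mappings, UTF-8 and ISO-8859-1; RFC 2396 names neither, and the variable names are purely illustrative. It shows that ASCII characters escape identically under both mappings, while a character above 0x7f does not, so the same %xx can stand for different characters.

```python
from urllib.parse import quote, unquote

# ASCII range: both mappings give the same octets, so the escapes agree.
ascii_utf8 = quote("a b", encoding="utf-8")
ascii_latin1 = quote("a b", encoding="latin-1")

# Outside ASCII: the octets depend on the (assumed) char->octet mapping.
ch = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE
escaped_utf8 = quote(ch, encoding="utf-8")      # two octets
escaped_latin1 = quote(ch, encoding="latin-1")  # one octet

# Going the other way, one escape sequence decodes to different
# characters depending on which mapping the reader assumes.
as_latin1 = unquote("%E9", encoding="latin-1")
as_utf8 = unquote("%E9", encoding="utf-8")  # invalid UTF-8, replaced with U+FFFD

print(ascii_utf8, ascii_latin1)
print(escaped_utf8, escaped_latin1)
print(repr(as_latin1), repr(as_utf8))
```

Without out-of-band knowledge of the mapping, a consumer comparing two URIs that contain %E9 has no way to tell whether they name the same character.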
Received on Friday, 13 December 2002 10:28:17 UTC