Re: [IRI] Changed preferred case when hex-escaping IRIs from Chris Lilley on 2003-01-24 (www-international@w3.org from January to March 2003)

From: Chris Lilley <chris@w3.org>
Date: Fri, 24 Jan 2003 15:48:51 +0100
To: www-international@w3.org, Martin Duerst <duerst@w3.org>
CC: Michel Suignard <michelsu@microsoft.com>, "Ian B. Jacobs" <ij@w3.org>
Message-ID: <192354004265.20030124154851@w3.org>

On Friday, January 24, 2003, 12:39:42 AM, Martin wrote:


MD> Based on deliberations by the TAG
MD> (http://www.w3.org/2003/01/20-tag-summary), I have changed the
MD> preferred case when hex-escaping (in the process of going from
MD> an IRI to an URI) from lower case to UPPER CASE in the current
MD> internal draft. I have also tweaked examples where necessary.

MD> Any comments?       Martin.

Yes - while picking one preferred case may well help interoperability,
a belt-and-braces approach seems to give the most benefit: pick one
preferred case AND define hex escapes (only) to be case insensitive.

The rest of the characters in the IRI or URI would of course remain
case sensitive. Another way of looking at this is to say that all the
characters in the IRI or URI are case sensitive, but a hex escape is
not a sequence of three characters; it is a representation of a
character.
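
As a rough illustration of the idea (a sketch only, not text from any
draft): the Python below treats the two hex digits of each %XX escape as
case insensitive by upper-casing them before comparison, and leaves every
other character untouched. The function names and the example.org URIs
are made up for this example.

```python
import re

# Matches one hex escape: "%" followed by exactly two hex digits.
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_escapes(uri: str) -> str:
    """Upper-case the hex digits of each %XX escape; change nothing else."""
    return _ESCAPE.sub(lambda m: "%" + m.group(1).upper(), uri)


def escapes_equivalent(a: str, b: str) -> bool:
    """Compare two URIs, ignoring case only inside hex escapes."""
    return normalize_escapes(a) == normalize_escapes(b)


if __name__ == "__main__":
    # Escapes differ only in case: treated as the same.
    print(escapes_equivalent("http://example.org/%c3%a9",
                             "http://example.org/%C3%A9"))
    # Ordinary characters differ in case: still different.
    print(escapes_equivalent("http://example.org/Path",
                             "http://example.org/path"))
```

Normalizing to upper case here simply follows the preferred case Martin
mentions; comparing after lower-casing the escapes would work equally well.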

This is rather similar to the equivalent construct in XML, the numeric
character reference, whose hex digits are also case insensitive. In XML,

&#x01BF; and &#x01bf; are the same (and so are &#447; and the actual
LATIN LETTER WYNN). [1]
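
For anyone who wants to check the XML behaviour, here is a small sketch
using Python's standard xml.etree.ElementTree parser. The wrapper element
name "r" and the helper function are invented for the example; the three
references are the ones from the sentence above.

```python
import unicodedata
import xml.etree.ElementTree as ET


def parse_reference(ref: str) -> str:
    """Parse a tiny XML document containing one character reference
    and return the character it stands for."""
    return ET.fromstring("<r>" + ref + "</r>").text


upper = parse_reference("&#x01BF;")    # hex digits in upper case
lower = parse_reference("&#x01bf;")    # hex digits in lower case
decimal = parse_reference("&#447;")    # decimal form of the same code point

# All three references resolve to the same single character.
print(upper == lower == decimal)
print(unicodedata.name(upper))
```

Only the hex digits are case insensitive here; the "x" that introduces a
hexadecimal reference has to be lower case in XML.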

I propose that in URI (as defined by the RFC that replaces RFC 2396)
HEXDIG be defined to be case insensitive.

[1] (If you are reading this email in an HTML archive the chances are that
the htmlification will mess things up so, with spaces added to foil
this, here is the same sentence again)

& # x 01BF; and & # x 01bf; are the same (and so are & # 447; and the actual
LATIN LETTER WYNN).


-- 
 Chris                            mailto:chris@w3.org

Received on Friday, 24 January 2003 09:49:02 UTC