
RE: [IRI] Changed preferred case when hex-escaping IRIs

From: Michel Suignard <michelsu@microsoft.com>
Date: Fri, 24 Jan 2003 17:27:30 -0800
Message-ID: <9805B5E65CD0D0479D08B7EB832B369F06C642D7@red-msg-08.redmond.corp.microsoft.com>
To: "Chris Lilley" <chris@w3.org>, <www-international@w3.org>, "Martin Duerst" <duerst@w3.org>
Cc: "Ian B. Jacobs" <ij@w3.org>

That's not what I am reading in the TAG minutes. I read that they would
like the preferred case when hex-escaping to be UPPER CASE, but that is
very different from being case insensitive. If case insensitivity is what
the TAG wants for URI/IRI hex escaping, it needs to be clearly formulated.

Michel

-----Original Message-----
From: Chris Lilley [mailto:chris@w3.org] 
Sent: Friday, January 24, 2003 6:49 AM
To: www-international@w3.org; Martin Duerst
Cc: Michel Suignard; Ian B. Jacobs
Subject: Re: [IRI] Changed preferred case when hex-escaping IRIs


On Friday, January 24, 2003, 12:39:42 AM, Martin wrote:


MD> Based on deliberations by the TAG 
MD> (http://www.w3.org/2003/01/20-tag-summary), I have changed the 
MD> preferred case when hex-escaping (in the process of going from an 
MD> IRI to a URI) from lower case to UPPER CASE in the current internal
MD> draft. I have also tweaked examples where necessary.

MD> Any comments?       Martin.

Yes - while picking one preferred case may well help interoperability, a
belt and braces approach of picking one preferred case AND defining hex
escapes (only) to be case insensitive seems to give the best benefit.

The rest of the characters in the IRI or URI would of course remain case
sensitive. Another way of looking at this is to say that all the
characters in the IRI or URI are case sensitive, but a hex escape is not
a sequence of three characters; it is a representation of a character.
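
To make that concrete, here is a rough sketch in Python of what such a
comparison could look like. The function names normalize_escapes and
equivalent are made up for illustration and are not taken from any draft;
the sketch only upper-cases the two hex digits of each escape and leaves
every other character alone.

```python
import re

# Matches one hex escape: '%' followed by exactly two hex digits.
_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_escapes(uri: str) -> str:
    """Upper-case the hex digits of every escape; touch nothing else."""
    return _ESCAPE.sub(lambda m: m.group(0).upper(), uri)


def equivalent(a: str, b: str) -> bool:
    """Compare two URIs, treating only the hex escapes as case insensitive."""
    return normalize_escapes(a) == normalize_escapes(b)


if __name__ == "__main__":
    print(equivalent("/a%2fb", "/a%2Fb"))   # escapes differ only in case
    print(equivalent("/A%2fb", "/a%2Fb"))   # a literal character differs in case
```

Under this sketch the first pair compares equal and the second does not,
since the literal "A" and "a" stay case sensitive.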

This is rather similar to the equivalent construct in XML, the numeric
character reference, which is also case insensitive. In XML,

&#x01BF; and &#x01bf; are the same (and so are &#447; and the actual
LATIN LETTER WYNN). [1]
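
This can be checked with any conforming XML parser. As a small sketch,
assuming Python's standard xml.etree.ElementTree module and a throwaway
element name "a" chosen only for the test:

```python
import xml.etree.ElementTree as ET


def text_of(reference: str) -> str:
    """Parse a one-element document and return the element's text."""
    return ET.fromstring("<a>" + reference + "</a>").text


upper_hex = text_of("&#x01BF;")   # hexadecimal reference, upper-case digits
lower_hex = text_of("&#x01bf;")   # hexadecimal reference, lower-case digits
decimal = text_of("&#447;")       # decimal reference (0x01BF == 447)
literal = "\u01bf"                # the actual LATIN LETTER WYNN

print(upper_hex == lower_hex == decimal == literal)
```

All four forms come out of the parser as the same single character.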

I propose that in URI (as defined by the RFC that replaces RFC 2396)
HEXDIG be defined to be case insensitive.

[1] (If you are reading this email in an HTML archive, the chances are
that the htmlification will mess things up, so, with spaces added to foil
this, here is the same sentence again.)

& # x 01BF; and & # x 01bf; are the same (and so are & # 447; and the
actual LATIN LETTER WYNN).


-- 
 Chris                            mailto:chris@w3.org
Received on Friday, 24 January 2003 20:28:21 GMT
