Re: [IRI] Changed preferred case when hex-escaping IRIs

On Saturday, January 25, 2003, 2:27:30 AM, Michel wrote:

MS> That's not what I am reading in the TAG minutes.

I didn't get that from the TAG minutes, I got it from some subsequent
discussion.

MS> I read they would like the preferred case when hex-escaping to be
MS> UPPER CASE, but that is very different from being case
MS> insensitive.

Yes.

MS> If case insensitive is what the TAG wants for URI/IRI
MS> hex escaping it needs to be clearly formulated.

I have some text from Martin that provides such a clear formulation.
It's not really very complicated, and it avoids an ugly gotcha that
will bite people again and again. It will also help a lot with round
tripping, and with adding things onto partly-hexified URIs that make
them into IRIs again.
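
To make the gotcha concrete, here is a minimal sketch (mine, not the
text from Martin, and not anything the TAG has agreed) of comparing two
URIs when only the hex escapes are treated as case insensitive. The
names normalize_escapes and same_uri are made up for illustration, and
upper case is assumed as the preferred form.

```python
import re

# Matches one hex escape: '%' followed by exactly two hex digits.
_HEX_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_escapes(uri: str) -> str:
    """Upper-case the hex digits of each %XX escape; leave all else as-is."""
    return _HEX_ESCAPE.sub(lambda m: m.group(0).upper(), uri)


def same_uri(a: str, b: str) -> bool:
    """Compare two URIs, ignoring case only inside hex escapes."""
    return normalize_escapes(a) == normalize_escapes(b)


print(same_uri("/a%2fb", "/a%2Fb"))   # differ only in escape case
print(same_uri("/A%2fb", "/a%2Fb"))   # differ in a literal character
```

The first comparison treats the two strings as the same; the second
does not, because the rest of the URI stays case sensitive.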

MS> Michel

MS> -----Original Message-----
MS> From: Chris Lilley [mailto:chris@w3.org] 
MS> Sent: Friday, January 24, 2003 6:49 AM
MS> To: www-international@w3.org; Martin Duerst
MS> Cc: Michel Suignard; Ian B. Jacobs
MS> Subject: Re: [IRI] Changed preferred case when hex-escaping IRIs


MS> On Friday, January 24, 2003, 12:39:42 AM, Martin wrote:


MD>> Based on deliberations by the TAG 
MD>> (http://www.w3.org/2003/01/20-tag-summary), I have changed the 
MD>> preferred case when hex-escaping (in the process of going from an 
MD>> IRI to an URI) from lower case to UPPER CASE in the current internal
MD>> draft. I have also tweaked examples where necessary.

MD>> Any comments?       Martin.

MS> Yes - while picking one preferred case may well help interoperability, a
MS> belt and braces approach of picking one preferred case AND defining the
MS> case of hex escapes (only) to be case insensitive, seems to give the
MS> best benefit.

MS> The rest of the characters in the IRI or URI would of course remain case
MS> sensitive. Another way of looking at this is to say that all the
MS> characters in the IRI or URI are case sensitive, but hex escapes are not
MS> (a sequence of three) characters but are representations of characters.

MS> This is rather similar to the equivalent construct in XML, the numeric
MS> character reference, which is also case insensitive. In XML,

MS> &#x01BF; and &#x01bf; are the same (and so is &#447; and the actual
MS> LATIN LETTER WYNN). [1]

MS> I propose that in URI (as defined by the RFC that replaces RFC 2396)
MS> HEXDIG be defined to be case insensitive.

MS> [1] (If you are reading this email in an html archive the chances are
MS> that the htmlification will mess things up so, with spaces added to foil
MS> this, here is the same sentence again)

MS> & # x 01BF; and & # x 01bf; are the same (and so is & # 447; and the
MS> actual LATIN LETTER WYNN).
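
The XML claim above is easy to check with any XML parser. A small
sketch using Python's standard library; the wrapper element <c> and the
helper name ncr_text are just illustration.

```python
import xml.etree.ElementTree as ET


def ncr_text(ref: str) -> str:
    """Parse a one-element document containing ref and return its text."""
    return ET.fromstring(f"<c>{ref}</c>").text


upper = ncr_text("&#x01BF;")    # hex reference, upper-case digits
lower = ncr_text("&#x01bf;")    # hex reference, lower-case digits
decimal = ncr_text("&#447;")    # decimal reference

# All three resolve to U+01BF LATIN LETTER WYNN.
print(upper == lower == decimal == "\u01bf")
```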





-- 
 Chris                            mailto:chris@w3.org

Received on Friday, 24 January 2003 20:42:37 UTC