- From: Michel Suignard <michelsu@microsoft.com>
- Date: Fri, 24 Jan 2003 17:27:30 -0800
- To: "Chris Lilley" <chris@w3.org>, <www-international@w3.org>, "Martin Duerst" <duerst@w3.org>
- Cc: "Ian B. Jacobs" <ij@w3.org>
That's not what I am reading in the TAG minutes. I read they would like the preferred case when hex-escaping to be UPPER CASE, but that is very different from being case insensitive. If case insensitive is what the TAG wants for URI/IRI hex escaping it needs to be clearly formulated.

Michel

-----Original Message-----
From: Chris Lilley [mailto:chris@w3.org]
Sent: Friday, January 24, 2003 6:49 AM
To: www-international@w3.org; Martin Duerst
Cc: Michel Suignard; Ian B. Jacobs
Subject: Re: [IRI] Changed preferred case when hex-escaping IRIs

On Friday, January 24, 2003, 12:39:42 AM, Martin wrote:

MD> Based on deliberations by the TAG
MD> (http://www.w3.org/2003/01/20-tag-summary), I have changed the
MD> preferred case when hex-escaping (in the process of going from an
MD> IRI to an URI) from lower case to UPPER CASE in the current internal
MD> draft. I have also tweaked examples where necessary.

MD> Any comments? Martin.

Yes - while picking one preferred case may well help interoperability, a belt and braces approach of picking one preferred case AND defining the case of hex escapes (only) to be case insensitive, seems to give the best benefit. The rest of the characters in the IRI or URI would of course remain case sensitive.

Another way of looking at this is to say that all the characters in the IRI or URI are case sensitive, but hex escapes are not (a sequence of three) characters but are representations of characters.

This is rather similar to the equivalent construct in XML, the numeric character reference, which is also case insensitive. In XML, &#x01BF; and &#x01bf; are the same (and so is &#447; and the actual LATIN LETTER WYNN). [1]

I propose that in URI (as defined by the RFC that replaces RFC 2396) HEXDIG be defined to be case insensitive.

[1] (If you are reading this email in an html archive the chances are that the htmlification will mess things up so, with spaces added to foil this, here is the same sentence again.) & # x 01BF; and & # x 01bf; are the same (and so is & # 447; and the actual LATIN LETTER WYNN).

-- 
Chris mailto:chris@w3.org
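
To make the distinction between "preferred case" and "case insensitive" concrete, here is a minimal Python sketch of what case-insensitive comparison of hex escapes could look like. It is an illustration only, not text from any draft: the function names `normalize_hex_escapes` and `same_uri` and the example.org URIs are made up for this example. The sketch upper-cases only the two hex digits of each %XX escape and compares everything else case sensitively. The last lines show the XML analogy from the message, using the standard library's `html.unescape` as a stand-in for an XML parser's handling of numeric character references.

```python
import re
from html import unescape

# One hex escape: '%' followed by exactly two hex digits (either case).
_HEX_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def normalize_hex_escapes(uri: str) -> str:
    """Upper-case the hex digits of every %XX escape; leave all other characters as-is."""
    return _HEX_ESCAPE.sub(lambda m: m.group(0).upper(), uri)


def same_uri(a: str, b: str) -> bool:
    """Compare two URIs, treating only the hex escapes as case insensitive."""
    return normalize_hex_escapes(a) == normalize_hex_escapes(b)


# Escapes that differ only in case compare equal ...
print(same_uri("http://example.org/a%2fb", "http://example.org/a%2Fb"))
# ... while the rest of the URI stays case sensitive.
print(same_uri("http://example.org/Path", "http://example.org/path"))

# The XML analogy: three spellings of the same numeric character reference.
print(unescape("&#x01BF;") == unescape("&#x01bf;") == unescape("&#447;") == "\u01bf")
```

Under a "preferred case only" rule, the first pair would still be two different strings; only a comparison like `same_uri` treats them as the same identifier.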
Received on Friday, 24 January 2003 20:28:21 UTC