
Re: "How to Compare URIs" update 3

From: Martin Duerst <duerst@w3.org>
Date: Mon, 17 Feb 2003 18:00:42 -0500
Message-Id: <4.2.0.58.J.20030217173836.0539f5c0@localhost>
To: Tim Bray <tbray@textuality.com>, WWW-Tag <www-tag@w3.org>

Hello Tim,

Some comments on this document.

- The 'Status of this document' says 'This is the second draft'.
   Guess this should be 'third'.

- The first sentence of the Introduction can be read two ways:
   "Software is commonly required to compare two URIs."
   Does this mean:
     "Comparing URIs is rarely done by hand, software is commonly used to
      do the job."
   or:
     "Software often has a need to compare two URIs."

- Please include some examples for 'Simple String Comparison'.

- I have to disagree that http://dir/a and http://dir/%61 can be
   considered to be different. This is, I think, based on a misunderstanding
   of RFC 2396. What is true is that both 'a' and '%61' do not necessarily
   have to stand for the character 'a' on the server side, but could
   stand for a '/' character on an EBCDIC server (if that character
   is part of a path component (read file or directory name)
   rather than a path separator). Whether we have an 'a' on an
   ASCII-based server or a '/' on an EBCDIC-based server, we have
   to send the octet <61> to the server to retrieve a representation
   of that resource. To have a browser (or something similar) send
   an octet <61> (or whatever the server will interpret as such)
   to the server, both 'a' and '%61' can be used, interchangeably.

   I therefore suggest that the section on %-escaping be the first
   section of RFC2396-Sensitive Comparison, and that it nail down
   that %-equivalence (e.g. ~ == %7E == %7e), with the exception of
   reserved characters, is (as currently observed) and should be
   (as future specifications should state) the minimal (baseline)
   equivalence to be respected by resolution-related operations
   (as opposed to namespace-like equivalences).

   I'd be glad to help with some actual text.

- The document should itself use upper case for %-escapes as it
   recommends (except for examples that are explicitly about casing).
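
To make the %-escaping points above concrete, here is a rough sketch in
Python of what such a baseline comparison might look like. It is only an
illustration under my own assumptions, not proposed text: the names
normalize_escapes and escape_equivalent are hypothetical, the unreserved
set is the one from RFC 2396 section 2.3, and escapes of all other
characters (including reserved ones such as '/') are left escaped, with
only their hex digits upper-cased.

```python
import re

# Unreserved characters per RFC 2396 section 2.3 (alphanumerics plus marks).
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789"
    "-_.!~*'()"
)

# Matches one %-escape: a percent sign followed by two hex digits.
_ESCAPE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_escapes(uri: str) -> str:
    """Hypothetical helper: unescape unreserved characters and
    upper-case the hex digits of every escape that is kept."""

    def fix(match: "re.Match[str]") -> str:
        char = chr(int(match.group(1), 16))
        if char in UNRESERVED:
            return char  # %61 -> a, %7e -> ~
        return "%" + match.group(1).upper()  # %2f -> %2F, stays escaped

    return _ESCAPE.sub(fix, uri)


def escape_equivalent(a: str, b: str) -> bool:
    """Hypothetical helper: compare two URIs modulo %-escaping only."""
    return normalize_escapes(a) == normalize_escapes(b)


# Simple string comparison sees two different URIs ...
print("http://dir/a" == "http://dir/%61")
# ... while the %-escape-aware comparison treats them as the same.
print(escape_equivalent("http://dir/a", "http://dir/%61"))
# An escaped reserved character is not folded into the literal one.
print(escape_equivalent("http://dir/a%2fb", "http://dir/a/b"))
```

The sketch deliberately does nothing beyond %-escapes (no case folding of
scheme or host, no /./ or /../ handling), since the point is only that
this equivalence could sit underneath the other comparison levels.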


Regards,    Martin.



At 18:23 03/02/01 -0800, Tim Bray wrote:

>I just uploaded http://www.textuality.com/tag/uri-comp-3.html with changes 
>based on feedback here and in our telecon.  Below are some of the details 
>of the update.
>
>Roy has suggested that some or all of this text may find its home in the 
>RFC2396bis, which is his current work-in-progress, which sounds sensible.
>
>Stuart argues in 
>http://lists.w3.org/Archives/Public/www-tag/2003Jan/0019.html that we rush 
>too quickly into "equivalence" without making it clear that equivalence is 
>in respect of some purpose.  He's right and I've rewritten the 
>introduction to say this; however I have not included Stuart's examples, 
>as I think the specifics are well-populated with examples below.
>
>He also agonizes at length about the %-escaping issue (as do other posters 
>to www-tag), but I'm just not gonna rewrite this any more.  I think 
>RFC2396 is first of all vague in that various reasonable people on www-tag 
>disagree on what it says, and secondly wrong in the degree of latitude it 
>offers in the char->octet mapping.  I think the current uri-comp draft is 
>as reasonable an interpretation of what it says and what you might do 
>about it as you're going to find anywhere.
>
>Dan comments at length in 
>http://lists.w3.org/Archives/Public/www-tag/2003Jan/0132.html, and the TAG 
>waded through some of them (in his absence, snicker) in 
>http://www.w3.org/2003/01/20-tag-summary.  I'll skip his editorial comments.
>
>His first comment is basically the same as Stuart's first.
>
>He also challenges the assertion that it is never possible to be sure that 
>two URIs identify "different" resources, with reference to a previous 
>posting that I can't find, sorry, so I left that as is.
>
>The TAG didn't agree with the commentary on the usage of the term "URI 
>reference".
>
>After some discussion, the TAG also decided that the /./ and /../ 
>semantics really need to apply to absolute as well as relative URIs.
>
>Finally, Paul Cotton in 
>http://lists.w3.org/Archives/Public/www-tag/2003Jan/0129.html argued that 
>the recommended best practice in %-escaping would be to use uppercase A 
>through F, and this has been adopted and explained.  -Tim
Received on Monday, 17 February 2003 19:54:31 GMT
