- From: Tim Bray <tbray@textuality.com>
- Date: Mon, 02 Dec 2002 07:42:07 -0800
- To: Dan Connolly <connolly@w3.org>
- Cc: WWW-Tag <www-tag@w3.org>
Dan Connolly wrote:

> |Software is commonly required to compare two URIs to determine
> | whether they identify the same resource.
>
> Really? What software has to do that?

Every web browser on the planet, when it checks the URI you just typed in or clicked on against its cache. If it thinks they are the same resource (modulo expiry & so on) it doesn't dereference. Since you obviously know about this process, I suspect I'm missing your point.

> I find that software almost *never* needs to determine
> whether two URIs determine the same resource.
>
> In my suggested rewrite of 2.2.1, I wrote:

It seems to me we're saying much the same thing.

> |A resource, in the Web Architecture, is an abstraction
>
> umm... I'm not sure what you mean by that; some resources
> are quite concrete, no?

We use phrases such as "a time-varying mapping yadda yadda" - yes, some resources are very concrete, but the notion of a resource (anything that can be identified, per RFC2396) is very abstract.

> |Put another way, it is often possible to determine that two URIs
> |are the same, but it is in general never possible to be sure
> |that they are different.
>
> it's easy to tell if two URIs are different,

OK, granted, needs editorial work.

> | 1. It is in general not possible to compare relative
> | URI references with any hope of correct results."
>
> again, you can compare URI references just fine with strcmp()
> It's only when you're interested in what they point to
> that you need to expand them w.r.t. a base.

This whole note is about what to do when you're interested in what they point to. Once again, I'm missing your point?

> |In Unicode terminology, this would be properly referred
> | to as codepoint-for-codepoint comparison.
>
> Well, it's only codepoint-for-codepoint after you map
> the characters to codepoints; character-for-character
> is just as proper, no?

A character is a whatnot identified by a number (codepoint); you can use the number to look up glyphs and semantics and so on. I don't know how to do character-to-character comparison at all if I don't know what the codepoints are.

> | since the Namespaces in XML recommendation specifies
> | "character-for-character" comparison, it might be argued that
> | since %7A and %7a must per RFC2396 represent the same character,
> | XML namespaces which differ only in this respect might reasonably
> | be considered equal.
>
> absolutely not. Let's make this a test case and make
> it absolutely, perfectly clear

As Misha pointed out at some length, your exegesis may be correct, but the namespaces recommendation is NOT "absolutely clear"; that's one of the reasons we're spending time on this.

-Tim
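
For concreteness, here is a rough sketch of the two kinds of comparison being argued about above: a plain strcmp()-style test on the references as written, and a test made after expanding them against a base and upper-casing the hex digits of percent-escapes. The function names and the example.org URIs are made up for illustration, and the sketch takes no position on which answer the Namespaces recommendation intends.

```python
import re
from urllib.parse import urljoin

# Matches one percent-escape such as %7a or %7A.
_PCT_ESCAPE = re.compile(r"%[0-9A-Fa-f]{2}")


def same_by_strcmp(a: str, b: str) -> bool:
    """Character-for-character comparison of the references as written."""
    return a == b


def upper_pct_escapes(uri: str) -> str:
    """Upper-case the hex digits of every percent-escape (hypothetical helper)."""
    return _PCT_ESCAPE.sub(lambda m: m.group(0).upper(), uri)


def same_after_expansion(a: str, b: str, base: str) -> bool:
    """Expand both references against a base, then compare ignoring escape case."""
    return upper_pct_escapes(urljoin(base, a)) == upper_pct_escapes(urljoin(base, b))


if __name__ == "__main__":
    base = "http://example.org/a/b"  # assumed base URI, for illustration only

    # Two relative references that differ as strings but expand to the same URI.
    print(same_by_strcmp("../c", "/c"))
    print(same_after_expansion("../c", "/c", base))

    # The %7A versus %7a case from the namespaces discussion.
    ns1 = "http://example.org/ns%7A"
    ns2 = "http://example.org/ns%7a"
    print(same_by_strcmp(ns1, ns2))
    print(same_after_expansion(ns1, ns2, base))
```

Both pairs come out different under the first test and equal under the second, which is the gap the note is trying to pin down. A match under either test only shows that the URIs are the same; a mismatch does not prove that the resources differ.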
Received on Monday, 2 December 2002 10:42:14 UTC