RE: consistency between namspaces 1.1 and URI spec (RFC2396-bis) from Williams, Stuart on 2003-04-29 (www-tag@w3.org from April 2003)

From: Williams, Stuart <skw@hplb.hpl.hp.com>
Date: Tue, 29 Apr 2003 14:28:52 +0100
To: "'Dan Connolly'" <connolly@w3.org>
Cc: www-tag@w3.org
Message-ID: <5E13A1874524D411A876006008CD059F04A074DF@0-mail-1.hpl.hp.com>
> From: Dan Connolly [mailto:connolly@w3.org]
> Sent: 29 April 2003 05:26
> To: www-tag@w3.org
> Subject: consistency between namspaces 1.1 and URI spec (RFC2396-bis)
> 
> Bray and Berners-Lee seemed to say, today, that you couldn't
> write software that conforms to both the namespaces
> spec and RFC2396bis.
> 
> I don't see why not.
> 
> I can see two coherent positions on IRIEverywhere
> and URIEquivalence: identifiers in Web Architecture
> are strings over either a <96 character alphabet
> or over a >10000 character alphabet.
> 
> The examples in section 2.3 Comparing IRI References
> of the 18Dec namespaces CR
>   http://www.w3.org/TR/2002/CR-xml-names11-20021218/#IRIComparison
> are very useful for explaining both the coherent
> positions.
> 
> There are 4 lists of examples. The first is:
> 
>   * http://www.example.org/wine
>   * http://www.Example.org/wine
>   * http://www.example.org/Wine
> 
> In both the <96 and the >10000 positions, there
> are three distinct identifiers in that list.
> 
> On that much we are all agreed, yes?

Well... I'd really like to respond with a simple yes, but... I think that
there is scope in RFC 2396 Section 6 and RFC 2616 Section 3.2.3 to regard
the first two identifiers as equivalent (I don't know if that is synonymous
with 'not-distinct') - they differ solely by letter case in the authority
field. RFC 2396 Section 6 delegates equivalence and normalisation to URI
scheme definitions:

[[ In general, the rules for equivalence and definition of a normal form, if
any, are scheme dependent.]]

RFC 2616 defines the http URI scheme (Section 3.2.2) and, in Section 3.2.3,
mandates case-insensitive comparison of host names - it goes on to give
equally concrete examples, although it asserts equivalence rather than
(non-)distinctness.

From RFC 2616 section 3.2.3:
[[
   For example, the following three URIs are equivalent:

      http://abc.com:80/~smith/home.html
      http://ABC.com/%7Esmith/home.html
      http://ABC.com:/%7esmith/home.html
]]
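
To make the RFC 2616 rule concrete, here is a minimal sketch of the
comparison as I read Section 3.2.3: host names compared case-insensitively,
an empty or missing port treated as port 80, an empty path treated as "/",
and %HH escapes of unreserved characters treated as the characters
themselves. The function names are hypothetical, the unreserved set is taken
from RFC 2396, and scheme-specific details beyond those four rules are not
attempted.

```python
import re
import string
from urllib.parse import urlsplit

# Unreserved characters per RFC 2396: alphanumerics plus the "mark" set.
_UNRESERVED = set(string.ascii_letters + string.digits + "-_.!~*'()")


def _decode_unreserved(text):
    """Replace %HH escapes with the character only when it is unreserved."""
    def repl(match):
        char = chr(int(match.group(1), 16))
        return char if char in _UNRESERVED else match.group(0)
    return re.sub(r"%([0-9A-Fa-f]{2})", repl, text)


def http_comparison_key(uri):
    """Hypothetical helper: a key that is equal for http-equivalent URIs."""
    parts = urlsplit(uri)
    host = parts.hostname or ""          # urlsplit lower-cases the host name
    port = 80 if parts.port is None else parts.port  # empty port means default
    path = _decode_unreserved(parts.path) or "/"     # empty path means "/"
    return (parts.scheme, host, port, path, parts.query)


def http_equivalent(a, b):
    return http_comparison_key(a) == http_comparison_key(b)
```

With this sketch the three URIs quoted above compare as equivalent, while
two URIs that differ only in the case of a path segment do not.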

When the TAG first discussed URIEquivalence at its F2F in Nov 2002, we
discussed there being multiple potential equivalence relations between URI
strings [*] and URI equivalence relations being scoped by purpose - Dan gave
some nice examples some time ago [1].

[[ 
Also: this suggests that there's just one relationship between URIs. I think
it's CRITICAL to be 100% clear that there are several:

	identical, i.e. string-equal
	dns-equivalent, e.g. http://www.w3.org/ and http://WWW.W3.ORG/
	http-scheme-equivalent,
		e.g. http://Example.COM:80/ and http://example.com:80/
	cache-hit-likely-equivalent, e.g.
		http://example/ and http://example/index.html

and so on. And the cache-hit-likely-equivalent relation is usually
parameterized by information that the consumer has picked up while
interacting with the web; e.g. HTTP redirection replies and such.
]]
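
The relations in Dan's list can be sketched as separate predicates over URI
strings. This is only an illustration under stated assumptions: the function
names are hypothetical, dns-equivalence is approximated by case-folding the
whole authority component, and the information "picked up while interacting
with the web" is modelled as a plain dictionary from a URI to the URI it was
observed to resolve to.

```python
from urllib.parse import urlsplit


def identical(a, b):
    """String-equal: the finest relation, respected by all the others."""
    return a == b


def _host_folded(uri):
    # Simplification: fold the case of the whole authority component.
    p = urlsplit(uri)
    return (p.scheme, p.netloc.lower(), p.path, p.query, p.fragment)


def dns_equivalent(a, b):
    """Equal once the host name is compared case-insensitively."""
    return _host_folded(a) == _host_folded(b)


def cache_hit_likely_equivalent(a, b, learned):
    """Equal given what the consumer has learned, e.g. from redirects.

    `learned` is a hypothetical mapping: URI string -> observed target URI.
    """
    def resolve(uri):
        return learned.get(uri, uri)
    return resolve(a) == resolve(b)
```

For example, http://www.w3.org/ and http://WWW.W3.ORG/ are dns-equivalent
but not identical, and http://example/ and http://example/index.html are
cache-hit-likely-equivalent only for a consumer whose `learned` mapping says
so.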

Tim Bray's writing that has propagated into section 6 of RFC2396bis tries to
respect this notion of multiple equivalence relations. So two URI strings
may be equivalent for the purposes of resource access, but distinct (not
equivalent) for the purpose of naming a namespace.

OTOH Roy, I think, speaks of URI equivalence as a single,
purpose-independent equivalence relation - one that flows from URI
equivalence as defined in RFC 2396 and delegated onward to individual URI
scheme definitions.

*IF* we can accept that different equivalence relations hold for different
purposes *THEN* I agree with Dan (and the Namespace 1.1 CR) that the three
identifiers he lists above are distinct *for the purposes of naming a
namespace*. But this takes us to a position of having to qualify assertions
of equivalence with a statement of purpose. The one equivalence which all
the others respect is character-by-character equivalence of URI strings -
i.e. the identity relation between URI strings.
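
As a small worked illustration of qualifying equivalence by purpose, the
sketch below counts how many distinct identifiers Dan's first list contains
under two different keys. Both key functions are hypothetical names: one
compares the strings character by character, as for namespace names; the
other folds the case of the authority component, as a rough stand-in for
http resource access.

```python
from urllib.parse import urlsplit

WINE = [
    "http://www.example.org/wine",
    "http://www.Example.org/wine",
    "http://www.example.org/Wine",
]


def namespace_key(uri):
    """Naming a namespace: character-by-character identity."""
    return uri


def http_access_key(uri):
    """Resource access (simplified): authority compared case-insensitively."""
    p = urlsplit(uri)
    return (p.scheme, p.netloc.lower(), p.path, p.query)


def count_distinct(uris, key):
    return len({key(u) for u in uris})


print(count_distinct(WINE, namespace_key))    # 3
print(count_distinct(WINE, http_access_key))  # 2
```

Any two strings that are identical get the same key under either function,
which is the sense in which every other relation respects the identity
relation.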

And... I agree with Misha [2] that *if* this is a problem it is as much a
problem for URI as for IRI when naming namespaces - and discussion to this
point in the message has not stepped into IRI territory.

[1] http://lists.w3.org/Archives/Public/www-tag/2003Jan/0132.html
[2] http://www.w3.org/mid/6C7917E7CF594D4D927281EAED8E326E5DC401@LONSMSXM02.emea.ime.reuters.com
[*] I forced myself to append the word 'strings' here, because there seems
to be a platonic sense of URI as well as the very concrete thing of
references being made using sequences of characters written on paper, buses
or stored in files on computers... multiple 'spellings' of the same URI as
opposed to multiple URI for the 'same' resource.

<snip/>

> 
> -- 
> Dan Connolly, W3C http://www.w3.org/People/Connolly/
> 
> 

Regards

Stuart
Received on Tuesday, 29 April 2003 09:29:55 UTC