RE: Comments on Working Draft 5 July 2004 - Avoiding URI aliases

 Hello Renato,

> -----Original Message-----
> From: Renato Iannella [mailto:renato@iannella.it] 
> Sent: 9 July 2004 03:22
> To: Williams, Stuart
> Cc: public-webarch-comments@w3.org
> Subject: Re: Comments on Working Draft 5 July 2004 - Avoiding 
> URI aliases
> 
> 
> On 8 Jul 2004, at 19:26, Williams, Stuart wrote:
> 
> > There is no contradiction here ie. the resources are different, their 
> > current representations are identical.
> 
> Given this clarification, can't all cases of "aliases" then 
> be "avoided" by saying that URI-1 identifies my current 
> resource, and URI-2 identifies my resource on a specific date?

I'd be careful about words like 'all' and their scope. A publisher may say
anything they like, or remain silent, about the relationships between the
URIs they publish. In general, how is an entity that observes these URIs
supposed to know? The plain fact is that it isn't supposed to know or guess.
With respect to URI-2 as you describe it above, would you expect the date
space to be fully populated - so that .../WD-webarch-20040709 would be
today's version of the webarch document? You may be interested in the
DURI/TDB URI scheme Internet-Draft from Larry Masinter [1], which takes a
cut at this kind of problem - but in general a URI identifies a resource,
not a particular representation of its state.

There is a position which says that all distinct URIs necessarily identify
distinct resources (not just lexically distinct URIs, but distinct under the
URI spec and the relevant scheme's comparison 'rules', e.g. case
insensitivity, presence/absence of a port field with the default value,
etc.). Roy's work on REST defines a resource by its time-varying
representations over all future (and likely past) time, and it is generally
not possible to make the commitment that two URIs that yield identical
representations at a given instant (in a given retrieval context -
user-agent, security context...) will continue to do so over all future
time. There may be such a commitment on the part of the 'owner' of the URIs,
but it is not evident from the URIs themselves.
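
To make the "distinct by comparison rules, not just lexically" point
concrete, here is a rough sketch of such a comparison for http/https URIs.
The function names and the table of default ports are my own illustrative
choices, and the sketch deliberately ignores percent-encoding and
dot-segment normalization, so treat it as an approximation rather than a
complete implementation of the URI spec's comparison ladder.

```python
from urllib.parse import urlsplit

# Assumption: only http and https are handled in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}

def normalize(uri):
    """Reduce a URI to a tuple that is equal for equivalent http(s) URIs."""
    parts = urlsplit(uri)
    scheme = parts.scheme.lower()            # scheme is case-insensitive
    host = (parts.hostname or "").lower()    # host is case-insensitive
    port = parts.port
    if port == DEFAULT_PORTS.get(scheme):    # explicit default port == absent
        port = None
    path = parts.path or "/"                 # empty path == "/" for http(s)
    # Path, query and fragment are compared case-sensitively.
    return (scheme, host, port, path, parts.query, parts.fragment)

def equivalent(a, b):
    """True only if the two URIs are equivalent under the rules above."""
    return normalize(a) == normalize(b)
```

With this, "HTTP://WWW.Example.COM:80/People/StuartWilliams" compares equal
to "http://www.example.com/People/StuartWilliams", whereas a URI differing
only in path case does not. Note that the check says nothing about whether
two non-equivalent URIs happen to identify the same resource.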

I think that there are two cases. 

The example you cited was the very deliberate way W3C associates URIs with
the current version and particular versions of documents. The current-version
resource and the particular-version resource, although they may appear
identical (to an observer) for the period when the particular version is
current, are (deliberately) distinct resources.

The other case is where a given resource is deployed, and maybe it has a
particularly awkward URI that folks are prone to misspell, so as a
convenience the resource is deployed at multiple URIs, or (depending on
philosophical viewpoint) multiple (equivalent) clones of the resource are
deployed to catch common misspellings, e.g.:

	http://www.example.com/People/StuartWilliams
	http://www.example.com/people/stuartWilliams
	http://www.example.com/people/stuart_williams
      ....

In this case the intention really is to deploy one resource, but clients can
only really know that multiple references refer to the same resource
*iff* the referring URIRefs are equivalent (not just lexically, but as
above). Gratuitously deploying multiple URIs for the same resource (or
clones of the same resource) removes an opportunity to assert that the
referents are equivalent, even identical. IMO a better approach would be to
deploy the resource at a single, say canonical, URI and to use redirection
to catch the misspellings in a way that communicates the canonical URI back
to the user-agent, and hence propagates it (rather than an 'alias') into
bookmarks, references shared in emails, or references embedded in other pages.
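
As a minimal sketch of that redirection approach: the choice of
/People/StuartWilliams as the canonical path, the 301 status, and the names
CANONICAL, ALIASES and respond are all assumptions made for illustration,
not something any particular server does.

```python
from http.server import BaseHTTPRequestHandler

# Assumption: the first example URI is treated as the canonical one.
ORIGIN = "http://www.example.com"
CANONICAL = "/People/StuartWilliams"
ALIASES = {"/people/stuartWilliams", "/people/stuart_williams"}

def respond(path):
    """Return (status, headers) for a GET on the given request path."""
    if path == CANONICAL:
        return 200, {"Content-Type": "text/plain"}
    if path in ALIASES:
        # Tell the user-agent the canonical URI instead of serving a clone.
        return 301, {"Location": ORIGIN + CANONICAL}
    return 404, {}

class Handler(BaseHTTPRequestHandler):
    """Hypothetical handler that applies respond() to each GET request."""
    def do_GET(self):
        status, headers = respond(self.path)
        self.send_response(status)
        for name, value in headers.items():
            self.send_header(name, value)
        self.end_headers()
```

A request for a misspelt path then yields a 301 whose Location header
carries the canonical URI, which is what a user-agent would normally end up
displaying and bookmarking.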

> Cheers 
> 
> Renato Iannella
> http://renato.iannella.it

Regards

Stuart

Received on Friday, 9 July 2004 05:26:06 UTC