
Re: "tdb" and "duri" URI schemes...

From: David Booth <david@dbooth.org>
Date: Tue, 02 Nov 2010 17:32:54 -0400
To: Larry Masinter <masinter@adobe.com>
Cc: Jonathan Rees <jar@creativecommons.org>, "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <1288733574.2914.22749.camel@dbooth-laptop>

Hi Larry,

I like the idea of adding a timestamp or cryptographic hash (as Sandro
Hawke has suggested) to a URI, as a way of indicating which instance of
a URI declaration was used when the URI was minted.  However, I do not
believe there is any need to define a new URI scheme to do this.
Instead, the same objectives can be obtained to greater advantage by
leveraging http URIs using techniques described at
http://dbooth.org/2006/urn2http/ 

Your premise seems to be that the institutional process of registering a
new URI scheme makes URIs that were minted under that scheme
inherently more "persistent" than URIs that were minted with the http
scheme, which uses an authority component whose owner may change over
time.  But I don't think this is true.  Rather, tdb URIs simply abdicate
all responsibility for persistence.

The document at
http://tools.ietf.org/html/draft-masinter-dated-uri-07
does not define what it means by "persistence", but presumably it refers
to the future ability by a URI consumer to determine the description (or
"URI declaration") that the URI owner published when the URI was minted,
as described in "The URI Lifecycle in Semantic Web Architecture":
http://dbooth.org/2009/lifecycle/ 
This means that the URI consumer must be able to do two things: (1)
locate a candidate URI declaration; and (2) verify that the candidate is
the correct candidate, i.e., that it corresponds to the timestamp
specified in the URI.  Note that these two steps are needed regardless
of whether the URI is based on the domain name authority.

For step 1 (locating a candidate URI declaration), the tdb scheme that
you propose provides no automated help, because a tdb URI cannot be
dereferenced.  In contrast, an http URI *might* be dereferenceable.  In
the worst case, an http URI that is never dereferenceable will be no
worse than a tdb URI, but in the best case it will be dereferenceable
and hence more useful.  Furthermore, URI consumers are much more likely
to care about URIs that are younger and owned by reputable
organizations, and these are precisely the URIs that are more likely to
still be dereferenceable, thus further skewing the advantage to http
URIs.  

For step 2 (verifying that the candidate URI declaration is the correct
one), neither tdb nor http URIs provide any help: a user must manually
guess what instance of a URI declaration was applicable at the date/time
encoded in the URI.

Final score: tdb 0, http 1.

For example, consider the space of URIs beginning with
http://thing-described-by.org/ , or http://t-d-b.org/ for short.  The
CGI script on that site could clearly be extended such that URIs of the
form:

 http://t-d-b.org/tdb:<timestamp>:<encoded-URI>

would have the exact same semantics as a tdb URI of the form:

 tdb:<timestamp>:<encoded-URI>

but the t-d-b.org URI would provide the additional advantage that the
server does a 303 redirect to <encoded-URI> when the URI is
dereferenced.  (In a practical sense, this capability is already
available through the Internet Archive http://www.archive.org/ .)
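
As a rough sketch of the mapping just described, the following Python
shows how a tdb URI could be wrapped as a t-d-b.org http URI and split
back into its timestamp and target.  The function names are
hypothetical, and the sketch assumes that <encoded-URI> is carried
verbatim and that the timestamp itself contains no colon; it is not the
CGI script actually running on that site.

```python
PREFIX = "http://t-d-b.org/"   # short form of http://thing-described-by.org/
SCHEME = "tdb:"


def tdb_to_http(tdb_uri: str) -> str:
    """Wrap tdb:<timestamp>:<encoded-URI> as an http URI under t-d-b.org."""
    if not tdb_uri.startswith(SCHEME):
        raise ValueError("not a tdb URI: %r" % tdb_uri)
    return PREFIX + tdb_uri


def split_tdb_http(http_uri: str) -> tuple[str, str]:
    """Return (timestamp, encoded_uri) from a t-d-b.org URI.

    A server could use encoded_uri as the target of its 303 redirect.
    Assumption: the timestamp contains no ':' character.
    """
    head = PREFIX + SCHEME
    if not http_uri.startswith(head):
        raise ValueError("not a t-d-b.org tdb URI: %r" % http_uri)
    timestamp, sep, encoded_uri = http_uri[len(head):].partition(":")
    if not sep or not encoded_uri:
        raise ValueError("missing timestamp or target: %r" % http_uri)
    return timestamp, encoded_uri
```

Splitting on the first colon only is what keeps the colon inside the
target's own scheme (for example "http:") intact.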

Furthermore, if the http URI included a cryptographic hash of the
original URI declaration, as suggested by Sandro Hawke
http://www.w3.org/2003/08/introhash/latest 
then both step 1 (locating the candidate URI) and step 2 (verifying that
it is the correct instance) could be fully automated if one chose to do
so.  To be fair, such a process could also be fully automated with a new
tdb URI scheme, but realistically, I strongly suspect that developers
would be more interested in implementing support based on good old http
than a new URI scheme.  
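
To illustrate how step 2 might be automated, here is a minimal Python
sketch that checks a retrieved candidate declaration against a hash
taken from the URI.  The choice of SHA-256, the hex encoding, and the
function names are all assumptions made for illustration; they are not
taken from Sandro's proposal or from the tdb draft.

```python
import hashlib
import hmac


def declaration_digest(declaration: bytes) -> str:
    """Hex SHA-256 digest of the raw bytes of a URI declaration."""
    return hashlib.sha256(declaration).hexdigest()


def is_correct_instance(candidate: bytes, expected_hex: str) -> bool:
    """Step 2: does the candidate declaration match the hash in the URI?

    expected_hex is the digest that the URI owner embedded when the URI
    was minted (hypothetical layout; how it is embedded is left open).
    """
    actual = declaration_digest(candidate)
    # Constant-time comparison of the two hex strings.
    return hmac.compare_digest(actual, expected_hex.lower())
```

Because the check depends only on the bytes of the candidate, it works
the same way whether the candidate was fetched from the original
server, from an archive, or from a local cache.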

Bottom line: the http scheme has the enormous benefit of the network
effect.  Hence, it is much more advantageous to base ideas like this on
the http scheme than to invent a new URI scheme.

Best wishes,
David Booth

P.S. Typos noticed in your draft:

s/define define/define/

Section 3.2 has a paragraph beginning "While one could imagine using
'tdb' without a date", but substantially similar text also appears in
the second paragraph of section 3.3.


On Tue, 2010-11-02 at 08:38 -0700, Larry Masinter wrote:
> This idea has been bouncing around for such a long time,
> but I updated the document 
> 
> http://tools.ietf.org/html/draft-masinter-dated-uri-07
> 
> based on comments.
> 
> While this isn't posed as a "TAG" submission, since the
> TAG has been discussing persistence for a long time,
> are there any changes you think I should make (references,
> discussions, etc.) before asking for this
> to be published?
> 
> Larry
> --
> http://larry.masinter.net
> 

-- 
David Booth, Ph.D.
Cleveland Clinic (contractor)
http://dbooth.org/

Opinions expressed herein are those of the author and do not necessarily
reflect those of Cleveland Clinic.
Received on Tuesday, 2 November 2010 21:33:24 GMT
