RE: "tdb" and "duri" URI schemes... from Larry Masinter on 2011-01-17 (www-tag@w3.org from January 2011)

From: Larry Masinter <masinter@adobe.com>
Date: Mon, 17 Jan 2011 14:38:03 -0800
To: David Booth <david@dbooth.org>
CC: Jonathan Rees <jar@creativecommons.org>, "www-tag@w3.org" <www-tag@w3.org>
Message-ID: <C68CB012D9182D408CED7B884F441D4D058EB90468@nambxv01a.corp.adobe.com>
In the new version of 
http://tools.ietf.org/html/draft-masinter-dated-uri


but I didn't add text in response to your comments.

> I like the idea of adding a timestamp or cryptographic hash (as Sandro
> Hawke has suggested) to a URI, as a way of indicating which instance of
> a URI declaration was used when the URI was minted.  However, I do not
> believe there is any need to define a new URI scheme to do this.

I don't agree that existing schemes do what "duri" and "tdb" do.
I'm not sure the new capabilities are "needed", but I believe that
other schemes don't offer them.

> Instead, the same objectives can be obtained to greater advantage by
> leveraging http URIs using techniques described at
> http://dbooth.org/2006/urn2http/ 

I don't agree, in that the techniques laid out there don't address
persistence in the cases where XyzConsortium disbands or is no longer
interested in the namespace it once sponsored.

> Your premise seems to be that the institutional process of registering a
> new URI scheme makes URIs that were minted under that scheme
> inherently more "persistent" than URIs that were minted with the http
> scheme, which uses an authority component whose owner may change over
> time.

Not at all; "duri" does not require "minting", in the sense of
creating a token which has a meaning other than that imbued by the
act of striking the die; I've tried to make this clear in the
wording distinguishing "duri" and "tdb" from URNs, info, tag,
etc.

>  But I don't think this is true.  Rather, tdb URIs simply abdicate
> all responsibility for persistence.

I'm not sure what this means... Who has what responsibility
that is "abdicated"?


> The document at
> http://tools.ietf.org/html/draft-masinter-dated-uri-07

> does not define what it means by "persistence", but presumably it refers
> to the future ability by a URI consumer to determine the description (or
> "URI declaration") that the URI owner published when the URI was minted,
> as described in "The URI Lifecycle in Semantic Web Architecture":
> http://dbooth.org/2009/lifecycle/ 

This definition seems to conflate "ability to determine" with
"confidence that, once determined, the determination is correct".
It might be difficult to determine what a "tdb" with an old date
actually said, but at least it will be clearer what *was* meant
at least as of the time of creation of the description resource.

> This means that the URI consumer must be able to do two things: (1)
> locate a candidate URI declaration; and (2) verify that the candidate is
> the correct candidate, i.e., that it corresponds to the timestamp
> specified in the URI.  Note that these two steps are needed regardless
> of whether the URI is based on the domain name authority.

ok

> For step 1 (locating a candidate URI declaration), the tdb scheme that
> you propose provides no automated help, because a tdb URI cannot be
> dereferenced.  In contrast, an http URI *might* be dereferenceable. 

I don't understand how this is helpful; the http URI
might be dereferenceable at some time in the future, but of course,
that dereferenced resource might have nothing to do with the one
available at the time of use as of the date of the DURI/TDB.

>   In
> the worst case, an http URI that is never dereferenceable will be no
> worse than a tdb URI, 

I don't agree at all, but perhaps I don't know what you mean by
"worse" -- along what dimension is it worse?

> but in the best case it will be dereferenceable
> and hence more useful. 

A dereference which provides a wrong or irrelevant value is less useful
than one that cannot be dereferenced.

>  Furthermore, URI consumers are much more likely
> to care about URIs that are younger and owned by reputable
> organizations, and these are precisely the URIs that are more likely to
> still be dereferenceable,

I'm not sure this is at all obvious... reputable organizations routinely
recycle URIs.

>  thus further skewing the advantage to http URIs.  

I don't accept that the "advantage" is "skewed".

> For step 2 (verifying that the candidate URI declaration is the correct
> one), neither tdb nor http URIs provide any help: a user must manually
> guess what instance of a URI declaration was applicable at the date/time
> encoded in the URI.

http URIs don't tell you the date/time, while duri/tdb tells you explicitly.

> Final score: tdb 0, http 1.

I think you're scoring the wrong thing.

> For example, consider the space of URIs beginning with
> http://thing-described-by.org/ , or http://t-d-b.org/ for short.  The
> CGI script on that site could clearly be extended such that URIs of the
> form:

>  http://t-d-b.org/tdb:<timestamp>:<encoded-URI>

> would have the exact same semantics as a tdb URI of the form:

>  tdb:<timestamp>:<encoded-URI>

I disagree strongly. The http://t-d-b.org URIs can only identify
resources available by accessing the HTTP server being offered at
t-d-b.org; they stop working if the t-d-b.org domain name's registration expires, etc.
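
To make the comparison concrete, here is a minimal Python sketch that parses both of the forms quoted above. It assumes the timestamp contains no colon, so the first colon after the scheme name ends it. The names DatedRef, parse_dated and GATEWAY_PREFIX are hypothetical and come from neither the draft nor t-d-b.org, and the example URI and date are illustrative only.

```python
from typing import NamedTuple, Optional

# Hypothetical constant: the http prefix used in the quoted proposal.
GATEWAY_PREFIX = "http://t-d-b.org/"


class DatedRef(NamedTuple):
    scheme: str              # "tdb" or "duri"
    timestamp: str           # kept as an opaque string, not validated
    target: str              # the embedded <encoded-URI>, left as written
    via_host: Optional[str]  # host the reference depends on, if any


def parse_dated(uri: str) -> DatedRef:
    """Split '<scheme>:<timestamp>:<encoded-URI>', with or without the http prefix."""
    via_host = None
    if uri.startswith(GATEWAY_PREFIX):
        via_host = "t-d-b.org"
        uri = uri[len(GATEWAY_PREFIX):]

    scheme, sep, rest = uri.partition(":")
    if not sep or scheme not in ("tdb", "duri"):
        raise ValueError(f"not a tdb/duri reference: {uri!r}")

    # Assumption: the timestamp has no ':' of its own.
    timestamp, sep, target = rest.partition(":")
    if not sep or not timestamp or not target:
        raise ValueError(f"missing timestamp or embedded URI: {uri!r}")

    return DatedRef(scheme, timestamp, target, via_host)


bare = parse_dated("tdb:2011-01-17:http://example.org/thing")
wrapped = parse_dated("http://t-d-b.org/tdb:2011-01-17:http://example.org/thing")
print(bare)
print(wrapped)
```

Both calls yield the same scheme, timestamp and embedded URI. The only field that differs is via_host, which records the dependency on the t-d-b.org server that the two sides of this thread weigh differently.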

> but the t-d-b.org URI would provide the additional advantage that the
> server does a 303 redirect to <encoded-URI> when the URI is
> dereferenced. 

except of course if t-d-b.org loses its domain name registration,
lets its server go down, etc.

>  (In a practical sense, this capability is already
> available through the Internet Archive http://www.archive.org/ .)

Using archive.org is a heuristic, not a definition of the meaning.
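
As an illustration of what such a lookup involves, the Python sketch below builds a Wayback Machine address from a date and a URI. It assumes the archive's public URL pattern of web.archive.org/web/<14-digit timestamp>/<URL> and an ISO-format date (YYYY-MM-DD); the function name wayback_lookup_url is hypothetical. The sketch only constructs the address and makes no network request.

```python
from datetime import date

# Assumed public URL pattern of the Internet Archive's Wayback Machine.
WAYBACK_PREFIX = "http://web.archive.org/web/"


def wayback_lookup_url(iso_date: str, target: str) -> str:
    """Build a lookup address for `target` as archived around `iso_date`."""
    day = date.fromisoformat(iso_date)          # raises ValueError if malformed
    stamp = day.strftime("%Y%m%d") + "000000"   # midnight, as a 14-digit stamp
    return f"{WAYBACK_PREFIX}{stamp}/{target}"


print(wayback_lookup_url("2011-01-17", "http://example.org/thing"))
```

Whatever capture such an address leads to need not be from the requested date, and there may be no capture at all, so the result is a best guess at what the URI referred to rather than a definition of it.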


> Furthermore, if the http URI included a cryptographic hash of the
> original URI declaration, as suggested by Sandro Hawke
> http://www.w3.org/2003/08/introhash/latest 
> then both step 1 (locating the candidate URI) and step 2 (verifying that
> it is the correct instance) could be fully automated if one chose to do
> so.

Not sure I understand how something that is impractical could be
"fully automated".
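
For reference, the Python sketch below covers only the verification step (step 2) of the quoted proposal: given the bytes of a candidate declaration and a digest carried with the URI, check that they match. It says nothing about how a candidate is located in the first place. SHA-256 and the function names are assumptions made for illustration; nothing in this thread fixes the hash algorithm or where the digest would appear in the URI.

```python
import hashlib
import hmac


def declaration_digest(declaration: bytes) -> str:
    """Hex digest of a URI declaration document (SHA-256 is an assumption)."""
    return hashlib.sha256(declaration).hexdigest()


def matches_digest(candidate: bytes, expected_hex: str) -> bool:
    """True if the candidate document hashes to the digest carried with the URI."""
    return hmac.compare_digest(declaration_digest(candidate), expected_hex.lower())


original = b"<html>declaration as published</html>"
digest = declaration_digest(original)

print(matches_digest(original, digest))                          # same bytes
print(matches_digest(b"<html>later revision</html>", digest))    # different bytes
```

The check is byte-exact, so any change to the candidate document, including re-serialization or whitespace changes, produces a mismatch.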

>  To be fair, such a process could also be fully automated with a new
> tdb URI scheme, but realistically, I strongly suspect that developers
> would be more interested in implementing support based on good old http
> than a new URI scheme.  

Not sure how you can "fully automate" something that isn't feasible.

> Bottom line: the http scheme has the enormous benefit of the network
> effect.  Hence, it is much more advantageous to base ideas like this on
> the http scheme than to invent a new URI scheme.

Do not agree.

s/define define/define/

> Section 3.2 has a paragraph beginning "While one could imagine using
> 'tdb' without a date", but substantially similar text also appears in
> the second paragraph of section 3.3.

Thanks, will double-check these.
Received on Monday, 17 January 2011 22:38:42 UTC