
Re: "tdb" and "duri" URI schemes...

From: Herbert van de Sompel <hvdsomp@gmail.com>
Date: Tue, 18 Jan 2011 09:06:13 -0700
Message-ID: <AANLkTim_77c777wik4Bhc1LQPy0g+77y4RuUDQ2fg99g@mail.gmail.com>
To: "www-tag@w3.org" <www-tag@w3.org>
Cc: Larry Masinter <masinter@adobe.com>
hi Larry,

I provide some comments on the tdb/duri I-D below.

Greetings

Herbert

==

1. After reading the section "1.2. URIs for anything", which introduces tdb,
I expected to see a section "1.3. URIs for resource state" that introduces
the notion of dated URIs.

2. I am not in favor of the tdb scheme also having the capability to be
timestamped. I think there should only be one scheme with that capability,
namely duri:

- The motivations you give in the 2nd paragraph of 3.3 can also be
fulfilled by using a duri embedded in a tdb: tdb:duri:timestamp:URI. A tdb
URI cannot exist without a describing resource, and either (a) that
describing document itself refers to a specific date (the president in 2002),
in which case the interpretation is clear, or (b) that describing document
does not refer to a specific date (the president), in which case one
interprets the description subject to duri's timestamp.

- The IETF example in Section 3.2. becomes: tdb:duri:2009:
http://en.wikipedia.org/wiki/IETF and (at first glance) I see no ambiguity
resulting from this approach.

- This approach can indeed not be used with a "data" URI but I don't regard
that as a significant problem compared to the semantic interpretation
complexity that results from 2 timestamped schemes, i.e.
tdb:timestamp:duri:timestamp:URI
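
To make the nesting in point 2 concrete, here is a minimal sketch that peels the tdb/duri layers off a URI and counts how many of them carry a timestamp. It follows the notation used in this message rather than the grammar in the I-D: the function names are hypothetical, and the timestamp pattern (a four-digit year with optional -MM and -DD parts) is a simplifying assumption.

```python
import re

# Assumption: a timestamp is a 4-digit year, optionally followed by -MM and -DD.
# The I-D may allow other forms; this pattern is only for illustration.
_TIMESTAMP = re.compile(r"^(\d{4}(?:-\d{2}){0,2}):(.*)$", re.S)


def unwrap(uri):
    """Peel leading tdb:/duri: layers; return (layers, innermost URI).

    Each layer is a (scheme, timestamp) pair; timestamp is None when the
    layer has no timestamp, as in tdb:duri:2009:URI.
    """
    layers = []
    while True:
        scheme, sep, rest = uri.partition(":")
        if not sep or scheme.lower() not in ("tdb", "duri"):
            return layers, uri
        match = _TIMESTAMP.match(rest)
        if match:
            timestamp, rest = match.group(1), match.group(2)
        else:
            timestamp = None
        layers.append((scheme.lower(), timestamp))
        uri = rest.lstrip()


def timestamped_layers(uri):
    """Count the layers that carry their own timestamp."""
    layers, _ = unwrap(uri)
    return sum(1 for _, timestamp in layers if timestamp is not None)


# One timestamp to interpret: the form proposed above.
print(unwrap("tdb:duri:2009:http://en.wikipedia.org/wiki/IETF"))
# Two timestamps to reconcile: the form argued against above.
print(timestamped_layers("tdb:2002:duri:2009:http://example.org/"))
```

With only duri carrying a timestamp, a consumer has a single date to interpret per URI; once tdb can carry one too, it has to decide how the two relate.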

3. The case for "resource state":

(a) There is some ambiguity in formulation re duri:
- "identifies a resource as of a particular time" (in the abstract)
- "the resource that was identified by the <embeddedURI> at the time given"
(section 3.1.)
- "a URI as of a particular time" (section 7.1.)

(b) There is a significant problem regarding the inability to identify
representations: Section 6.5 describes this problem extensively for tdb
resources, and briefly for duri resources that embed e.g. HTTP URIs. The
latter is obviously an issue we face in Memento too, as one can never return
_the_ representation that was seen by some client at some point in time. The
best one can do is return _a_ representation of the _state_ the resource had
at that point in time.

Hence, I think that the definition of duri might benefit from building on
the notion of "resource state" like we do in Memento. One could say that a
duri identifies the resource that has the embeddedURI as its URI, in the
state it was in at the given timestamp. That also allows saying, in the
section on resolution, that the result of dereferencing a duri (through some
mechanism) is _a_ representation of the state that the resource indicated by
the duri had at the moment given by the timestamp.

I actually used this approach in the Memento I-D when referring to duri:

1. The abstract notion of the state of a resource identified by URI-R as it
existed at some time Tj. Note the relationship with the ability to identify
the state of a resource at some datetime Tj by means of a URI as intended
by the proposed Dated URI scheme [I-D.masinter-dated-uri]
<http://www.mementoweb.org/guide/rfc/ID/#I-D.masinter-dated-uri>.

And you use "resource state" towards the end of Section 3.3.

4. I was wondering why embedded fragment identifiers are not allowed in
DURIs. Is that because of syntax issues? Whichever way, this is a bummer.
Assume we have a time-evolving dataset at URI1 and that URI1#frag1
identifies a data-element in that dataset. Then, it would be great to be
able to use duri to identify the evolving value of that data-element by
having duri:2003:URI1#frag1, duri:2004:URI1#frag1, etc. This is something I
actually talked about at a conference a while ago in relation to the use of
duri in scientific communication.
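
One possible syntax issue, offered as a guess rather than as the I-D's stated reason: under generic URI syntax (RFC 3986) the first "#" ends the URI proper, so in duri:2003:URI1#frag1 the fragment attaches to the outer duri rather than to the embedded URI. The sketch below shows that split and builds the dated series from the example; the function names and the example.org URI are hypothetical.

```python
def split_fragment(uri):
    """Split at the first '#', as generic URI syntax (RFC 3986) does."""
    base, _, fragment = uri.partition("#")
    return base, (fragment or None)


def dated_fragment_uris(embedded_uri, fragment, timestamps):
    """Build one duri per timestamp for the same data-element.

    Assumes the duri:<timestamp>:<embeddedURI> notation used in this message.
    """
    return [f"duri:{t}:{embedded_uri}#{fragment}" for t in timestamps]


series = dated_fragment_uris("http://example.org/dataset", "frag1", ["2003", "2004"])
for uri in series:
    base, fragment = split_fragment(uri)
    # A generic parser sees the duri (timestamp included) as the base,
    # and frag1 as a fragment of that dated resource.
    print(base, fragment)
```

Read this way, the fragment would be interpreted against a representation of the dated state, which is arguably the behaviour wanted for the evolving data-element case.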

5. It would seem fair to refer to Memento in Section 6.4. on resolution. As
you know we have an I-D now at
https://datatracker.ietf.org/doc/draft-vandesompel-memento/

6. I have a problem with Section 6.8., especially the interpretation of the
URIs used by the Internet Archive. While indeed these URIs are timestamped,
there is no way to _know_ that they are, as we need to treat URIs as opaque
and hence are not supposed to glean the dates from the URIs. Not at an
architectural level, at least. Maybe that is why you use the term "Hypertext
system" and then refer to the Internet Archive as a "closed hypertext
system" that happens to use web technology.  That interpretation is probably
OK. But under Memento, these archives will no longer be silos but rather be
fully integrated into the Web, and their URIs will be treated as any other
old URI.

7. I came across some typos here and there but I guess those can wait until
last call.


-- 
Herbert Van de Sompel
Digital Library Research & Prototyping
Los Alamos National Laboratory, Research Library
http://public.lanl.gov/herbertv/

==
Received on Tuesday, 18 January 2011 16:06:46 GMT
