Re: RDF Update Feeds + URI time travel on HTTP-level from Peter Ansell on 2009-11-23 (public-lod@w3.org from November 2009)

From: Peter Ansell <ansell.peter@gmail.com>
Date: Mon, 23 Nov 2009 14:59:54 +1000
To: Mark Baker <distobj@acm.org>
Cc: Linked Data community <public-lod@w3.org>
Message-ID: <a1be7e0e0911222059w5084ca69kfaad10d51eaa3117@mail.gmail.com>
2009/11/23 Mark Baker <distobj@acm.org>:
> Hi Chris,
>
> On Fri, Nov 20, 2009 at 1:07 PM, Chris Bizer <chris@bizer.de> wrote:
>> Hi Michael, Georgi and all,
>>
>> just to complete the list of proposals, here another one from Herbert Van de
>> Sompel from the Open Archives Initiative.
>>
>> Memento: Time Travel for the Web
>> http://arxiv.org/abs/0911.1112
>>
>> The idea of Memento is to use HTTP content negotiation in the datetime
>> dimension. By using a newly introduced X-Accept-Datetime HTTP header they
>> add a temporal dimension to URIs. The result is a framework in which
>> archived resources can seamlessly be reached via the URI of their original.
>>
>> Sounds cool to me. Anybody an opinion whether this violates general Web
>> architecture somewhere?
>
> IMO, it does.  The problem is that an HTTP request with the
> Accept-Datetime header is logically targeting a different resource
> than the one identified in the Request-URI.  Accept-* headers are for
> negotiating the selection of resource *representations*, not
> resources.  Resource selection should always be handled via
> hypermedia.

I think in general it is likely to target a different representation
of the same resource, just in the time dimension rather than in the
"spatial" format dimensions that Accept headers currently negotiate
over. Arguing that a resource stays the same even though it has
non-equal binary representations in the format dimension at a
particular point in time is, IMO, no different from arguing that the
nature of the resource has not changed because of one or more
intentional "non-nature affecting" changes to one of its binary
representations through time. The Accept-Language header already lets
people select between representations that do not necessarily contain
the same information, as the translation might not be complete, or
there may be semantic ambiguity that makes it impossible to reliably
translate back and forth between the documents without some
information loss.
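
To make the comparison concrete, here is a minimal Python sketch of
selecting a representation of one resource along both the format and
the time dimension. It is only an illustration under stated
assumptions: the URI, the stored bodies, and the select() function are
hypothetical, and it models the selection logic only, not real HTTP
header parsing or the Memento proposal itself.

```python
from datetime import datetime, timezone

URI = "http://example.org/doc"  # hypothetical resource URI


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


# Hypothetical store: one URI, several representations that differ by
# media type and by the time at which they became current.
REPRESENTATIONS = {
    URI: [
        # (valid_from, media_type, body)
        (utc(2008, 1, 1), "text/html", "<p>v1</p>"),
        (utc(2008, 1, 1), "text/plain", "v1"),
        (utc(2009, 6, 1), "text/html", "<p>v2</p>"),
        (utc(2009, 6, 1), "text/plain", "v2"),
    ]
}


def select(uri, accept, accept_datetime=None):
    """Pick a representation of `uri` by media type, then by time."""
    candidates = [r for r in REPRESENTATIONS[uri] if r[1] == accept]
    if accept_datetime is not None:
        # Keep only representations that already existed at that time.
        candidates = [r for r in candidates if r[0] <= accept_datetime]
    if not candidates:
        return None  # a real server might answer 406 Not Acceptable
    # The most recent remaining representation wins.
    return max(candidates, key=lambda r: r[0])
```

In this sketch the URI never changes: asking for "text/plain" instead
of "text/html", or for the state as of late 2008 instead of now, only
changes which stored representation is returned.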

If the consensus is that the time dimension is always a special case
where the nature of a resource actually changes whenever the bits
change, then I think it would be more appropriate to use different
identifiers, such as locators, to retrieve the thing, but currently I
do not find that case very convincing given the existing
documentation of Accept possibilities.

In a non-RDF example, one might want to examine the changes in the
resolution of an image that may have been improved over time as image
resolution algorithms improve. IMO, a more recent document would be
the same image, just with more detail. The claim that the exact
dimensions and bit representation of the image have changed, but not
the resource, would currently be accepted if the file format changed,
because new Accept possibilities can be added without changing the
nature of the web resource. However, if the file format didn't change,
currently we are not sure, but it seems as though it should be treated
as a new image resource. This is a contradiction IMO because we have
already said that the bit representations can be non-identical and
still represent the same resource based on the use of Accept headers.

In a semi-serious example, if the resource is strictly different every
time something changes, there would be a never-ending cycle of updates
if two or more documents started out unlinked but wanted to link to
each other in the strictest manner possible. If semi-constant
identifiers are not allowed, every time a document was updated, the
new document would receive a new identifier, which would require an
update to the other document if its owners wanted their users to have
a link to a document that linked back to them. That update would in
turn change the other document's resource locator, which would then
require the first document's producer to update both its link and its
own resource URI to keep its users up to date. In my opinion it is a
very good thing to allow locators to stay semi-constant, as the web
architecture documentation might reasonably be expected to represent
the real web in some way, which it would not do if this example were
taken seriously.
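
The update cycle can be sketched as a small Python simulation. This is
a toy model under my own assumptions, not anything from the Memento
paper: two hypothetical documents A and B each try to link to the
other's current URI, and the function name, URIs, and round limit are
made up for illustration.

```python
def updates_until_stable(stable_ids, max_rounds=10):
    """Count link edits until A and B each link to the other's current URI.

    Returns None if the links never settle within max_rounds.
    """
    version = {"A": 1, "B": 1}
    links_to = {"A": None, "B": None}  # URI each document links to

    def uri(doc):
        if stable_ids:
            # Semi-constant identifier: the URI ignores the version.
            return f"http://example.org/{doc}"
        # Strict identifier: every edit mints a new URI.
        return f"http://example.org/{doc}/v{version[doc]}"

    updates = 0
    for _ in range(max_rounds):
        changed = False
        for doc, other in (("A", "B"), ("B", "A")):
            if links_to[doc] != uri(other):
                links_to[doc] = uri(other)  # edit the document ...
                version[doc] += 1           # ... which makes a new version
                updates += 1
                changed = True
        if not changed:
            return updates
    return None
```

With stable_ids=True the two documents settle after one edit each.
With stable_ids=False each edit invalidates the other document's link,
so the loop only ends because max_rounds cuts it off.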

It should be up to resource creators to determine when the nature of a
resource changes across time. A web architecture that requires every
single edit to have a different identifier is a large hassle and
likely won't catch on if people find that they can work fine with a
system that evolves constantly using semi-constant identifiers, rather
than through a series of mandatory time-based checkpoints.

Cheers,

Peter
Received on Monday, 23 November 2009 05:00:22 UTC