
Re: RDF Update Feeds + URI time travel on HTTP-level

From: Erik Hetzner <erik.hetzner@ucop.edu>
Date: Mon, 23 Nov 2009 17:13:04 -0800
To: Linked Data community <public-lod@w3.org>
Message-ID: <P-IRC-EXBE01ntdb2HV000029cc@EX.UCOP.EDU>
At Tue, 24 Nov 2009 10:14:01 +1000,
Peter Ansell wrote:
> 2009/11/24 Erik Hetzner <erik.hetzner@ucop.edu>:
> […]
> > On the other hand, there is nothing I can see that prevents one URI
> > from representing another URI as it changes through time. This is
> > already the case with, e.g.,
> > <http://web.archive.org/web/*/http://example.org>, which represents
> > the URI <http://example.org> at all times. So this URI could, perhaps,
> > be a target for X-Accept-Datetime headers.
> 
> This is still a different URI though, and requires you to know that
> web.archive.org exists and that it has in fact trawled example.org.

I agree. I was trying to suggest that, while I agree with Mark Baker
that:

  all HTTP requests, no matter the headers, are requests upon the
  current state of the resource identified by the Request-URI, and
  therefore, a request for a representation of the state of "Resource
  X at time T" needs to be directed at the URI for "Resource X at time
  T", not "Resource X".

there could conceivably be a resource, e.g.,
<http://web.archive.org/web/*/http://example.org/>, whose
representation could vary based on HTTP headers because it represents
all versions of another resource <http://example.org/> as that other
resource varied across time.
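To make the idea concrete, here is a sketch of the kind of request a client might send to such an aggregate resource. X-Accept-Datetime is the header name floated in this thread, not a registered HTTP header, and no server is guaranteed to honor it; the code only constructs the raw request bytes, with no network I/O:

```python
# Sketch: a request against the aggregate resource, asking (via a
# hypothetical header) for the state of the archived original at a
# particular moment in time. No network I/O is performed.
target_path = "/web/*/http://example.org/"
headers = {
    "Host": "web.archive.org",
    # Hypothetical header from this thread's proposal:
    "X-Accept-Datetime": "Tue, 24 Nov 2009 00:00:00 GMT",
}
request_line = f"GET {target_path} HTTP/1.1"
raw = "\r\n".join(
    [request_line] + [f"{k}: {v}" for k, v in headers.items()]
) + "\r\n\r\n"
print(raw)
```

The server would then be free to vary its representation on that header, just as it varies on Accept or Accept-Language today.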

> The clean aspect of using headers is that you don't have to munge
> the URI or attach it to the path of another URI in order to make the
> process work.

I agree that it is nice not to have to munge URIs to get archival
content. Rewriting URIs for archived web content is a difficult,
error-prone task, and users browsing a web archive often end up with
‘live’ (unarchived) web content in embeds, etc., instead of the
archived content.

But if the tradeoff for not munging URIs is hiding the archival
nature of a resource in the HTTP headers, I don’t think it is worth it.

> > To take the canonical example, if I am viewing
> > <http://oakland.example.org/weather>, I don’t want the fact that I
> > am viewing historical weather information to be hidden in the
> > request headers.
> 
> The user-agent could help here.

Perhaps it could, but I don’t think it is a good idea to overload the
resource that represents the current weather with historical weather
data.

> Current web citation methods typically require that you put "Accessed
> on DD MM YY" next to the URI if you want to publish it. If you were
> viewing it at T1 and that wasn't the current version then your
> user-agent would need to let you know that you were not viewing the
> most up to date copy of the resource.

I would prefer to move away from current web citation methods. These
methods provide no way for an author to ensure that (as much as
possible) a reader will encounter the same text that the author read,
and they provide no way for the typical reader to find the text as it
was read by the author.

If we are enhancing user agents and requiring user interaction, why
not enhance a user agent with a feature that, given resource X at the
current time T, directs a user to a new URI which uniquely identifies
resource X at time T, a URI that can be copied & pasted as a whole
into a document. Then the author can be reasonably assured that a
reader will be viewing the same content the author viewed.
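As a sketch, such a feature could use the Wayback Machine's URL
convention, where a 14-digit UTC timestamp is embedded in the path
(the helper name here is mine, and other archives use different
conventions):

```python
from datetime import datetime, timezone

def timestamped_uri(uri: str, when: datetime) -> str:
    """Build a Wayback-style URI naming resource X at time T.
    web.archive.org embeds a YYYYMMDDhhmmss UTC timestamp in the
    path; the result is a single URI a user can copy & paste."""
    stamp = when.astimezone(timezone.utc).strftime("%Y%m%d%H%M%S")
    return f"http://web.archive.org/web/{stamp}/{uri}"

t = datetime(2009, 11, 24, 1, 13, 48, tzinfo=timezone.utc)
print(timestamped_uri("http://example.org/", t))
# → http://web.archive.org/web/20091124011348/http://example.org/
```

The point is that the <URI, time> tuple collapses into one citable
URI, rather than a URI plus an "Accessed on" note.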

> > I think that those of us in the web archiving community [1] would very
> > much appreciate a serious look by the web architecture community into
> > the problem of web archiving. The problem of representing and
> > resolving the tuple <URI, time> is a question which has not yet been
> > adequately dealt with.
> 
> It would still be nice to solve the issue in general so that we don't
> have to rely on archiving services in order to get past versions if
> you could do it by negotiating directly with the original server.

Agreed! Furthermore, it would be nice to solve the problem in such a
way that:

a) the server could provide the past version;
b) failing that, web archive A could provide the past version;
c) failing that, web archive B could provide the past version;
d) and so on.
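A client-side sketch of that fallback chain (the fetch callables and
their return values are hypothetical placeholders; the resolution
order is the point):

```python
def resolve_past_version(uri, when, sources):
    """Try the origin server first, then each web archive in turn.
    Each source is a callable returning a representation or None."""
    for fetch in sources:
        body = fetch(uri, when)
        if body is not None:
            return body
    return None

# Hypothetical sources, most authoritative first:
def origin(uri, when):      # a) the server itself
    return None             # pretend it cannot serve past versions
def archive_a(uri, when):   # b) web archive A
    return "snapshot from archive A"
def archive_b(uri, when):   # c) web archive B
    return "snapshot from archive B"

print(resolve_past_version("http://example.org/", "2009-11-24",
                           [origin, archive_a, archive_b]))
# → snapshot from archive A
```

Because the origin declines, the request falls through to archive A;
archive B is never consulted.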

best,
Erik Hetzner

;; Erik Hetzner, California Digital Library
;; gnupg key id: 1024D/01DB07E3

Received on Tuesday, 24 November 2009 01:13:48 UTC
