Re: RDF Update Feeds + URI time travel on HTTP-level from Herbert Van de Sompel on 2009-11-22 (public-lod@w3.org from November 2009)

From: Herbert Van de Sompel <hvdsomp@gmail.com>
Date: Sun, 22 Nov 2009 10:13:41 -0700
To: Linked Data community <public-lod@w3.org>
Message-Id: <768EADFA-A56E-4C08-A92A-ACB49CCE3D1D@gmail.com>
[tried to send this before but somehow did not get through to list]

hi all,

(thanks Chris, Richard, Danny)

In light of the current discussion, I would like to provide some  
clarifications regarding "Memento: Time Travel for the Web", i.e. the  
idea of introducing HTTP content negotiation in the datetime dimension:

(*) Some extra pointers:

- For those who prefer browsing slides over reading a paper, there is http://www.slideshare.net/hvdsomp/memento-time-travel-for-the-web

- Around mid next week, a video recording of a presentation I gave on  
Memento should be available at http://www.oclc.org/research/dss/default.htm

- The Memento site is at http://www.mementoweb.org. Of special  
interest may be the proposed HTTP interactions for (a) web servers  
with internal archival capabilities such as content management  
systems, version control systems, etc. (http://www.mementoweb.org/guide/http/local/)
and (b) web servers without internal archival capabilities (http://www.mementoweb.org/guide/http/remote/).

(*) The overall motivation for the work is the integration of archived  
resources into regular web navigation by making them available via  
their original URIs. The archived resources we have focused on in our  
experiments so far are those kept by:

(a) Web Archives such as the Internet Archive, WebCite, archive-it.org  
and

(b) Content Management Systems such as wikis, CVS, ...

The reason I pinged Chris Bizer about our work is that we thought that  
our proposed approach could also be of interest in the LoD  
environment.  Specifically, the ability to get to prior descriptions  
of LoD resources by doing datetime content negotiation on their URI  
seemed appealing; e.g. what was the DBpedia description for the City  
of Paris on March 20, 2008? This ability would, for example, allow  
analysis of (the evolution of) data over time. The requirement that  
is currently being discussed in this thread (which I interpret to be  
about approaches to selectively get updates for a certain LoD  
database) is not one I had considered using Memento for, thinking this  
was more in the realm of feed technologies such as Atom (as suggested  
by Ed Summers), or the pre-REST OAI-PMH (http://www.openarchives.org/OAI/openarchivesprotocol.html).
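
As a minimal sketch of what the Paris question could look like on the client side (this is not taken from the Memento guide), the Python snippet below only builds the request header. The header name X-Accept-Datetime, the HTTP-date format of its value, the helper name datetime_headers and the URI http://dbpedia.org/resource/Paris are assumptions made for illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Assumption: the experimental header is spelled "X-Accept-Datetime".
DATETIME_HEADER = "X-Accept-Datetime"

# Assumption: this is the URI whose past description we want.
PARIS_URI = "http://dbpedia.org/resource/Paris"


def datetime_headers(when):
    """Return HTTP headers asking for a resource as it was at `when`.

    `when` must be timezone-aware; it is converted to UTC and written
    as an HTTP-date such as "Thu, 20 Mar 2008 00:00:00 GMT".
    """
    if when.tzinfo is None:
        raise ValueError("when must be a timezone-aware datetime")
    http_date = format_datetime(when.astimezone(timezone.utc), usegmt=True)
    return {DATETIME_HEADER: http_date}


# Headers for "the description of Paris on March 20, 2008".
paris_headers = datetime_headers(datetime(2008, 3, 20, tzinfo=timezone.utc))
print(PARIS_URI, paris_headers)
```

These headers would be sent with an ordinary GET on the original URI; a server that does not know the header would presumably just return the current description.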

(*) Regarding some issues that were brought up in the discussion so far:

- We use an X header because that seems to be best practice when doing  
experimental work. We would very much like to eventually migrate to a  
real header, e.g. Accept-Datetime.

- We are definitely considering and interested in some way to  
formalize our proposal in a specification document. We felt that the  
I-D/RFC path would have been the appropriate one, but are obviously open  
to other approaches.

- As suggested by Richard, there is a bootstrapping problem, as there  
is with many new paradigms that are introduced. I trust LoD developers  
fully understand this problem. Actually, the problem is not only at  
the browser level but also at the server level. We are currently  
working on a Firefox plug-in that, when ready, will be available  
through the regular channels. And we have successfully (and  
experimentally) modified the Mozilla code itself to be able to  
demonstrate the approach. We are very interested in getting support in  
other browsers, natively or via plug-ins. We also have some tools  
available to help with initial deployment (http://www.mementoweb.org/tools/).
One is a plug-in for the MediaWiki platform; when installed, the  
wiki natively supports datetime content negotiation and redirects a  
client to the history page that was active at the datetime requested  
in the X-Accept-Datetime header. We just started a Google group for developers  
interested in making Memento happen for their web servers, content  
management systems, etc. (http://groups.google.com/group/memento-dev/).
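
To make "the history page that was active at the datetime requested" concrete, here is a minimal sketch of the selection step on the server side. It is not the code of the plug-in mentioned above: the function name select_version, the revision list and the wiki.example.org URIs are hypothetical. It assumes the server holds its revisions sorted by timestamp and picks the latest one that is not newer than the requested datetime.

```python
from bisect import bisect_right
from datetime import datetime, timezone


def select_version(revisions, requested):
    """Return the URI of the revision that was active at `requested`.

    `revisions` is a list of (timestamp, uri) pairs sorted by ascending
    timestamp. Returns None if `requested` predates the first revision.
    """
    timestamps = [timestamp for timestamp, _ in revisions]
    # Number of revisions created at or before the requested datetime.
    index = bisect_right(timestamps, requested)
    if index == 0:
        return None
    return revisions[index - 1][1]


# Hypothetical revision history of one wiki page.
revisions = [
    (datetime(2007, 6, 1, tzinfo=timezone.utc),
     "http://wiki.example.org/index.php?title=Paris&oldid=101"),
    (datetime(2008, 1, 15, tzinfo=timezone.utc),
     "http://wiki.example.org/index.php?title=Paris&oldid=102"),
    (datetime(2009, 2, 10, tzinfo=timezone.utc),
     "http://wiki.example.org/index.php?title=Paris&oldid=103"),
]

active = select_version(revisions, datetime(2008, 3, 20, tzinfo=timezone.utc))
print(active)
```

A server following the approach described above would then redirect the client to the selected URI; what to do when the requested datetime predates all revisions is left open in this sketch.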

(*) Note that the proposed solution also leverages the OAI-ORE  
specification (fully compliant with LoD best practice) as a mechanism  
to support discovery of archived resources.

I hope this helps to get a better understanding of what Memento is  
about, and what its current status is. Let me end by stating that we  
would very much like to get these ideas broadly adopted. And we  
understand we will need a lot of help to make that happen.

Cheers

Herbert



On Nov 22, 2009, at 2:39 AM, Danny Ayers wrote:

> 2009/11/22 Richard Cyganiak <richard@cyganiak.de>:
>> On 20 Nov 2009, at 19:07, Chris Bizer wrote:
>
> [snips]
>
>> From a web architecture POV it seems pretty solid to me. Doing stuff via
>> headers is considered bad if you could just as well do it via links and
>> additional URIs, but you can argue that the time dimension is such a
>> universal thing that a header-based solution is warranted.
>
> Sounds good to me too, but x-headers are a jump, I think perhaps it's
> a question worthy of throwing at the W3C TAG - pretty sure they've
> looked at similar stuff in the past, but things are changing fast...
>
> From what I can gather, proper diffs over time are hard (long before
> you get to them logics). But Web-like diffs don't have to be - can't
> be any less reliable than my online credit card statement. Bit
> worrying there are so many different approaches available, sounds like
> there could be a lot of coding time wasted.
>
> But then again, might well be one for evolution - and in the virtual
> world trying stuff out is usually worth it.
>
>> The main drawback IMO is that existing clients, such as all web browsers,
>> will be unable to access the archived versions, because they don't know
>> about the header. If you are archiving web pages or RDF documents, then you
>> could add links that lead clients to the archived versions, but that won't
>> work for images, PDFs and so forth.
>
> Hmm. For one, browsers are in flux, for two then you probably wouldn't
> expect that kind of agent to give you anything but the latest.
> If I need last year's version, I follow my nose through URIs (as in svn
> etc) - that kind of thing has to be a fallback, imho.
>
>> In summary, I think it's pretty cool.
>
> Cool idea, for sure. It is something strong...ok, temporal stuff
> should be available down at quite a low level, especially given that
> things like xmpp will be bouncing around - but I reckon Richard's
> right in suggesting the plain old URI thing will currently serve most
> purposes.
>
> Cheers,
> Danny.
>
> -- 
> http://danny.ayers.name

==
Herbert Van de Sompel
Digital Library Research & Prototyping
Los Alamos National Laboratory, Research Library
http://public.lanl.gov/herbertv/
tel. +1 505 667 1267
Received on Sunday, 22 November 2009 21:41:11 UTC