Re: RDF Update Feeds from Michael Hausenblas on 2009-11-21 (public-lod@w3.org from November 2009)

From: Michael Hausenblas <michael.hausenblas@deri.org>
Date: Sat, 21 Nov 2009 11:19:18 +0000
To: Hugh Glaser <hg@ecs.soton.ac.uk>, Georgi Kobilarov <georgi.kobilarov@gmx.de>
CC: Linked Data community <public-lod@w3.org>
Message-ID: <C72D7D36.A5E6%michael.hausenblas@deri.org>

Georgi, Hugh,

>> Could be very simple by expressing: "Pull our update-stream once per
>> seconds/minute/hour in order to be *enough* up-to-date".

Ah, Georgi, I see. You seem to emphasise the quantitative side, whereas I
just want to flag what kind of source it is. I agree that "Pull our
update-stream once per seconds/minute/hour in order to be *enough*
up-to-date" should be available; however, I think the qualitative
information (regular vs. irregular updates), as opposed to how frequent the
updates are, should be made available as well. My main use case comes from
the LOD application-writing area. I have quite often written code that
essentially does the same thing: based on the type of data source, it either
gets a live copy of the data or uses data that is already available locally.
Now, if dataset publishers declared the characteristics of their datasets in
terms of dynamics, one could write such a LOD cache quite easily, I guess,
abstracting the necessary steps and hence offering a reusable solution. I'll
follow up on this one soon via a blog post with a concrete example.
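
To make the cache idea above a bit more tangible, here is a minimal sketch
(not the promised demo). Everything named in it is an assumption made for
illustration: the `Dynamics` labels, the `DatasetCache` class and the
`fetch` callable are hypothetical and not part of voiD or of any proposal.
A "regular" source would really need its schedule checked; the sketch
treats it like a static one to stay short.

```python
from enum import Enum
from typing import Callable, Dict


class Dynamics(Enum):
    """Hypothetical qualitative labels a publisher might declare."""
    STATIC = "static"        # never changes after publication
    REGULAR = "regular"      # changes on a known schedule
    IRREGULAR = "irregular"  # changes unpredictably


class DatasetCache:
    """Serve a local copy or a live copy depending on declared dynamics."""

    def __init__(self, fetch: Callable[[str], str]) -> None:
        self._fetch = fetch               # assumed: retrieves a live copy by URI
        self._local: Dict[str, str] = {}  # URI -> locally stored data

    def get(self, uri: str, dynamics: Dynamics) -> str:
        # Irregular sources are always fetched live. Everything else is
        # served from the local copy once one exists (simplification:
        # REGULAR sources are not re-fetched on their schedule here).
        if dynamics is Dynamics.IRREGULAR or uri not in self._local:
            self._local[uri] = self._fetch(uri)
        return self._local[uri]
```

With a `fetch` that dereferences the URI, `cache.get(uri, Dynamics.STATIC)`
hits the network once and is answered locally afterwards, while
`Dynamics.IRREGULAR` triggers a fresh retrieval on every call.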

My main question would be: what do we gain if we explicitly represent these
characteristics, compared to what HTTP provides in terms of caching [1]? One
might argue that the 'built-in' features are too fine-grained and that there
is a need for a data-source-level solution.
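
For comparison, this is roughly what the per-resource HTTP mechanism gives a
client. It is a simplified sketch only: it looks at the `max-age` directive
of a `Cache-Control` header and ignores `Expires`, `no-cache`, validators
and the rest of [1]. The function names are made up for this example.

```python
import re
from typing import Optional


def max_age_seconds(cache_control: str) -> Optional[int]:
    """Return the max-age value of a Cache-Control header, if present."""
    match = re.search(r"(?:^|,)\s*max-age\s*=\s*(\d+)",
                      cache_control, re.IGNORECASE)
    return int(match.group(1)) if match else None


def is_fresh(cache_control: str, age_seconds: int) -> bool:
    """A stored response counts as fresh while its age is below max-age."""
    max_age = max_age_seconds(cache_control)
    return max_age is not None and age_seconds < max_age
```

The decision is made for each response separately, which is the
fine-grained behaviour referred to above; nothing here says anything about
the dataset as a whole.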

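> We currently put things like
> <changefreq>monthly</changefreq>
> <changefreq>daily</changefreq>
> <changefreq>never</changefreq>
> in our semantic sitemaps, and these suggestions seem very similar.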
> Eg
> http://dotac.rkbexplorer.com/sitemap.xml
> (And I think these frequencies may correspond to "normal" sitemaps.)
> So a naïve approach, if you want RDF, would be to use something very similar
> (and simple).
> Of course I am probably known for my naivety, which is often misplaced.

Hugh, of course you're right (as often ;). Technically, this sort of
information ('changefreq') is available via sitemaps. Essentially, one could
lift this to RDF in a straightforward way, if desired. If you look closely
at what I propose, however, you'll see that I aim at a sort of qualitative
description that could drive my LOD cache (along with the other information
I already have from the void:Dataset).
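
Just to show how simple the lifting could be, here is a sketch that reads
'changefreq' from a plain sitemap and emits one N-Triples line per entry.
The predicate URI is a placeholder invented for this example (no such
vocabulary term is being proposed), and the code assumes ordinary sitemap
`<url>` entries rather than the semantic sitemap extension elements.

```python
import xml.etree.ElementTree as ET
from typing import List

SITEMAP_NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
# Hypothetical predicate, used only for illustration.
CHANGE_FREQ = "http://example.org/dyn#changeFrequency"


def changefreq_triples(sitemap_xml: str) -> List[str]:
    """Turn each <url> entry that has a <changefreq> into an N-Triples line."""
    root = ET.fromstring(sitemap_xml)
    triples = []
    for url in root.iter(SITEMAP_NS + "url"):
        loc = url.findtext(SITEMAP_NS + "loc")
        freq = url.findtext(SITEMAP_NS + "changefreq")
        if loc and freq:
            # changefreq values are simple tokens, so no literal escaping
            # is attempted in this sketch.
            triples.append('<%s> <%s> "%s" .'
                           % (loc.strip(), CHANGE_FREQ, freq.strip()))
    return triples
```

Entries without a 'changefreq' are skipped, so the output only states what
the publisher actually declared.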

Now, before I continue to argue here on a purely theoretical level, lemme
implement a demo and come back once I have something to discuss ;)


Cheers,
      Michael

[1] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html

-- 
Dr. Michael Hausenblas
LiDRC - Linked Data Research Centre
DERI - Digital Enterprise Research Institute
NUIG - National University of Ireland, Galway
Ireland, Europe
Tel. +353 91 495730
http://linkeddata.deri.ie/
http://sw-app.org/about.html



> From: Hugh Glaser <hg@ecs.soton.ac.uk>
> Date: Fri, 20 Nov 2009 18:29:17 +0000
> To: Georgi Kobilarov <georgi.kobilarov@gmx.de>, Michael Hausenblas
> <michael.hausenblas@deri.org>
> Cc: Linked Data community <public-lod@w3.org>
> Subject: Re: RDF Update Feeds
> 
> Sorry if I have missed something, but...
> We currently put things like
> <changefreq>monthly</changefreq>
> <changefreq>daily</changefreq>
> <changefreq>never</changefreq>
> in our semantic sitemaps, and these suggestions seem very similar.
> Eg
> http://dotac.rkbexplorer.com/sitemap.xml
> (And I think these frequencies may correspond to "normal" sitemaps.)
> So a naïve approach, if you want RDF, would be to use something very similar
> (and simple).
> Of course I am probably known for my naivety, which is often misplaced.
> Best
> Hugh
> 
> On 20/11/2009 17:47, "Georgi Kobilarov" <georgi.kobilarov@gmx.de> wrote:
> 
>> Hi Michael,
>> 
>> nice write-up on the wiki! But I think the vocabulary you're proposing is
>> too generally descriptive. Dataset publishers, once offering update
>> feeds, should not only say whether their datasets are "dynamic", but
>> also how dynamic they are.
>> 
>> Could be very simple by expressing: "Pull our update-stream once per
>> seconds/minute/hour in order to be *enough* up-to-date".
>> 
>> Makes sense?
>> 
>> Cheers,
>> Georgi 
>> 
>> --
>> Georgi Kobilarov
>> www.georgikobilarov.com
>> 
>>> -----Original Message-----
>>> From: Michael Hausenblas [mailto:michael.hausenblas@deri.org]
>>> Sent: Friday, November 20, 2009 4:01 PM
>>> To: Georgi Kobilarov
>>> Cc: Linked Data community
>>> Subject: Re: RDF Update Feeds
>>> 
>>> 
>>> Georgi, All,
>>> 
>>> I like the discussion, and as it seems to be a recurrent pattern as pointed
>>> out by Yves (which might be a sign that we need to invest some more time
>>> into it) I've tried to sum up a bit and started a straw-man proposal for a
>>> more coarse-grained solution [1].
>>> 
>>> Looking forward to hearing what you think ...
>>> 
>>> Cheers,
>>>       Michael
>>> 
>>> [1] http://esw.w3.org/topic/DatasetDynamics
>>> 
>>> --
>>> Dr. Michael Hausenblas
>>> LiDRC - Linked Data Research Centre
>>> DERI - Digital Enterprise Research Institute
>>> NUIG - National University of Ireland, Galway
>>> Ireland, Europe
>>> Tel. +353 91 495730
>>> http://linkeddata.deri.ie/
>>> http://sw-app.org/about.html
>>> 
>>> 
>>> 
>>>> From: Georgi Kobilarov <georgi.kobilarov@gmx.de>
>>>> Date: Tue, 17 Nov 2009 16:45:46 +0100
>>>> To: Linked Data community <public-lod@w3.org>
>>>> Subject: RDF Update Feeds
>>>> Resent-From: Linked Data community <public-lod@w3.org>
>>>> Resent-Date: Tue, 17 Nov 2009 15:46:30 +0000
>>>> 
>>>> Hi all,
>>>> 
>>>> I'd like to start a discussion about a topic that I think is getting
>>>> increasingly important: RDF update feeds.
>>>> 
>>>> The linked data project is starting to move away from releases of large data
>>>> dumps towards incremental updates. But how can services consuming rdf data
>>>> from linked data sources get notified about changes? Is anyone aware of
>>>> activities to standardize such rdf update feeds, or at least aware of
>>>> projects already providing any kind of update feed at all? And related to
>>>> that: How do we deal with RDF diffs?
>>>> 
>>>> Cheers,
>>>> Georgi
>>>> 
>>>> --
>>>> Georgi Kobilarov
>>>> www.georgikobilarov.com
>>>> 
>>>> 
>>>> 
>> 
>> 
>
Received on Saturday, 21 November 2009 11:19:55 UTC