Re: Wikipedia incremental updates

From: Joe Presbrey <presbrey@csail.mit.edu>
Date: Fri, 22 Jan 2010 13:56:43 -0500
Message-ID: <173a8c251001221056s4d1c8d03ldda91db1e962c230@mail.gmail.com>
To: Linked Data community <public-lod@w3.org>
(Perhaps I should be asking this on dbpedia-discussion, though I imagine
many of us here are wondering the same:)

Is there a plan for:
- trickling these updates out to other services, e.g. sameas.org?
- synchronizing my local DBpedia store with the live service?
  Via DSNotify? PubSubHubbub? (a subscription sketch follows below)
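
For concreteness, a minimal Python sketch of what subscribing over
PubSubHubbub could look like, assuming a hub and a changeset feed existed
for DBpedia Live (every URL below is hypothetical):

import urllib.parse
import urllib.request

HUB = "http://hub.example.org/"                 # hypothetical hub
TOPIC = "http://live.dbpedia.org/changes.atom"  # hypothetical changeset feed
CALLBACK = "http://consumer.example/callback"   # where the hub pushes entries

# A PubSubHubbub subscriber POSTs these form fields to the hub; the hub
# then verifies the callback and pushes new feed entries to it.
params = urllib.parse.urlencode({
    "hub.mode": "subscribe",
    "hub.callback": CALLBACK,
    "hub.topic": TOPIC,
    "hub.verify": "async",
}).encode()

response = urllib.request.urlopen(HUB, params)
print(response.status)  # 202 Accepted = subscription pending verification

The win over polling would be that each consumer gets changesets pushed to
it instead of all of us hammering the live service.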

--
Joe Presbrey


On Fri, Jan 22, 2010 at 3:06 AM, Michael Hausenblas
<michael.hausenblas@deri.org> wrote:
>
> Nicolas,
>
>> Does anyone have experience with that?
>>
>> Is there any other way to retrieve incremental updates in a reliable and
>> continuous way, ideally in the same format as the static dumps?
>> (MySQL replication, incremental dumps ...)
>
> I think this is a very timely and important question [1]. We did a demo
> recently [2], based on voiD and Atom, to figure out what could work, and
> as a result a group of people interested in this area has formed [3]. It
> would be great if you'd join in and share your use case ...
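>
> On the consumer side the pattern is roughly: watch an Atom feed that
> describes dataset changes, then re-fetch whatever changed. A minimal
> sketch, with a hypothetical feed URL (the demo [2] shows the real setup):
>
> import feedparser  # third-party library: pip install feedparser
>
> FEED = "http://example.org/dataset-changes.atom"  # hypothetical
>
> seen = set()
> for entry in feedparser.parse(FEED).entries:
>     if entry.id in seen:          # skip entries we already processed
>         continue
>     seen.add(entry.id)
>     print(entry.updated, entry.link)  # changed resource to re-fetch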
>
> Cheers,
>      Michael
>
> [1] http://esw.w3.org/topic/DatasetDynamics
> [2] http://code.google.com/p/dady/wiki/Demos
> [3] http://groups.google.com/group/dataset-dynamics
>
> --
> Dr. Michael Hausenblas
> LiDRC - Linked Data Research Centre
> DERI - Digital Enterprise Research Institute
> NUIG - National University of Ireland, Galway
> Ireland, Europe
> Tel. +353 91 495730
> http://linkeddata.deri.ie/
> http://sw-app.org/about.html
>
>
>
>> From: Nicolas Torzec <torzecn@yahoo-inc.com>
>> Date: Thu, 21 Jan 2010 19:35:24 -0800
>> To: Linked Data community <public-lod@w3.org>
>> Subject: Wikipedia incremental updates
>> Resent-From: Linked Data community <public-lod@w3.org>
>> Resent-Date: Fri, 22 Jan 2010 03:45:06 +0000
>>
>> Hi there,
>>
>> I am using open data sets such as Wikipedia for data mining and knowledge
>> acquisition; the entities and relations extracted are exposed and consumed
>> via indices.
>>
>> I already retrieve and process the new Wikipedia static dumps whenever
>> they become available, but I would like to go beyond this and use
>> incremental/live updates to stay more closely in sync with Wikipedia.
>>
>> I know that I could use some web services and IRC channels to track
>> changes in Wikipedia. But besides the fact that the web service is
>> designed more for tracking individual changes than for monitoring
>> Wikipedia continuously, both methods still require parsing the update
>> messages (to extract the URLs of the new/modified/deleted pages) and
>> then retrieving the actual pages.
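>>
>> To make that loop concrete, here is a rough Python sketch of it against
>> the MediaWiki API; I use Special:Export for the fetch since it returns
>> the same XML schema as the static dumps (error handling, continuation,
>> and rate limiting omitted):
>>
>> import json
>> import urllib.parse
>> import urllib.request
>>
>> API = "http://en.wikipedia.org/w/api.php"
>>
>> def recent_titles(limit=50):
>>     """Titles of recently changed pages, via list=recentchanges."""
>>     qs = urllib.parse.urlencode({
>>         "action": "query", "list": "recentchanges",
>>         "rcprop": "title|timestamp", "rclimit": limit,
>>         "format": "json",
>>     })
>>     resp = urllib.request.urlopen(API + "?" + qs)
>>     data = json.loads(resp.read().decode("utf-8"))
>>     return [rc["title"] for rc in data["query"]["recentchanges"]]
>>
>> def export_page(title):
>>     """One page as dump-format XML, via Special:Export."""
>>     url = ("http://en.wikipedia.org/wiki/Special:Export/"
>>            + urllib.parse.quote(title))
>>     return urllib.request.urlopen(url).read()
>>
>> for title in recent_titles(5):
>>     xml = export_page(title)
>>     # ... hand xml to the same parser used for the static dumps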
>>
>> Does anyone have experience with that?
>>
>> Is there any other way to retrieve incremental updates in a reliable and
>> continuous way, ideally in the same format as the static dumps?
>> (MySQL replication, incremental dumps ...)
>>
>> I have also read that DBpedia is trying to stay more closely in sync with
>> Wikipedia content. How do they plan to keep up with Wikipedia updates?
>>
>>
>> Thanks for your help.
>>
>> Best,
>> Nicolas Torzec.
>>
>
>
>
Received on Monday, 25 January 2010 02:27:19 UTC
