Keeping crawlers up-to-date

From: Yves Raimond <yves.raimond@gmail.com>
Date: Tue, 28 Apr 2009 14:39:45 +0100
Message-ID: <82593ac00904280639k335e8845of5ec04f44b66cc13@mail.gmail.com>
To: Linking Open Data <public-lod@w3.org>
Cc: Nicholas J Humfrey <njh@aelius.com>, Patrick Sinclair <metade@gmail.com>

I know this issue has been raised during the LOD BOF at WWW 2009, but
I don't know if any possible solutions emerged from there.

The problem we are facing is that data on BBC Programmes changes
approximately 50 000 times a day (new/updated
broadcasts/versions/programmes/segments etc.). As we'd like to keep a
set of RDF crawlers up-to-date with our information, we were wondering
how best to notify them of changes. pingthesemanticweb seems like a nice
option, but it requires the crawlers to poll it often enough to be sure
they haven't missed a change. Another solution we were thinking of would be to
stick either Talis changesets [1] or SPARQL/Update statements in a
message queue, which would then be consumed by the crawlers.
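
To make the message-queue idea a bit more concrete, here is a rough Python sketch of what it could look like. It is only an illustration under several assumptions: the ChangeSet class is a simplified in-memory stand-in (it is not the Talis changeset vocabulary or its serialisation), the standard-library queue.Queue stands in for a real message broker, and the names publish and consume as well as the example.org URIs are hypothetical.

```python
import queue
from dataclasses import dataclass
from typing import FrozenSet, Set, Tuple

# A triple is modelled as a plain (subject, predicate, object) tuple of strings.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class ChangeSet:
    """Hypothetical, simplified change record: triples to remove and to add."""
    subject: str
    removals: FrozenSet[Triple] = frozenset()
    additions: FrozenSet[Triple] = frozenset()


def publish(q: "queue.Queue[ChangeSet]", change: ChangeSet) -> None:
    """Publisher side: push one change onto the queue."""
    q.put(change)


def consume(q: "queue.Queue[ChangeSet]", store: Set[Triple]) -> int:
    """Crawler side: drain the queue, applying each change to a local copy.

    Removals are applied before additions. Returns the number of changes applied.
    """
    applied = 0
    while True:
        try:
            change = q.get_nowait()
        except queue.Empty:
            return applied
        store.difference_update(change.removals)
        store.update(change.additions)
        applied += 1


if __name__ == "__main__":
    # Hypothetical URIs, for illustration only.
    prog = "http://example.org/programmes/p1#programme"
    title = "http://purl.org/dc/elements/1.1/title"

    changes: "queue.Queue[ChangeSet]" = queue.Queue()
    local_copy: Set[Triple] = {(prog, title, "Old title")}

    publish(changes, ChangeSet(
        subject=prog,
        removals=frozenset({(prog, title, "Old title")}),
        additions=frozenset({(prog, title, "New title")}),
    ))

    print(consume(changes, local_copy), "change(s) applied")
    print(sorted(local_copy))
```

The point of the sketch is that a crawler only has to drain the queue to catch up, rather than polling often enough not to miss anything. A real setup would also need durable queues and some per-crawler delivery guarantee, which this does not attempt to show.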

Has anyone tried to tackle this problem already?


[1] http://n2.talis.com/wiki/Changeset
Received on Tuesday, 28 April 2009 13:40:28 UTC