regularly refreshed partial LOD + Web sparql endpoint

From: Giovanni Tummarello <giovanni.tummarello@deri.org>
Date: Wed, 18 Jul 2012 18:44:25 +0200
Message-ID: <CAHHRs7hiGmfuJVBd58pLY0bVV_wWyo6QRnwxLx+PN=WEfSG3Ng@mail.gmail.com>
To: Linking Open Data <public-lod@w3.org>, Semantic Web at W3C <semantic-web@w3.org>
It might be of interest to some that at Sindice.com we have switched from
trying to index everything in SPARQL to a mixed approach: everything appears
on the frontpage in real time, but only selected websites (RDF, RDFa,
microformats, microdata, etc.) plus selected LOD datasets appear in SPARQL,
which is regularly updated (though not in real time).

This solution allows us to have a reasonable quality of service -
while fitting in our limited research resources (as Sindice.com is a
research project).

By providing this service we intend to foster experimentation by the
community, who can now be sure that their favorite dataset is loaded
(just send us a request) and can be queried, e.g. in SPARQL, next to
their favorite web of data website (just make sure it's in the list of
those indexed, or send us a request).

Some details of this mechanism (and the fact that it had us process
100M RDF docs in a day) are in the blog post linked below.

A UI that makes all of this clearer is coming in August.

http://blog.sindice.com/2012/07/18/how-we-ingested-100m-semantic-documents-in-a-day-and-were-do-they-come-from/

Thanks must go to OpenLink for the support provided in setting this
mechanism up and to the others mentioned in the blog post.
Gio
Received on Wednesday, 18 July 2012 16:45:13 UTC
