Re: [Ann] LODStats - Real-time Data Web Statistics

On 02.02.2012 12:32, Richard Cyganiak wrote:
> Congrats, this is awesome.

Thanks Richard, we are happy you like it ;-)

> So you're automatically harvesting 200+ datasets by starting with the LOD Cloud metadata we're collecting on the Data Hub (ex CKAN), leading to a total of almost 2B triples.

Exactly.

> Also fascinating is the list of 250 datasets that couldn't be automatically harvested due to SPARQL errors or errors in the RDF dumps:
> http://stats.lod2.eu/rdfdoc/?errors=1
> This is an excellent interoperability testbed and should be closely studied by anyone who's interested in the state of actual interoperability on the web of linked data (hence a CC to the Pedantic Web Group).

Yes, having an interoperability testbed and an up-to-date view of the
current state was one of the primary reasons for developing LODStats.
Some problems might, however, also be caused by incorrect CKAN metadata
or by glitches in LODStats itself - we will try to iron out as many of
them as possible over the coming weeks.

> One request: on http://stats.lod2.eu/stats it shows top 5 lists of various sorts (top vocabularies, classes, languages etc). Would it be possible to allow drill-down to see longer lists, let's say top 100 or top 1000? These lists are great, but the really interesting stuff often happens in the midfield.

Indeed, that's a great suggestion and will be implemented soon.

> I see VoID summaries for each individual dataset. Are they aggregated somewhere into a single file that I could SPARQL?

Not yet, but that's planned. For now it should be relatively easy to
crawl and concatenate the VoID files, but we will make it more
convenient ;-)
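
A minimal sketch of the concatenation step, assuming the per-dataset
VoID files have already been downloaded and are available as N-Triples
text. That serialization is an assumption (for Turtle or RDF/XML a real
RDF parser would be needed), and the function name merge_ntriples and
the example.org URIs are purely illustrative, not part of LODStats:

```python
from typing import Iterable, List


def merge_ntriples(documents: Iterable[str]) -> str:
    """Merge several N-Triples documents into one, dropping duplicate statements.

    Assumes each document is line-based N-Triples (one statement per line).
    """
    seen = set()
    merged: List[str] = []
    for doc in documents:
        for line in doc.splitlines():
            statement = line.strip()
            # Skip blank lines and N-Triples comments.
            if not statement or statement.startswith("#"):
                continue
            # Keep the first occurrence of each statement, preserving order.
            if statement not in seen:
                seen.add(statement)
                merged.append(statement)
    return "\n".join(merged) + ("\n" if merged else "")


if __name__ == "__main__":
    # Hypothetical VoID snippets for two datasets.
    void_a = '<http://example.org/d1> <http://rdfs.org/ns/void#triples> "10" .\n'
    void_b = '<http://example.org/d2> <http://rdfs.org/ns/void#triples> "20" .\n'
    print(merge_ntriples([void_a, void_b]), end="")
```

The merged text can then be loaded into any triple store and queried
with SPARQL. One caveat: if the individual files contain blank nodes,
identical labels in different files would be conflated by plain
concatenation, so they would have to be renamed per file first.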

> Also, how do I cite your work in publications? Is there a paper (or at least tech report) yet?

We submitted a paper, which you can cite:

Jan Demter, Sören Auer, Michael Martin, Jens Lehmann: LODStats – An
Extensible Framework for High-performance Dataset Analytics, submitted
to ESWC2012

http://svn.aksw.org/papers/2011/RDFStats/public.pdf

Best,

Sören

Received on Thursday, 2 February 2012 12:18:44 UTC