
Re: [Ann] LODStats - Real-time Data Web Statistics

From: Sören Auer <auer@informatik.uni-leipzig.de>
Date: Tue, 21 Feb 2012 15:51:35 +0100
Message-ID: <4F43AF77.5070607@informatik.uni-leipzig.de>
To: Rinke Hoekstra <hoekstra@few.vu.nl>
CC: "public-lod@w3.org" <public-lod@w3.org>, "pedantic-web@googlegroups.com" <pedantic-web@googlegroups.com>

On 21.02.2012 15:38, Rinke Hoekstra wrote:
> However... is it me, or isn't the 'almost 2B triples' a very
> disappointing number? If you go through all datasets advertised on the
> Data Hub, the advertised number of triples is over 40B ! This means
> that only one out of 20 triples in the linked 'open' data cloud is
> publicly accessible.

It certainly is, and this is one of the reasons we developed this tool: to
get a better picture of the LOD cloud. Of course, this difference is
partially caused by invalid links in CKAN and by some issues we still have
with handling very large datasets, but real users might run into these
issues as well.

> Another thing... it seems as if LODStats is merely checking whether a
> SPARQL endpoint is 'up' and whether the endpoint actually contains the
> data that has been advertised on the Data Hub. For instance, my very
> own bubble is listed without problems, but I know for a fact that the
> triple store no longer contains the data (sorry!). Do you have any
> thoughts/ideas on how to detect such problems?

We currently don't delete our stats when an endpoint is unavailable on a
single check, but try again later. Of course, after a certain number of
re-checks and timeouts the stats should be invalidated. Can you point
me to your endpoint? We will have a look at what the problem is there.
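
The check-back policy described above could be sketched roughly as follows. This is only an illustration, not the LODStats code: the names `EndpointStats`, `record_check` and `MAX_FAILED_CHECKS` are hypothetical, and the threshold of 3 is an assumption, since no concrete number is given here.

```python
from dataclasses import dataclass

# Hypothetical threshold; the message only says "a certain number".
MAX_FAILED_CHECKS = 3


@dataclass
class EndpointStats:
    """Stored statistics for one endpoint (illustrative fields only)."""
    triples: int
    failed_checks: int = 0   # consecutive failed availability checks
    valid: bool = True       # False once the stats are considered stale


def record_check(stats: EndpointStats, reachable: bool) -> EndpointStats:
    """Update the stored stats after one availability check."""
    if reachable:
        # A successful check resets the failure counter.
        stats.failed_checks = 0
    else:
        # A single failure keeps the stats; only repeated failures
        # (timeouts, endpoint down) invalidate them.
        stats.failed_checks += 1
        if stats.failed_checks >= MAX_FAILED_CHECKS:
            stats.valid = False
    return stats
```

In this sketch a single outage leaves the stats untouched, and only consecutive failures invalidate them. Stats that have been invalidated stay invalid until they are recomputed, which is outside the scope of the example.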

Best,

Sören
Received on Tuesday, 21 February 2012 14:51:57 UTC
