Re: The truth about SPARQL Endpoint availability

From: Hugh Glaser <hg@ecs.soton.ac.uk>
Date: Sat, 5 Mar 2011 15:14:33 +0000
To: Andrea Splendiani <andrea.splendiani@bbsrc.ac.uk>
CC: Pierre-Yves Vandenbussche <py.vandenbussche@gmail.com>, "<public-lod@w3.org>" <public-lod@w3.org>, SW-forum <semantic-web@w3.org>, "<semanticweb@yahoogroups.com>" <semanticweb@yahoogroups.com>
Message-ID: <EMEW3|ce88fae0362d9173423547ba0af3418dn24FF102hg|ecs.soton.ac.uk|3929CA75-524B-41FC-AA9C-7CD0FF38BF0A@ecs.soton.ac.uk>
Hi,
On 5 Mar 2011, at 14:22, Andrea Splendiani wrote:

> Hi,
> 
> I think it depends on the store. I've tried some (from the endpoint list): some return an answer pretty quickly, some don't, and some don't support count.
> However, one could collect this information only for the stores that answer the count query; there is no need to try all the time.
I am happy for a store implementor or owner to disagree, but I find it very unlikely that the owner of a store with a decent chunk of data (> 1M triples, say) would be happy for someone to keep issuing such a query, even if they did decide to give enough resources to execute it.
I would quickly blacklist such a site.
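
A minimal sketch of what "not trying all the time" could look like on the client side: remember the last count per endpoint and only re-issue the query once a minimum interval has passed. The class name `CountCache`, the `fetch_count` callback and the interval are all hypothetical; nothing here comes from an existing monitoring tool.

```python
import time


class CountCache:
    """Remember the last triple count per endpoint and avoid re-querying too soon."""

    def __init__(self, min_interval_seconds, clock=time.monotonic):
        self.min_interval = min_interval_seconds
        self.clock = clock      # injectable so the logic can be exercised without waiting
        self._entries = {}      # endpoint URL -> (timestamp, count)

    def get(self, endpoint_url, fetch_count):
        """Return a cached count if it is fresh enough, otherwise call fetch_count once."""
        now = self.clock()
        entry = self._entries.get(endpoint_url)
        if entry is not None and now - entry[0] < self.min_interval:
            return entry[1]
        # fetch_count is a hypothetical callable that actually queries the store.
        count = fetch_count(endpoint_url)
        self._entries[endpoint_url] = (now, count)
        return count
```

With a one-day interval, an hourly monitor would hit each store with the expensive query at most once a day, which may still be more than some owners tolerate.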
> 
> voiD:
> is this a good query?
> select * where {?s <http://rdfs.org/ns/void#numberOfTriples> ?o } 

I'm no SPARQL or voiD guru, but I think you need a bit more wrapping in the scovo stuff, so more like:

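PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX void:  <http://rdfs.org/ns/void#>
PREFIX scovo: <http://purl.org/NET/scovo#>

SELECT DISTINCT ?endpoint ?uri ?triples WHERE
           { ?ds a void:Dataset .
             ?ds void:sparqlEndpoint ?uri .
             ?ds rdfs:label ?endpoint .
             ?ds void:statItem [ scovo:dimension void:numberOfTriples ; rdf:value ?triples ] .
           }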

Try it at
http://kwijibo.talis.com/voiD/
or
http://void.rkbexplorer.com/
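
For anyone who would rather script it than use the web forms, here is a rough sketch of sending that query over the SPARQL protocol and flattening the JSON results. It uses only the Python standard library. The query text is the one above with its prefixes written out; the endpoint URL you pass in is your own choice (the `http://example.org/sparql` in the usage note is a placeholder, not the address of either store above), and whether a given store honours the JSON `Accept` header is an assumption.

```python
import json
import urllib.parse
import urllib.request

VOID_QUERY = """
PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
PREFIX void:  <http://rdfs.org/ns/void#>
PREFIX scovo: <http://purl.org/NET/scovo#>
SELECT DISTINCT ?endpoint ?uri ?triples WHERE {
  ?ds a void:Dataset .
  ?ds void:sparqlEndpoint ?uri .
  ?ds rdfs:label ?endpoint .
  ?ds void:statItem [ scovo:dimension void:numberOfTriples ; rdf:value ?triples ] .
}
"""


def build_request(endpoint_url, query):
    """Build a SPARQL-protocol GET request that asks for JSON results."""
    url = endpoint_url + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )


def parse_bindings(results_json):
    """Flatten a SPARQL JSON result document into a list of {variable: value} dicts."""
    doc = json.loads(results_json)
    return [
        {var: cell["value"] for var, cell in row.items()}
        for row in doc["results"]["bindings"]
    ]


def run_query(endpoint_url, query, timeout=30):
    """Send the query and return the flattened rows (performs a network call)."""
    request = build_request(endpoint_url, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Usage would be along the lines of `rows = run_query("http://example.org/sparql", VOID_QUERY)`, with each row carrying the `endpoint`, `uri` and `triples` values as strings.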

I guess Pierre-Yves might like to enhance his page by querying a voiD store to also give basic stats.
Or someone might like to do a store reporter that uses (a) voiD endpoint(s) plus Pierre-Yves's data (he has a SPARQL endpoint), to do so.
And maybe the CKAN endpoint would have extra useful data as well.
A real Semantic Web application that queried more than one SPARQL endpoint - now that would be a novelty!
Fancy the challenge? It is the weekend! :-)
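
As a starting point for such a store reporter, the combining step might look like the sketch below: take rows from a voiD store (endpoint URI plus triple count) and rows from an availability source, and join them on the endpoint URI. The field names `uri`, `triples` and `available` are assumptions for illustration; the actual shape of Pierre-Yves's data is not described in this thread.

```python
def merge_reports(void_rows, availability_rows):
    """Join voiD statistics with availability records on the endpoint URI.

    void_rows:         dicts with "uri" and "triples" (hypothetical field names)
    availability_rows: dicts with "uri" and "available" (hypothetical field names)
    """
    triples_by_uri = {row["uri"]: int(row["triples"]) for row in void_rows}
    report = []
    for row in availability_rows:
        uri = row["uri"]
        report.append(
            {
                "uri": uri,
                "available": row["available"],
                # None when the voiD store has no description for this endpoint.
                "triples": triples_by_uri.get(uri),
            }
        )
    return report
```

Endpoints with no voiD description simply come out with `triples` set to `None`, so the report never needs to fall back on a count query against the store itself.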

ciao
Hugh

> 
> it doesn't seem viable if so.
> 
> ciao,
> Andrea
> 
> 
> On 5 Mar 2011, at 13:49, Hugh Glaser wrote:
> 
>> Nice idea, but... :-)
>> 
>> SELECT (count(*) as ?c) WHERE {?s ?p ?o}
>> 
>> is a pretty anti-social thing to do to a store.
>> At best, a store of any size will spend a while thinking, and then quite rightly decide they have burnt enough resources, and return some sort of error.
>> 
>> For a properly maintained site, of course, the voiD description will give lots of similar information.
>> Best
>> Hugh
>> 
>> On 5 Mar 2011, at 13:06, Andrea Splendiani wrote:
>> 
>>> Hi, very nice!
>>> I have a small suggestion:
>>> 
>>> why don't you ask "count(*) where {?s ?p ?o}" to the endpoint?
>>> Or ask for the number of graphs?
>>> Both pieces of information, the number of triples and the number of graphs, if logged and compared over time, can give a practical view of the liveliness of the content of the endpoint.
>>> 
>>> best,
>>> Andrea Splendiani
>>> 
>>> 
>>> On 28 Feb 2011, at 18:55, Pierre-Yves Vandenbussche wrote:
>>> 
>>>> Hello all,
>>>> 
>>>> Have you already encountered problems with SPARQL endpoint accessibility?
>>>> Do you feel frustrated that they are never available when you need them?
>>>> Do you develop an application using these services but wonder whether it is reliable?
>>>> 
>>>> Here is a tool [1] that shows the availability of public SPARQL endpoints and lets you monitor them over the last hours/days.
>>>> Stay informed of status changes for a particular endpoint (or all of them) through RSS feeds.
>>>> All availability information generated by this tool is accessible through a SPARQL endpoint.
>>>> 
>>>> This tool fetches the list of public SPARQL endpoints from CKAN [2] open data. It then tests each endpoint on this list for availability every hour.
>>>> 
>>>> [1] http://labs.mondeca.com/sparqlEndpointsStatus/index.html
>>>> [2] http://ckan.net/
>>>> 
>>>> Pierre-Yves Vandenbussche.
>>> 
>>> Andrea Splendiani
>>> Senior Bioinformatics Scientist
>>> Centre for Mathematical and Computational Biology
>>> +44(0)1582 763133 ext 2004
>>> andrea.splendiani@bbsrc.ac.uk
>>> 
>>> 
>>> 
>> 
>> -- 
>> Hugh Glaser,  
>>             Intelligence, Agents, Multimedia
>>             School of Electronics and Computer Science,
>>             University of Southampton,
>>             Southampton SO17 1BJ
>> Work: +44 23 8059 3670, Fax: +44 23 8059 3045
>> Mobile: +44 78 9422 3822, Home: +44 23 8061 5652
>> http://www.ecs.soton.ac.uk/~hg/
>> 
>> 
> 

Received on Saturday, 5 March 2011 15:16:43 UTC