
Re: The truth about SPARQL Endpoint availability

From: Bill Roberts <bill@swirrl.com>
Date: Sat, 5 Mar 2011 17:58:13 +0000
Message-Id: <79210CD9-7F23-461C-9B3D-786BB20C7276@swirrl.com>
To: Hugh Glaser <hg@ecs.soton.ac.uk>, Linked Data community <public-lod@w3.org>
Thanks Hugh - as someone running a couple of SPARQL endpoints, I'd certainly prefer that people not run a global count too often (or at all). It is indeed something that makes typical SPARQL implementations work very hard.

But it's a good reminder that we should provide an alternative, and I'll look into providing triple counts in voiD.
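
As a rough sketch of what a client could do instead of a global count: ask a voiD store for the published triple counts and read them from the result. This assumes the scovo-style statistics used in Hugh's query quoted below, a store that returns the standard SPARQL JSON results format, and a made-up store address (http://example.org/void/sparql); the function names and the sample response are illustrative only, not real data.

```python
import json
from urllib.parse import urlencode

# Same shape as the voiD query quoted in this thread, with prefixes spelled out.
VOID_QUERY = """\
PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX void:  <http://rdfs.org/ns/void#>
PREFIX scovo: <http://purl.org/NET/scovo#>
SELECT DISTINCT ?uri ?triples WHERE {
  ?ds a void:Dataset ;
      void:sparqlEndpoint ?uri ;
      void:statItem [ scovo:dimension void:numberOfTriples ; rdf:value ?triples ] .
}
"""


def build_request_url(void_store, query=VOID_QUERY):
    """Return a GET URL that passes the query in the SPARQL protocol's 'query' parameter."""
    return void_store + "?" + urlencode({"query": query})


def triple_counts(results_json):
    """Map each endpoint URI to its published triple count.

    Expects a SPARQL JSON results document (head / results / bindings).
    """
    doc = json.loads(results_json)
    counts = {}
    for row in doc["results"]["bindings"]:
        counts[row["uri"]["value"]] = int(row["triples"]["value"])
    return counts


# Hand-written sample response (hypothetical endpoint and count).
SAMPLE = json.dumps({
    "head": {"vars": ["uri", "triples"]},
    "results": {"bindings": [
        {"uri": {"type": "uri", "value": "http://example.org/sparql"},
         "triples": {"type": "literal", "value": "1500000"}},
    ]},
})

print(build_request_url("http://example.org/void/sparql")[:45])
print(triple_counts(SAMPLE))  # {'http://example.org/sparql': 1500000}
```

The point of the design is that the lookup costs the voiD store one small query, and the data endpoint itself never has to scan every triple.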

Bill


On 5 Mar 2011, at 15:14, Hugh Glaser wrote:

> Hi,
> On 5 Mar 2011, at 14:22, Andrea Splendiani wrote:
> 
>> Hi,
>> 
>> I think it depends on the store. I've tried some (from the endpoint list): some return an answer pretty quickly, some don't, and some don't support count.
>> However, one could collect this information only for the stores that answer the count query; there is no need to try every time.
> I am happy for a store implementor or owner to disagree, but I find it very unlikely that the owner of a store with a decent chunk of data (> 1M triples, say) would be happy for someone to keep issuing such a query, even if they did decide to give enough resources to execute it.
> I would quickly blacklist such a site.
>> 
>> voiD:
>> is this a good query?
>> select * where {?s <http://rdfs.org/ns/void#numberOfTriples> ?o } 
> 
> I'm no SPARQL or voiD guru, but I think you need a bit more wrapping in the scovo stuff, so more like:
> 
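> PREFIX rdf:   <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> PREFIX rdfs:  <http://www.w3.org/2000/01/rdf-schema#>
> PREFIX void:  <http://rdfs.org/ns/void#>
> PREFIX scovo: <http://purl.org/NET/scovo#>
> 
> SELECT DISTINCT ?endpoint ?uri ?triples ?uris WHERE
>           { ?ds a void:Dataset .
>             ?ds void:sparqlEndpoint ?uri .
>             ?ds rdfs:label ?endpoint .
>             ?ds void:statItem [ scovo:dimension void:numberOfTriples ; rdf:value ?triples ] .
>           }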
> 
> Try it at
> http://kwijibo.talis.com/voiD/
> or
> http://void.rkbexplorer.com/
> 
> I guess Pierre-Yves might like to enhance his page by querying a voiD store to also give basic stats.
> Or someone might like to do a store reporter that uses (a) voiD endpoint(s) plus Pierre-Yves's data (he has a SPARQL endpoint), to do so.
> And maybe the CKAN endpoint would have extra useful data as well.
> A real Semantic Web application that queried more than one SPARQL endpoint - now that would be a novelty!
> Fancy the challenge? It is the weekend! :-)
> 
> ciao
> Hugh
> 
>> 
>> it doesn't seem viable if so.
>> 
>> ciao,
>> Andrea
>> 
>> 
>> On 5 Mar 2011, at 13:49, Hugh Glaser wrote:
>> 
>>> Nice idea, but... :-)
>>> 
>>> SELECT (count(*) as ?c) WHERE {?s ?p ?o}
>>> 
>>> is a pretty anti-social thing to do to a store.
>>> At best, a store of any size will spend a while thinking, and then quite rightly decide they have burnt enough resources, and return some sort of error.
>>> 
>>> For a properly maintained site, of course, the voiD description will give lots of similar information.
>>> Best
>>> Hugh
>>> 
>>> On 5 Mar 2011, at 13:06, Andrea Splendiani wrote:
>>> 
>>>> Hi, very nice!
>>>> I have a small suggestion:
>>>> 
>>>> why don't you send "count(*) where {?s ?p ?o}" to the endpoint?
>>>> Or ask for the number of graphs?
>>>> Both pieces of information, the number of triples and the number of graphs, if logged and compared over time, can give a practical view of the liveliness of the endpoint's content.
>>>> 
>>>> best,
>>>> Andrea Splendiani
>>>> 
>>>> 
>>>> On 28 Feb 2011, at 18:55, Pierre-Yves Vandenbussche wrote:
>>>> 
>>>>> Hello all,
>>>>> 
>>>>> Have you already run into problems with SPARQL endpoint accessibility?
>>>>> Do you feel frustrated that endpoints are never available when you need them?
>>>>> Do you develop an application that uses these services but wonder whether it is reliable?
>>>>> 
>>>>> Here is a tool [1] that shows the availability of public SPARQL endpoints and lets you monitor it over the past hours and days.
>>>>> Stay informed of status changes for a particular endpoint (or all of them) through RSS feeds.
>>>>> All availability information generated by this tool is accessible through a SPARQL endpoint.
>>>>> 
>>>>> This tool fetches the list of public SPARQL endpoints from CKAN open data [2] and tests each one for availability every hour.
>>>>> 
>>>>> [1] http://labs.mondeca.com/sparqlEndpointsStatus/index.html
>>>>> [2] http://ckan.net/
>>>>> 
>>>>> Pierre-Yves Vandenbussche.
>>>> 
>>>> Andrea Splendiani
>>>> Senior Bioinformatics Scientist
>>>> Centre for Mathematical and Computational Biology
>>>> +44(0)1582 763133 ext 2004
>>>> andrea.splendiani@bbsrc.ac.uk
>>>> 
>>>> 
>>>> 
>>> 
>>> -- 
>>> Hugh Glaser,  
>>>            Intelligence, Agents, Multimedia
>>>            School of Electronics and Computer Science,
>>>            University of Southampton,
>>>            Southampton SO17 1BJ
>>> Work: +44 23 8059 3670, Fax: +44 23 8059 3045
>>> Mobile: +44 78 9422 3822, Home: +44 23 8061 5652
>>> http://www.ecs.soton.ac.uk/~hg/
>>> 
>>> 
>> 
> 
Received on Saturday, 5 March 2011 17:58:53 UTC
