- From: Daniel Alexander Smith <ds@ecs.soton.ac.uk>
- Date: Sun, 6 Mar 2011 12:54:02 +0000
- To: Linked Data community <public-lod@w3.org>
- Cc: Bill Roberts <bill@swirrl.com>, Hugh Glaser <hg@ecs.soton.ac.uk>, Tim Berners-Lee <timbl@w3.org>
I believe that people coming from a MySQL (well, MyISAM specifically) background would assume a global COUNT to be fast, since an unfiltered COUNT(*) is an O(1) operation on a MyISAM table, which stores its exact row count.
Another way to go would be to add a NOOP command to SPARQL, surely?
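
For what it's worth, here is a rough sketch of the special-casing Tim suggests in the message quoted below: spot the global count on input and answer it from a size the store already tracks. This is only an illustration, not any real endpoint's code. The names (is_global_count, CountingStore, evaluate) are made up, the regular expression only catches the simplest spelling of the query (no PREFIX, FROM or GRAPH clauses), and a real server would more likely compare the parsed query than the raw text.

```python
import re

# Hypothetical pattern for the global count query discussed in this thread,
# e.g. SELECT (count(*) as ?c) WHERE {?s ?p ?o}. Variable names are arbitrary.
_GLOBAL_COUNT = re.compile(
    r"^select\s*\(\s*count\s*\(\s*\*\s*\)\s+as\s+\?\w+\s*\)\s*"
    r"(?:where\s*)?\{\s*\?(\w+)\s+\?(\w+)\s+\?(\w+)\s*\.?\s*\}$",
    re.IGNORECASE,
)


def is_global_count(query: str) -> bool:
    """Return True if the query only counts every triple matching ?s ?p ?o."""
    # Collapse runs of whitespace so line breaks do not defeat the match.
    match = _GLOBAL_COUNT.match(" ".join(query.split()))
    # The three variables must differ, otherwise {?s ?p ?s} would also match.
    return bool(match) and len(set(match.groups())) == 3


class CountingStore:
    """Toy in-memory store that always knows its own size."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def remove(self, s, p, o):
        self._triples.discard((s, p, o))

    def size(self):
        # len() on a set is constant time, so no scan is needed.
        return len(self._triples)

    def query(self, sparql, evaluate):
        # Short-circuit the expensive case; everything else goes to the
        # normal evaluator, passed in here as a plain callable.
        if is_global_count(sparql):
            return self.size()
        return evaluate(sparql)
```

Whether this is worth doing depends on Tim's parenthetical question, i.e. whether the store keeps track of its size cheaply in the first place.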
Dan
On 6 Mar 2011, at 11:20, Tim Berners-Lee wrote:
> Maybe the count of triples should be special-cased in the sparql server code,
> spotted on input and the store size returned.
> if it is reasonable for the endpoint to keep track of the size of its store.
> (Do they anyway?)
>
> Tim
>
> On 2011-03-05, at 11:58, Bill Roberts wrote:
>
>> Thanks Hugh - as someone running a couple of SPARQL endpoints, I'd certainly prefer if people don't run a global count too often (or at all). It is indeed something that makes typical SPARQL implementations work very hard.
>>
>> But it's a good reminder that we should provide an alternative, and I'll look into providing triple counts in voiD.
>>
>> Bill
>>
>>
>> On 5 Mar 2011, at 15:14, Hugh Glaser wrote:
>>
>>> Hi,
>>> On 5 Mar 2011, at 14:22, Andrea Splendiani wrote:
>>>
>>>> Hi,
>>>>
>>>> I think it depends on the store. I've tried some (from the endpoint list) and some return an answer pretty quickly. Some don't, and some don't support count.
>>>> However, one could have this information only for the stores that answer the count query; there is no need to try all the time.
>>> I am happy for a store implementor or owner to disagree, but I find it very unlikely that the owner of a store with a decent chunk of data (> 1M triples, say) would be happy for someone to keep issuing such a query, even if they did decide to give enough resources to execute it.
>>> I would quickly blacklist such a site.
>>>>
>>>> VoID:
>>>> is this a good query:
>>>> select * where {?s <http://rdfs.org/ns/void#numberOfTriples> ?o }
>>>
>>> I'm no SPARQL or voiD guru, but I think you need a bit more wrapping in the scovo stuff, so more like:
>>>
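>>> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>>> PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
>>> PREFIX void: <http://rdfs.org/ns/void#>
>>> PREFIX scovo: <http://purl.org/NET/scovo#>
>>> SELECT DISTINCT ?endpoint ?uri ?triples ?uris WHERE
>>> { ?ds a void:Dataset .
>>> ?ds void:sparqlEndpoint ?uri .
>>> ?ds rdfs:label ?endpoint .
>>> ?ds void:statItem [ scovo:dimension void:numberOfTriples ; rdf:value ?triples ] .
>>> }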
>>>
>>> Try it at
>>> http://kwijibo.talis.com/voiD/
>>> or
>>> http://void.rkbexplorer.com/
>>>
>>> I guess Pierre-Yves might like to enhance his page by querying a voiD store to also give basic stats.
>>> Or someone might like to do a store reporter that uses (a) voiD endpoint(s) plus Pierre-Yves's data (he has a SPARQL endpoint), to do so.
>>> And maybe the CKAN endpoint would have extra useful data as well.
>>> A real Semantic Web application that queried more than one SPARQL endpoint - now that would be a novelty!
>>> Fancy the challenge? It is the weekend! :-)
>>>
>>> ciao
>>> Hugh
>>>
>>>>
>>>> it doesn't seem viable if so.
>>>>
>>>> ciao,
>>>> Andrea
>>>>
>>>>
>>>> On 5 Mar 2011, at 13:49, Hugh Glaser wrote:
>>>>
>>>>> Nice idea, but... :-)
>>>>>
>>>>> SELECT (count(*) as ?c) WHERE {?s ?p ?o}
>>>>>
>>>>> is a pretty anti-social thing to do to a store.
>>>>> At best, a store of any size will spend a while thinking, and then quite rightly decide they have burnt enough resources, and return some sort of error.
>>>>>
>>>>> For a properly maintained site, of course, the VoiD description will give lots of similar information.
>>>>> Best
>>>>> Hugh
>>>>>
>>>>> On 5 Mar 2011, at 13:06, Andrea Splendiani wrote:
>>>>>
>>>>>> Hi, very nice!
>>>>>> I have a small suggestion:
>>>>>>
>>>>>> Why don't you send "count(*) where {?s ?p ?o}" to the endpoint?
>>>>>> Or ask for the number of graphs?
>>>>>> Both pieces of information, the number of triples and the number of graphs, can give a practical view of the liveliness of the endpoint's content if they are logged and compared over time.
>>>>>>
>>>>>> best,
>>>>>> Andrea Splendiani
>>>>>>
>>>>>>
>>>>>> On 28 Feb 2011, at 18:55, Pierre-Yves Vandenbussche wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> Have you ever run into problems with SPARQL endpoint accessibility?
>>>>>>> Do you feel frustrated that they are never available when you need them?
>>>>>>> Are you developing an application that uses these services and wondering whether it is reliable?
>>>>>>>
>>>>>>> Here is a tool [1] that lets you check the availability of public SPARQL endpoints and monitor it over the last hours/days.
>>>>>>> Stay informed of status changes for a particular endpoint (or for all of them) through RSS feeds.
>>>>>>> All availability information generated by this tool is accessible through a SPARQL endpoint.
>>>>>>>
>>>>>>> This tool fetches the list of public SPARQL endpoints from CKAN open data [2] and tests each endpoint on that list for availability every hour.
>>>>>>>
>>>>>>> [1] http://labs.mondeca.com/sparqlEndpointsStatus/index.html
>>>>>>> [2] http://ckan.net/
>>>>>>>
>>>>>>> Pierre-Yves Vandenbussche.
>>>>>>
>>>>>> Andrea Splendiani
>>>>>> Senior Bioinformatics Scientist
>>>>>> Centre for Mathematical and Computational Biology
>>>>>> +44(0)1582 763133 ext 2004
>>>>>> andrea.splendiani@bbsrc.ac.uk
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> Hugh Glaser,
>>>>> Intelligence, Agents, Multimedia
>>>>> School of Electronics and Computer Science,
>>>>> University of Southampton,
>>>>> Southampton SO17 1BJ
>>>>> Work: +44 23 8059 3670, Fax: +44 23 8059 3045
>>>>> Mobile: +44 78 9422 3822, Home: +44 23 8061 5652
>>>>> http://www.ecs.soton.ac.uk/~hg/
>>>>>
>>>>>
>>>>
>>>
Received on Sunday, 6 March 2011 12:54:35 UTC