- From: Daniel Alexander Smith <ds@ecs.soton.ac.uk>
- Date: Sun, 6 Mar 2011 12:54:02 +0000
- To: Linked Data community <public-lod@w3.org>
- Cc: Bill Roberts <bill@swirrl.com>, Hugh Glaser <hg@ecs.soton.ac.uk>, Tim Berners-Lee <timbl@w3.org>
I believe that people coming from a MySQL (well, MyISAM specifically) background would assume a global COUNT to be fast, since it's an O(1) operation on a MyISAM table with a primary key.

Another way to go would be to add a NOOP command to SPARQL, surely?

Dan

On 6 Mar 2011, at 11:20, Tim Berners-Lee wrote:

> Maybe the count of triples should be special-cased in the sparql server code,
> spotted on input and the store size returned,
> if it is reasonable for the endpoint to keep track of the size of its store.
> (Do they anyway?)
>
> Tim
>
> On 2011-03-05, at 11:58, Bill Roberts wrote:
>
>> Thanks Hugh - as someone running a couple of SPARQL endpoints, I'd certainly prefer it if people didn't run a global count too often (or at all). It is indeed something that makes typical SPARQL implementations work very hard.
>>
>> But it's a good reminder that we should provide an alternative, and I'll look into providing triple counts in voiD.
>>
>> Bill
>>
>>
>> On 5 Mar 2011, at 15:14, Hugh Glaser wrote:
>>
>>> Hi,
>>> On 5 Mar 2011, at 14:22, Andrea Splendiani wrote:
>>>
>>>> Hi,
>>>>
>>>> I think it depends on the store. I've tried some (from the endpoint list) and some return an answer pretty quickly. Some don't, and some don't support count.
>>>> However, one could have this information only for the stores that answer the count query; no need to try all the time.
>>> I am happy for a store implementor or owner to disagree, but I find it very unlikely that the owner of a store with a decent chunk of data (> 1M triples, say) would be happy for someone to keep issuing such a query, even if they did decide to give enough resources to execute it.
>>> I would quickly blacklist such a site.
>>>>
>>>> VoID:
>>>> is this a good query:
>>>> select * where {?s <http://rdfs.org/ns/void#numberOfTriples> ?o }
>>>
>>> I'm no SPARQL or voiD guru, but I think you need a bit more wrapping in the scovo stuff, so more like:
>>>
>>> SELECT DISTINCT ?endpoint ?uri ?triples ?uris WHERE
>>> { ?ds a void:Dataset .
>>>   ?ds void:sparqlEndpoint ?uri .
>>>   ?ds rdfs:label ?endpoint .
>>>   ?ds void:statItem [ scovo:dimension void:numberOfTriples ; rdf:value ?triples ] .
>>> }
>>>
>>> Try it at
>>> http://kwijibo.talis.com/voiD/
>>> or
>>> http://void.rkbexplorer.com/
>>>
>>> I guess Pierre-Yves might like to enhance his page by querying a voiD store to also give basic stats.
>>> Or someone might like to do a store reporter that uses (a) voiD endpoint(s) plus Pierre-Yves's data (he has a SPARQL endpoint), to do so.
>>> And maybe the CKAN endpoint would have extra useful data as well.
>>> A real Semantic Web application that queried more than one SPARQL endpoint - now that would be a novelty!
>>> Fancy the challenge, it is the weekend?! :-)
>>>
>>> ciao
>>> Hugh
>>>
>>>>
>>>> it doesn't seem viable if so.
>>>>
>>>> ciao,
>>>> Andrea
>>>>
>>>>
>>>> On 5 Mar 2011, at 13:49, Hugh Glaser wrote:
>>>>
>>>>> Nice idea, but... :-)
>>>>>
>>>>> SELECT (count(*) as ?c) WHERE {?s ?p ?o}
>>>>>
>>>>> is a pretty anti-social thing to do to a store.
>>>>> At best, a store of any size will spend a while thinking, and then quite rightly decide it has burnt enough resources, and return some sort of error.
>>>>>
>>>>> For a properly maintained site, of course, the voiD description will give lots of similar information.
>>>>> Best
>>>>> Hugh
>>>>>
>>>>> On 5 Mar 2011, at 13:06, Andrea Splendiani wrote:
>>>>>
>>>>>> Hi, very nice!
>>>>>> I have a small suggestion:
>>>>>>
>>>>>> why don't you ask "count(*) where {?s ?p ?o}" to the endpoint?
>>>>>> Or ask for the number of graphs?
>>>>>> Both pieces of information, the number of triples and the number of graphs, if logged and compared over time, can give a practical view of the liveliness of the content of the endpoint.
>>>>>>
>>>>>> best,
>>>>>> Andrea Splendiani
>>>>>>
>>>>>>
>>>>>> On 28 Feb 2011, at 18:55, Pierre-Yves Vandenbussche wrote:
>>>>>>
>>>>>>> Hello all,
>>>>>>>
>>>>>>> Have you already encountered problems with SPARQL endpoint accessibility?
>>>>>>> Do you feel frustrated that they are never available when you need them?
>>>>>>> Do you develop an application using these services but wonder if it is reliable?
>>>>>>>
>>>>>>> Here is a tool [1] that lets you check the availability of public SPARQL endpoints and monitor them over the last hours/days.
>>>>>>> Stay informed of status changes for a particular endpoint (or all endpoints) through RSS feeds.
>>>>>>> All availability information generated by this tool is accessible through a SPARQL endpoint.
>>>>>>>
>>>>>>> This tool fetches public SPARQL endpoints from CKAN open data [2]. From this list, it runs availability tests every hour.
>>>>>>>
>>>>>>> [1] http://labs.mondeca.com/sparqlEndpointsStatus/index.html
>>>>>>> [2] http://ckan.net/
>>>>>>>
>>>>>>> Pierre-Yves Vandenbussche.
>>>>>>
>>>>>> Andrea Splendiani
>>>>>> Senior Bioinformatics Scientist
>>>>>> Centre for Mathematical and Computational Biology
>>>>>> +44(0)1582 763133 ext 2004
>>>>>> andrea.splendiani@bbsrc.ac.uk
>>>>>
>>>>> --
>>>>> Hugh Glaser,
>>>>> Intelligence, Agents, Multimedia
>>>>> School of Electronics and Computer Science,
>>>>> University of Southampton,
>>>>> Southampton SO17 1BJ
>>>>> Work: +44 23 8059 3670, Fax: +44 23 8059 3045
>>>>> Mobile: +44 78 9422 3822, Home: +44 23 8061 5652
>>>>> http://www.ecs.soton.ac.uk/~hg/
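
For illustration only, here is a minimal sketch of the special-casing Tim describes in the thread: the server spots the global count query on input and returns a size it already tracks, rather than scanning every triple. All names here (CountingStore, GLOBAL_COUNT, answer) are hypothetical and not taken from any real SPARQL server. It assumes a single default graph, a store that can report its size cheaply, and a query with no PREFIX prologue; anything else falls through to the normal engine.

```python
import re

# Hypothetical pattern: matches only the exact "count every triple" shape,
# e.g. SELECT (count(*) as ?c) WHERE {?s ?p ?o}
GLOBAL_COUNT = re.compile(
    r"^\s*SELECT\s*\(\s*COUNT\s*\(\s*\*\s*\)\s+AS\s+\?(\w+)\s*\)\s*"
    r"WHERE\s*\{\s*\?(\w+)\s+\?(\w+)\s+\?(\w+)\s*\.?\s*\}\s*$",
    re.IGNORECASE,
)


class CountingStore:
    """Toy in-memory triple store (an assumption, not a real backend)."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def remove(self, s, p, o):
        self._triples.discard((s, p, o))

    def size(self):
        # len() on a set is O(1), so the count never requires a scan.
        return len(self._triples)


def answer(store, query, fallback):
    """Return the tracked size for a global count, else defer to fallback."""
    m = GLOBAL_COUNT.match(query)
    # The three pattern variables must differ: {?s ?p ?s} is a real filter,
    # not a count of all triples, so it must go to the normal engine.
    if m and len({m.group(2), m.group(3), m.group(4)}) == 3:
        return {m.group(1): store.size()}
    return fallback(query)


store = CountingStore()
store.add("ex:a", "ex:knows", "ex:b")
store.add("ex:b", "ex:knows", "ex:c")
store.add("ex:c", "ex:knows", "ex:a")

# The fallback stands in for the full query engine in this sketch.
result = answer(store, "SELECT (count(*) as ?c) WHERE {?s ?p ?o}", lambda q: None)
```

The match is deliberately strict, so a query that only resembles a global count is never answered with the wrong number; the cost is that trivially different spellings of the same query still reach the full engine.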
Received on Sunday, 6 March 2011 12:54:35 UTC