- From: David Booth <david@dbooth.org>
- Date: Mon, 03 Mar 2014 17:00:46 -0500
- To: Andy Seaborne <andy@apache.org>, "w3.hcls@gmail.com" <w3.hcls@gmail.com>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
Hi Andy,

On 03/03/2014 03:01 PM, Andy Seaborne wrote:
> (please forward if the mailing list does not allow non-subscribers to
> send to it)
>
> On 03/03/14 16:32, David Booth wrote:
>> On 02/09/2014 05:45 PM, w3.hcls@gmail.com wrote:
>>> Relevant docs:
>>> - Working draft of W3C Note:
>>> https://docs.google.com/document/d/1zGQJ9bO_dSc8taINTNHdnjYEzUyYkbjglrcuUPuoITw/edit#heading=h.wyc73yp7c8jz
>>>
>>
>> I notice that section 6.6.1 Core statistics shows this SPARQL query for
>> counting the number of triples:
>>
>> SELECT (COUNT(*) AS ?no) { ?s ?p ?o }
>>
>> However, I believe the SPARQL 1.1 standard allows duplicate triples and
>> duplicate query solutions by default. If so, to get an accurate count
>> of the number of triples, the DISTINCT keyword must be used:
>>
>> SELECT (COUNT(DISTINCT *) AS ?no) { ?s ?p ?o }
>>
>> I'm copying Andy Seaborne to see if this is correct, since I could not
>> easily find this information in the SPARQL 1.1 spec when I did a quick
>> scan. Andy, am I correct about this?
>>
>> Thanks,
>> David
>
> Hi,
>
> In the case of { ?s ?p ?o }, the match is against the default graph and
> an RDF graph is a set of triples - so there are no duplicates over the
> ?s, ?p, ?o elements of a row.
>
> Because of the nature of the pattern, COUNT(*) and COUNT(DISTINCT *)
> should be the same.

I'm particularly thinking of AllegroGraph, which (by default I believe)
does not remove duplicate triples if the same triple happens to be
loaded more than once. If AllegroGraph returns a different count to the
queries above (with or without DISTINCT), does that mean that
AllegroGraph is not SPARQL 1.1 compliant? I.e., is it a bug, or is it a
permissible implementation variation?
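
To make the question concrete, here is a minimal Python sketch of the two
behaviours being compared. It is only an illustration under stated
assumptions: the sample triples and the names graph_as_set, graph_as_bag,
count_all and count_distinct are hypothetical, and it models no particular
triple store. It shows why the two counts agree when the data is held as a
set of triples and can differ when a store keeps every loaded copy.

```python
# Hypothetical sample data: the same triple has been loaded twice.
loaded = [
    ("ex:a", "ex:p", "ex:b"),
    ("ex:a", "ex:p", "ex:b"),  # second load of the same triple
    ("ex:a", "ex:q", "ex:c"),
]

# An RDF graph is a set of triples, so the repeated load collapses.
graph_as_set = set(loaded)

# Assumed alternative: a store that keeps every loaded copy (a bag).
graph_as_bag = list(loaded)


def count_all(store):
    """Rough analogue of COUNT(*) over { ?s ?p ?o }: one per stored row."""
    return sum(1 for _ in store)


def count_distinct(store):
    """Rough analogue of COUNT(DISTINCT *): one per distinct (s, p, o)."""
    return len(set(store))


if __name__ == "__main__":
    for name, store in (("set", graph_as_set), ("bag", graph_as_bag)):
        print(name, count_all(store), count_distinct(store))
```

With set storage both functions return the same number; with the bag they
differ by the number of repeated loads, which is the discrepancy the
question above is about.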

I had the impression that SPARQL 1.1 conformant implementations are
permitted to have duplicate solutions in the solution set unless the
word DISTINCT is used, and hence I would have thought that a solution
set that is not explicitly constrained to be DISTINCT could include
duplicates, even if that solution set is for only a { ?s ?p ?o } graph
pattern over the default graph, but maybe I'm wrong.

OTOH, if, when DISTINCT is not specified, the SPARQL 1.1 standard only
*sometimes* permits duplicates, then how can I determine which
circumstances permit them and which don't?

David
Received on Monday, 3 March 2014 22:01:15 UTC