Re: [Update] [LLD] Dataset Description from M. Scott Marshall on 2014-03-03 (public-semweb-lifesci@w3.org from March 2014)

From: M. Scott Marshall <mscottmarshall@gmail.com>
Date: Mon, 3 Mar 2014 21:53:00 +0100
To: Andy Seaborne <andy@apache.org>
Cc: David Booth <david@dbooth.org>, "w3.hcls@gmail.com" <w3.hcls@gmail.com>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
Message-ID: <CACHzV2OQfpVU=i2fHm+KACMp8LA_dtWwr-uR=3Bo+b1UkR=+Yg@mail.gmail.com>

Thank you David and Andy. We appreciate all feedback, as well as testing,
prodding, poking, etc.

Thanks again,
Scott


On Mon, Mar 3, 2014 at 9:01 PM, Andy Seaborne <andy@apache.org> wrote:

> (please forward if the mailing list does not allow non-subscribers to send
> to it)
>
>
> On 03/03/14 16:32, David Booth wrote:
>
>> On 02/09/2014 05:45 PM, w3.hcls@gmail.com wrote:
>>
>>> Relevant docs:
>>> - Working draft of W3C Note:
>>> https://docs.google.com/document/d/1zGQJ9bO_dSc8taINTNHdnjYEzUyYkbjglrcuUPuoITw/edit#heading=h.wyc73yp7c8jz
>>>
>>>
>> I notice that section 6.6.1 Core statistics shows this SPARQL query for
>> counting the number of triples:
>>
>>    SELECT (COUNT(*) AS ?no) { ?s ?p ?o  }
>>
>> However, I believe the SPARQL 1.1 standard allows duplicate triples and
>> duplicate query solutions by default.  If so, to get an accurate count
>> of the number of triples, the DISTINCT keyword must be used:
>>
>>    SELECT (COUNT(DISTINCT *) AS ?no) { ?s ?p ?o  }
>>
>> I'm copying Andy Seaborne to see if this is correct, since I could not
>> easily find this information in the SPARQL 1.1 spec when I did a quick
>> scan.   Andy, am I correct about this?
>>
>> Thanks,
>> David
>>
>
> Hi,
>
> In the case of { ?s ?p ?o }, the match is against the default graph and an
> RDF graph is a set of triples - so there are no duplicates over the ?s, ?p,
> ?o elements of a row.
>
> Because of the nature of the pattern, COUNT(*) and COUNT(DISTINCT *)
> should be the same.
>
>
>
> One suggestion, looking at this query:
>
> SELECT (COUNT(DISTINCT ?g ) AS ?no) { GRAPH ?g { ?s ?p ?o}}
>
> which can be written as:
>
> SELECT (COUNT(?g) AS ?no) { GRAPH ?g { } }
>
> because "GRAPH ?g { }" results in all the graph names, one per row, and
> the graph names are distinct so there is no need for DISTINCT in the COUNT.
>
>         Andy
>
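
The two points in the quoted reply can be sketched with a small stdlib-only Python model. This is only an illustration under stated assumptions, not how any SPARQL engine is implemented: a graph is modeled as a Python set of (s, p, o) tuples, a dataset as a dict from graph name to such a set, and all names (the ex: terms, match_all, graph_rows_via_triples, graph_rows_empty_pattern) are hypothetical.

```python
# Illustrative model only: an RDF graph as a set of (s, p, o) tuples and a
# dataset as a dict from hypothetical graph names to such sets.
default_graph = {
    ("ex:a", "ex:p", "ex:b"),
    ("ex:a", "ex:p", "ex:b"),  # same triple written twice; the set keeps one
    ("ex:a", "ex:q", "ex:c"),
}
named_graphs = {
    "ex:g1": {("ex:a", "ex:p", "ex:b"), ("ex:b", "ex:p", "ex:c")},
    "ex:g2": {("ex:a", "ex:p", "ex:b")},
}


def match_all(graph):
    """Rows for { ?s ?p ?o }: one solution per triple in the graph."""
    return list(graph)


def graph_rows_via_triples(dataset):
    """?g column for GRAPH ?g { ?s ?p ?o }: one row per triple per graph."""
    return [name for name, triples in dataset.items() for _ in triples]


def graph_rows_empty_pattern(dataset):
    """?g column for GRAPH ?g { }: one row per graph name."""
    return list(dataset)


# COUNT(*) versus COUNT(DISTINCT *) over { ?s ?p ?o }
rows = match_all(default_graph)
count_all = len(rows)
count_distinct = len(set(rows))

# COUNT(DISTINCT ?g) over the triple pattern versus COUNT(?g) over { }
via_triples = graph_rows_via_triples(named_graphs)
graphs_distinct = len(set(via_triples))
graphs_plain = len(graph_rows_empty_pattern(named_graphs))

print(count_all, count_distinct)                    # equal: a graph is a set
print(len(via_triples), graphs_distinct, graphs_plain)
```

In this model the triple-pattern form repeats a graph name once per triple, which is why it needs DISTINCT, while the empty-pattern form yields each name once. One difference to keep in mind: on a store that retains empty named graphs, GRAPH ?g { } would also list those, whereas the triple-pattern form would not.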

-- 
M. Scott Marshall, PhD
MAASTRO clinic, http://www.maastro.nl/en/1/
http://eurecaproject.eu/
http://semantic-dicom.org/
https://plus.google.com/u/0/114642613065018821852/posts
http://www.linkedin.com/pub/m-scott-marshall/5/464/a22

Received on Monday, 3 March 2014 20:53:29 UTC