Re: Counting, Ordering and DISTINCT

From: Jeen Broekstra <jeen@aduna.biz>
Date: Wed, 20 Oct 2004 14:46:09 +0200
Message-ID: <41765E11.7070209@aduna.biz>
To: Andrew Newman <andrew@tucanatech.com>
Cc: public-rdf-dawg-comments@w3.org

Andrew Newman wrote:


> The other issue with SPARQL is the lack of an implicit
> DISTINCT.  In my understanding of SQL, DISTINCT is optional because if
> your queries work on normalized data and joins are based on
> distinct keys then the returned results cannot be duplicated.  If
> your query works on rows with repeated values in the same column
> then you apply DISTINCT.
>
> In RDF's data model there isn't really this problem of duplicated
> data and normalization.  SPARQL has the idea of matching statements
> in the graph.  From my understanding, RDF's data model doesn't
> support the idea of multiple subjects, predicates and/or objects
> with the same values.
>
> In other words, it only seems valid that if a query matches one
> result in the graph it should return that one unique result, not
> multiple repeated results.
>
> While I can see many use cases for distinct vs non-distinct results,
> I am not aware of a reason to return non-distinct results over
> distinct results.  Have I missed something?

I cannot answer for the DAWG of course, but a possible reason (and
indeed the reason that we made the same choice in Sesame's SeRQL
language) is that processing a query result to filter out
duplicates is potentially expensive. If duplicates in the query
result are not a problem for the querying client (and in our
experience this is quite often the case, especially in CONSTRUCT
queries), then why filter them out at all?
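
To make the cost argument concrete, here is a minimal sketch in plain
Python. It is not how Sesame or any SPARQL engine is implemented; the
graph data and the function names are made up for illustration. It
shows that duplicates can appear even though the graph itself is a set
of triples, because projecting onto fewer variables can collapse
different matches into identical rows, and that filtering them out
requires remembering every row already returned.

```python
from typing import Hashable, Iterable, Iterator, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical data: the graph is a set, so no triple occurs twice.
GRAPH: Set[Triple] = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),
    ("ex:dave", "ex:knows", "ex:bob"),
}


def subjects_who_know_someone(graph: Iterable[Triple]) -> Iterator[str]:
    """Match the pattern (?s ex:knows ?o) but project only ?s.

    Rows are streamed one at a time, with no extra memory per result.
    Sorting is only here to make the output order predictable.
    """
    for s, p, _o in sorted(graph):
        if p == "ex:knows":
            yield s


def distinct(rows: Iterable[Hashable]) -> Iterator[Hashable]:
    """Drop duplicate rows, keeping the first occurrence of each.

    The `seen` set grows with the number of unique rows, which is the
    extra cost that a non-distinct result avoids.
    """
    seen = set()
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


if __name__ == "__main__":
    # Two distinct matches for alice collapse into two identical rows.
    print(list(subjects_who_know_someone(GRAPH)))
    print(list(distinct(subjects_who_know_someone(GRAPH))))
```

A client that only cares whether a subject appears at all, or that
rebuilds a graph from the result as with CONSTRUCT, gets the same
outcome from the first form without paying for the `seen` set.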

Jeen Broekstra          Aduna BV
Knowledge Engineer      Julianaplein 14b, 3817 CS Amersfoort
http://aduna.biz        The Netherlands
tel. +31(0)33 46599877  fax. +31(0)33 46599877
Received on Wednesday, 20 October 2004 12:45:57 UTC
