- From: Jeen Broekstra <jeen@aduna.biz>
- Date: Wed, 20 Oct 2004 14:46:09 +0200
- To: Andrew Newman <andrew@tucanatech.com>
- Cc: public-rdf-dawg-comments@w3.org
Andrew Newman wrote:

[snip]

> The other issue with the SPARQL is the lack of an implicit
> distinct. In my understand of SQL, DISTINCT is optional because if
> your queries work on normalized data and joins are based on
> distinct keys then the returned results cannot be duplicated. If
> your query works on rows with repeated values on the same column
> then you apply DISTINCT.
>
> In RDF's data model there isn't really this problem of duplicated
> data and normalization. SPARQL has the idea of matching statements
> in the graph. From my understanding, RDF's data model doesn't
> support the idea of multiple subject, predicates and/or objects
> with the same values.
>
> In other words, it only seems valid that if a query matches one
> result in the graph it should return that one unique result not
> repeated multiple results.
>
> While I can see many use cases for distinct vs non-distinct results
> I am not aware of a reason to return non-distinct results over
> distinct results. Have I missed something?

I cannot answer for the DAWG of course, but a possible reason (and
indeed the reason that we made the same choice in Sesame's SeRQL
language) is that processing a query result to filter out duplicates
is potentially expensive. If it is not a problem for the querying
client that duplicates are present in the query result (and this is
quite often the case, in our experience, especially in CONSTRUCT
queries), then why filter them out at all?

Jeen
-- 
Jeen Broekstra          Aduna BV
Knowledge Engineer      Julianaplein 14b, 3817 CS Amersfoort
http://aduna.biz        The Netherlands
tel. +31(0)33 46599877  fax. +31(0)33 46599877
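
To make the cost argument concrete, here is a minimal sketch in Python. It is not how Sesame or any SPARQL engine is actually implemented; the function names and the sample rows are hypothetical. It only illustrates the general point: a non-distinct result can be streamed to the client row by row, whereas a simple hash-based distinct filter has to remember every row it has already emitted.

```python
from typing import Iterable, Iterator, Tuple

# A result row is modelled as a tuple of bound values (an assumption
# made for this sketch only).
Row = Tuple[str, ...]


def stream_rows(rows: Iterable[Row]) -> Iterator[Row]:
    """Non-distinct: pass each row through as it is produced.

    No extra state is kept, so the extra memory cost is constant.
    """
    for row in rows:
        yield row


def stream_distinct(rows: Iterable[Row]) -> Iterator[Row]:
    """Distinct: suppress rows that were already emitted.

    The `seen` set grows with the number of unique rows, so memory
    use is proportional to the size of the distinct result.
    """
    seen = set()
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


# Hypothetical rows, e.g. from a query that selects only a name and
# matches two different resources that share the same name.
rows = [("Alice",), ("Bob",), ("Alice",)]

plain = list(stream_rows(rows))
distinct = list(stream_distinct(rows))
```

In this sketch `plain` keeps all three rows, while `distinct` keeps only the first occurrence of each. A client that tolerates duplicates can use the first form and avoid the bookkeeping entirely.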
Received on Wednesday, 20 October 2004 12:45:57 UTC