Re: DISTINCT() from Steve Harris on 2012-03-14 (public-rdf-dawg@w3.org from January to March 2012)

From: Steve Harris <steve.harris@garlik.com>
Date: Wed, 14 Mar 2012 11:06:19 +0000
To: "Polleres, Axel" <axel.polleres@siemens.com>
Cc: Andy Seaborne <andy.seaborne@epimorphics.com>, "public-rdf-dawg@w3.org" <public-rdf-dawg@w3.org>
Message-Id: <3C93D1E3-CC9F-4A10-8766-5427BBE4287D@garlik.com>

On 2012-03-14, at 10:45, Polleres, Axel wrote:

> (sorry, hit "send" too early)
> 
>> Could you expand on "we need DISTINCT"?  Is that just a technical 
>> point that DISTINCT covers more or a political point about the 
>> comments?
> 
> For me this is definitely a technical point, since a DISTINCT-paths
> semantics that can be optimized/efficiently implemented doesn't 
> seem to be feasible by recognizing DISTINCT subqueries alone, 
> at least not trivially...
> 
> I.e., while DISTINCT() can possibly be defined in terms of a rewriting 
> (which introduces fresh variables for blank nodes), I think that's neither 
> elegant nor very practical for optimizations without the explicit keyword, 
> whereas a syntactic element DISTINCT() gives a direct handle for 
> optimizations, right? 
> 
> So, I think this is important *both* technically and in order 
> to address the comments.
> 
>> What about the lesser case of just {*}{+} and *+ changes?
> 
> I am fine with having those, but for the reasons above, I would 
> feel uncomfortable going without DISTINCT().

To speak bluntly, this seems crazy!

Adding both DISTINCT and the {*} {+} / * + changes takes an already complex feature and makes it a significant challenge to implement.

- Steve

-- 
Steve Harris, CTO
Garlik, a part of Experian
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 0535 7233 VAT # 849 0517 11
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ

Received on Wednesday, 14 March 2012 11:07:06 UTC