- From: Seaborne, Andy <andy.seaborne@hp.com>
- Date: Wed, 28 Mar 2007 13:46:46 +0100
- To: Lee Feigenbaum <feigenbl@us.ibm.com>
- CC: dawg mailing list <public-rdf-dawg@w3.org>
Lee Feigenbaum wrote:
> Andy Seaborne wrote on 03/26/2007 07:25:16 AM:
>> Steve Harris wrote:
>> ...
>>> Whether the distinct attribute should be set where appropriate is an
>>> interesting question. It also applies to SPARQL services that
>>> currently implicitly DISTINCT.
>> I don't see much use for a distinct attribute (I do see more utility for
>> the 'ordered').
>>
>> There never was anything stated about implicitly DISTINCT - I've always
>> seen it as a local API issue where the local API inserts (or has the
>> effect of inserting) DISTINCT into all queries. It was the case the test
>> suite carefully didn't distinguish - except we let such a test case in
>> which is what started all this latest stuff into motion.
>
> Richard Newman has recently brought up this same issue on the -comments
> list. In preparing an answer for him, I looked at the specific text in
> 2.3.1 of the Query Results XML Format document:
>
> """
> The distinct attribute indicates that the results are distinct (contain no
> duplicates), such as given by a SPARQL query using SELECT DISTINCT.
> """
>
> To me, this suggests that distinct="true" is only a property of the
> results, and should be included whenever the results contain no
> duplicates, regardless of which--if any--keywords are present in the query
> itself. (I'm not thoroughly positive that this statement in the
> specification implies the opposite, "If the distinct attribute's value is
> false, then the results contain at least one duplicate", but it does seem
> that way to me.)

I have been reading this as saying @distinct=true implies "these results are
distinct" but the converse is unstated. @distinct=false means there are no
guarantees.

If false => at least one duplicate, then streaming of results is very hard,
often impossible. The code can't generate the header with @distinct until
the code has seen all the solutions or it knows the solutions will be
distinct anyway (e.g. SELECT * {?s ?p ?o})

> Do any implementations that we know about behave in this way? (Set
> distinct="true"/"false" solely based on the presence/absence of duplicates
> in the results.)
>
> Lee

Currently, ARQ sets @distinct=true if and only if the query had DISTINCT in
it. It's independent of the results and streaming happens.

I'd be happy to drop @distinct.

    Andy

--
Hewlett-Packard Limited
Registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
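
The streaming point in the message can be shown with a rough Python sketch. It is not ARQ code: the function names, the row type and the simplified XML-like output are all assumptions made for illustration. The first writer takes the flag from the query alone, so it can emit the header before reading any solution. The second derives the flag from the data, so it must read every solution first.

```python
from typing import Iterable, Iterator, Tuple

Row = Tuple[str, ...]  # hypothetical stand-in for one query solution


def stream_results(query_has_distinct: bool, rows: Iterable[Row]) -> Iterator[str]:
    """Emit the header first, then one line per solution, without buffering."""
    # The flag depends only on the query, so it is known before any row is read.
    flag = "true" if query_has_distinct else "false"
    yield f'<results distinct="{flag}">'
    for row in rows:
        yield "  <result>" + " ".join(row) + "</result>"
    yield "</results>"


def buffered_results(rows: Iterable[Row]) -> Iterator[str]:
    """Emit a header whose flag reflects the data, which forces buffering."""
    seen = list(rows)  # every solution must be read before the header is written
    flag = "true" if len(set(seen)) == len(seen) else "false"
    yield f'<results distinct="{flag}">'
    for row in seen:
        yield "  <result>" + " ".join(row) + "</result>"
    yield "</results>"
```

Calling `next()` on `stream_results(...)` returns the header without touching the row source, whereas the same call on `buffered_results(...)` consumes the whole source first.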
Received on Wednesday, 28 March 2007 12:46:59 UTC