- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Sun, 11 Mar 2007 14:45:37 -0400
- To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
- Message-ID: <20070311184537.GA21623@w3.org>
This is pursuant to LeeF's request to summarize DISTINCT/LOOSE/ALL.
There should be no contentious opinions in this message.
The commenter who started this thread cited modifer-limit [ML]. I
am using a variant here:
Data:
@prefix : <http://example.org/ns#> .
:x :num 1 .
:x :num 2 .
:y :num 1 .
:z :num 1 .
Query:
PREFIX : <http://example.org/ns#>
SELECT ?num
WHERE { [] :num ?num }
ORDER BY ASC(?num) <some DISTINCTion> ... LIMIT 3
Results vary by DISTINCTness semantics:
ALL    DISTINCT          LOOSE               CHOOSE
num    num        num     num     num      num     num
"1"    "1"        "1"  or "1"  or "1"      "1"  or "1"
"1"    "2"        "1"     "1"     "2"      "1"     "2"
"1"               "1"     "2"              "1"
ALL:
most like default SQL semantics (though unlike SQL UNION).
- most verbose/computationally exhaustive.
+ allows post-processing aggregates.
+ encourages implementors to be ready for aggregate additions to SPARQL.
DISTINCT:
very much like SQL DISTINCT.
+ least verbose.
+ clear semantics.
- requires hashing of sent values (less added cost when used with ORDER).
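A rough sketch of the hashing cost mentioned for DISTINCT, in Python. It assumes (my reading, not something settled in the meeting) that "less added cost when used with ORDER" refers to duplicates becoming adjacent once the rows are sorted on the returned variables, so only the last row sent needs to be remembered. The function names are made up for illustration.

```python
def distinct_hashed(rows):
    """DISTINCT over rows in arbitrary order: remember every row already sent."""
    seen = set()
    for row in rows:
        if row not in seen:
            seen.add(row)
            yield row


def distinct_sorted(rows):
    """DISTINCT over rows already sorted on the returned variables.

    Assumption: sorting makes duplicates adjacent, so comparing with the
    previously sent row is enough and no hash table is needed.
    """
    nothing_sent = object()
    last = nothing_sent
    for row in rows:
        if last is nothing_sent or row != last:
            last = row
            yield row


nums = [1, 2, 1, 1]  # ?num bindings from the example data
print(list(distinct_hashed(nums)))
print(list(distinct_sorted(sorted(nums))))
```

The first variant keeps one entry per distinct row for the life of the result stream; the second keeps a single row, but only works if the sort covers every returned variable.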
LOOSE:
+ good enough for most queries.
+ optimizable for hashing/transmission tradeoffs.
- contributes to non-portability of queries with slices.
- could surprise SQL-heads.
CHOOSE:
+ slightly clearer semantics than LOOSE
+ slightly more testable than LOOSE?
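The result table above can be reproduced with a small Python simulation. This is only a sketch under my own assumptions about the four semantics: ALL keeps every row, DISTINCT keeps one row per value, LOOSE lets the engine keep anywhere between one and all copies of each value, and CHOOSE lets the engine pick either the ALL or the DISTINCT answer. The function names are hypothetical.

```python
from itertools import groupby, product

nums = [1, 2, 1, 1]          # ?num bindings from the example data
LIMIT = 3
ordered = sorted(nums)       # ORDER BY ASC(?num)


def all_rows(rows, limit):
    """ALL: no duplicate elimination, then slice."""
    return rows[:limit]


def distinct_rows(rows, limit):
    """DISTINCT: one row per value (first occurrence kept), then slice."""
    return list(dict.fromkeys(rows))[:limit]


def loose_rows(rows, limit):
    """LOOSE (assumed): each value may keep 1..n of its n copies.

    Expects rows sorted so that equal values are adjacent.
    Returns every answer an engine would be allowed to give.
    """
    groups = [(value, len(list(copies))) for value, copies in groupby(rows)]
    outcomes = set()
    for kept in product(*(range(1, n + 1) for _, n in groups)):
        result = [value for (value, _n), k in zip(groups, kept) for _ in range(k)]
        outcomes.add(tuple(result[:limit]))
    return sorted(outcomes)


def choose_rows(rows, limit):
    """CHOOSE (assumed): the engine answers with either ALL or DISTINCT."""
    return sorted({tuple(all_rows(rows, limit)), tuple(distinct_rows(rows, limit))})


print("ALL     ", all_rows(ordered, LIMIT))
print("DISTINCT", distinct_rows(ordered, LIMIT))
print("LOOSE   ", loose_rows(ordered, LIMIT))
print("CHOOSE  ", choose_rows(ordered, LIMIT))
```

Under these assumptions LOOSE admits three different answers for the same query and data, and CHOOSE two, which is the portability concern with slices noted above.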
We need to choose some or all of:
Which of these do we offer?
What are the default semantics?
What would be good keywords for LOOSE and CHOOSE?
The proposals from the meeting *:
   default   keywords         votes
1  ALL       DISTINCT         +1*(AndyS, ericP) -1*(SimonR) = +1
2  ALL       DISTINCT, LOOSE  +1*(Souri, ericP) -1*(SimonR) +.9*(SteveH) +.1*(PatH) = +2
3  LOOSE     DISTINCT         +.5*(SimonR) -1*(ericP) = -.5
4  LOOSE     DISTINCT, ALL    +1*(SteveH, ericP) = +2
5  DISTINCT                   +1*(SimonR) -1*(SteveH, ericP) = -1
* taking PatH's "mild preference" as +.1
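For anyone who wants to re-check the totals, here is a small Python tally. The weights and names are copied from the proposals listed above; the data layout and the `tally` helper are just my own illustration. Exact fractions are used so that .9 + .1 does not pick up floating-point noise.

```python
from fractions import Fraction

# proposal number -> list of (weight, voters); each voter contributes the weight once
proposals = {
    1: [("1", ["AndyS", "ericP"]), ("-1", ["SimonR"])],
    2: [("1", ["Souri", "ericP"]), ("-1", ["SimonR"]),
        ("0.9", ["SteveH"]), ("0.1", ["PatH"])],   # PatH's "mild preference" as +.1
    3: [("0.5", ["SimonR"]), ("-1", ["ericP"])],
    4: [("1", ["SteveH", "ericP"])],
    5: [("1", ["SimonR"]), ("-1", ["SteveH", "ericP"])],
}


def tally(votes):
    """Sum weight * number of voters over all entries of one proposal."""
    return sum(Fraction(weight) * len(voters) for weight, voters in votes)


totals = {number: tally(votes) for number, votes in proposals.items()}
for number, total in totals.items():
    print(number, float(total))
```
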
Currently, the domain for DISTINCTness is all the returned variables
(a subset of those mentioned in the query pattern). SimonR is
looking for a motivating use case where the domain is a different
set.
[ML] http://www.w3.org/2001/sw/DataAccess/tests/#modifer-limit
--
-eric
office: +1.617.258.5741 NE43-344, MIT, Cambridge, MA 02144 USA
cell: +1.857.222.5741
(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Sunday, 11 March 2007 18:45:45 UTC