- From: Pavel Klinov <pavel.klinov@gmail.com>
- Date: Sun, 9 Oct 2022 21:40:32 +0200
- To: Andy Seaborne <andy@apache.org>
- Cc: public-sparql-dev@w3.org
Thanks Andy,

I agree re: the intent, but if all variables used in the Order By expressions are projected, it shouldn't even matter whether Distinct/Project are applied before Order By or after. In fact, systems that use dictionary encoding of triples in their internal representation (à la RDF-3X) often push Distinct/Projection under Order By, because Distinct can be implemented over tuples of numbers while Order By requires comparisons over IRIs and literals.

In other words, the explicit logical order of solution modifiers defined in https://www.w3.org/TR/sparql11-query/#solutionModifiers is only important when the projection eliminates one of the Order By variables.

Cheers,
Pavel

On Sat, Oct 8, 2022 at 1:34 PM Andy Seaborne <andy@apache.org> wrote:
>
> Hi Pavel,
>
> Hmm - that text isn't good. The intent is, I guess, when
> distinct/project are same set of expressions as the orderby.
>
> I've recorded the issue
>
> https://www.w3.org/2013/sparql-errata#errata-query-20
>
> Andy
>
> On 07/10/2022 15:38, Pavel Klinov wrote:
> > Hi all,
> >
> > Sorry if this is known but was a little surprise to me. Even though
> > both Projection and Distinct solution modifiers are required to
> > preserve the order of solutions imposed by Order By (see 18.5), one
> > can construct an example where the final query results would be
> > undefined:
> >
> > SELECT DISTINCT ?a {
> >   VALUES (?a ?b) { (1 1) (1 2) (2 3) (1 4) (2 5) }
> > }
> > ORDER BY DESC(?b)
> >
> > Solution modifiers are applied in the order of: Order By -> Projection
> > -> Distinct (15). So after the projection, the solution sequence is:
> > ?a -> 2, ?a -> 1, ?a -> 2, ?a -> 1, ?a -> 1.
> >
> > Now, the Distinct is only required to keep this order but it's free to
> > remove any of the duplicate ?a -> 2 or ?a -> 1 solutions. So the final
> > results could be either ?a -> 2, ?a -> 1 or ?a -> 1, ?a -> 2. Note
> > that both solution sequences preserve the Order By order!
> >
> > It's easy to make an extended example with LIMIT where the results
> > could be completely different based on how Distinct eliminates
> > duplicates. Given the order preservation requirement one can argue that
> > Distinct should always keep the first occurrence of each duplicate in
> > the input, but I don't think it's in the spec.
> >
> > Am I missing something?
> > Pavel
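
The ambiguity described in the quoted message can be checked with a small simulation. The sketch below is not part of the thread and does not use any SPARQL engine. It is plain Python, the helper name distinct_outcomes is hypothetical, and it assumes Distinct may keep any one occurrence of each duplicate as long as the relative order of the solutions it keeps is unchanged.

```python
from itertools import product

# Rows of the VALUES block in the quoted query, as (?a, ?b) pairs.
rows = [(1, 1), (1, 2), (2, 3), (1, 4), (2, 5)]

# Step 1: ORDER BY DESC(?b). Step 2: project ?a.
ordered = sorted(rows, key=lambda r: r[1], reverse=True)
projected = [a for a, _ in ordered]


def distinct_outcomes(seq):
    """Hypothetical helper: every sequence obtainable by keeping exactly one
    occurrence of each value while preserving the order of the kept items."""
    positions = {}
    for index, value in enumerate(seq):
        positions.setdefault(value, []).append(index)
    outcomes = set()
    # Choose one surviving position per distinct value, in every possible way.
    for choice in product(*positions.values()):
        outcomes.add(tuple(seq[i] for i in sorted(choice)))
    return outcomes


outcomes = distinct_outcomes(projected)

# "Keep the first occurrence" policy (dict preserves insertion order).
keep_first = tuple(dict.fromkeys(projected))

# Extended example: LIMIT 1 applied after Distinct.
limited = {outcome[:1] for outcome in outcomes}

print(projected)         # sequence after Order By and Projection
print(sorted(outcomes))  # all order-preserving Distinct results
print(keep_first)        # result under the keep-first policy
print(sorted(limited))   # possible results with LIMIT 1
```

Under these assumptions the simulation yields both orderings mentioned in the message, while the keep-first policy pins the result to a single sequence. With LIMIT 1 the two admissible outcomes return different rows.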
Received on Sunday, 9 October 2022 19:40:56 UTC