- From: Pavel Klinov <pavel.klinov@gmail.com>
- Date: Sun, 9 Oct 2022 21:43:05 +0200
- To: Miguel <miguel.ceriani@gmail.com>
- Cc: Andy Seaborne <andy@apache.org>, public-sparql-dev@w3.org
Hi Miguel,

Yeah, I agree. Actually this issue was known (long?) before SPARQL; here's a great writeup in the SQL context:
https://blog.jooq.org/how-sql-distinct-and-order-by-are-related/

Cheers,
Pavel

On Sat, Oct 8, 2022 at 2:01 PM Miguel <miguel.ceriani@gmail.com> wrote:
>
> Hi,
> I noticed this issue some time ago too and tested that different engines give different results.
> I do not think this is a simple matter of wording (consider that ordering by a variable that is not projected makes perfect sense; the issue arises only when DISTINCT is also used).
>
> Miguel
>
> On Sat, 8 Oct 2022, 13:34 Andy Seaborne, <andy@apache.org> wrote:
>>
>> Hi Pavel,
>>
>> Hmm - that text isn't good. The intent is, I guess, when
>> distinct/project are same set of expressions as the orderby.
>>
>> I've recorded the issue
>>
>> https://www.w3.org/2013/sparql-errata#errata-query-20
>>
>> Andy
>>
>> On 07/10/2022 15:38, Pavel Klinov wrote:
>> > Hi all,
>> >
>> > Sorry if this is known, but it was a little surprise to me. Even though
>> > both Projection and Distinct solution modifiers are required to
>> > preserve the order of solutions imposed by Order By (see 18.5), one
>> > can construct an example where the final query results would be
>> > undefined:
>> >
>> > SELECT DISTINCT ?a {
>> >   VALUES (?a ?b) { (1 1) (1 2) (2 3) (1 4) (2 5) }
>> > }
>> > ORDER BY DESC(?b)
>> >
>> > Solution modifiers are applied in the order of: Order By -> Projection
>> > -> Distinct (15). So after the projection, the solution sequence is:
>> > ?a -> 2, ?a -> 1, ?a -> 2, ?a -> 1, ?a -> 1.
>> >
>> > Now, the Distinct is only required to keep this order but it's free to
>> > remove any of the duplicate ?a -> 2 or ?a -> 1 solutions. So the final
>> > results could be either ?a -> 2, ?a -> 1 or ?a -> 1, ?a -> 2. Note
>> > that both solution sequences preserve the Order By order!
>> >
>> > It's easy to make an extended example with LIMIT where the results
>> > could be completely different based on how Distinct eliminates
>> > duplicates. Given the order preservation requirement one can argue that
>> > Distinct should always keep the first occurrence of each duplicate in
>> > the input, but I don't think it's in the spec.
>> >
>> > Am I missing something?
>> > Pavel
>> >
>> >
>>
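
For anyone who wants to check the argument in the quoted thread, here is a minimal Python sketch. It is a simulation under stated assumptions and not how any SPARQL engine actually evaluates the query: `rows` mirrors the VALUES block, the helper names are hypothetical, and Distinct is modelled as "keep exactly one occurrence of each value, without reordering the survivors", which is how the thread reads the order-preservation requirement.

```python
from itertools import product

# (?a, ?b) pairs from the VALUES block of the quoted query.
rows = [(1, 1), (1, 2), (2, 3), (1, 4), (2, 5)]


def order_then_project(rows):
    """Apply ORDER BY DESC(?b), then project ?a only."""
    ordered = sorted(rows, key=lambda r: r[1], reverse=True)
    return [a for a, _b in ordered]


def possible_distinct_results(seq):
    """Enumerate every result an order-preserving Distinct could return.

    Assumption: Distinct may keep any single occurrence of each
    duplicated value, as long as the kept solutions stay in input order.
    """
    positions = {}
    for i, x in enumerate(seq):
        positions.setdefault(x, []).append(i)
    results = set()
    # Pick one surviving position per distinct value.
    for choice in product(*positions.values()):
        kept = sorted(choice)
        results.add(tuple(seq[i] for i in kept))
    return results


projected = order_then_project(rows)
candidates = possible_distinct_results(projected)
# Hypothetical extension from the thread: the same query with LIMIT 1.
with_limit_1 = {result[:1] for result in candidates}

print(projected)             # [2, 1, 2, 1, 1]
print(sorted(candidates))    # [(1, 2), (2, 1)]
print(sorted(with_limit_1))  # [(1,), (2,)]
```

Under this reading both orders are admissible, and with LIMIT 1 the query can return either ?a -> 1 or ?a -> 2. Keeping only the first occurrence of each value (the rule suggested in the thread) would leave (2, 1) as the single possible result.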
Received on Sunday, 9 October 2022 19:43:29 UTC