[Fwd: Unexpected DISTINCT?] from Seaborne, Andy on 2007-02-26 (public-rdf-dawg@w3.org from January to March 2007)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Mon, 26 Feb 2007 16:03:08 +0000
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <45E304BC.6040307@hp.com>

The modifier order is:

     * 9.1 ORDER BY
     * 9.2 Projection
     * 9.3 DISTINCT
     * 9.4 OFFSET
     * 9.5 LIMIT

so the test is correct.

We don't document anywhere (IIRC) anything about auto DISTINCT.

DISTINCT is applied after ORDER BY.  The ORDER BY step emits [1, 1, 2, ...], so

limit(
    distinct([1, 1, 2, ...]),
    2)
    = [1, 2]
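
To make the modifier order above concrete, here is a minimal Python sketch.  It
is only an illustration, not spec text: the function name apply_modifiers, its
parameters, and the sample data are made up for this example, and solutions are
modelled as plain dicts.

```python
def apply_modifiers(solutions, order_by, project, distinct=False,
                    offset=0, limit=None):
    """Apply the solution modifiers in the order listed above (hypothetical helper)."""
    # 9.1 ORDER BY: sort the solutions on a single variable.
    rows = sorted(solutions, key=lambda s: s[order_by])
    # 9.2 Projection: keep only the selected variables.
    rows = [tuple(s[v] for v in project) for s in rows]
    # 9.3 DISTINCT: drop duplicates, keeping the first occurrence of each row.
    if distinct:
        rows = list(dict.fromkeys(rows))
    # 9.4 OFFSET: skip the first `offset` rows.
    rows = rows[offset:]
    # 9.5 LIMIT: keep at most `limit` rows.
    if limit is not None:
        rows = rows[:limit]
    return rows


# Illustrative data only: two solutions bind ?v to 1.
solutions = [{"v": 2}, {"v": 1}, {"v": 1}, {"v": 3}]

without_distinct = apply_modifiers(solutions, "v", ["v"], limit=2)
with_distinct = apply_modifiers(solutions, "v", ["v"], distinct=True, limit=2)

print(without_distinct)  # [(1,), (1,)]
print(with_distinct)     # [(1,), (2,)]
```

Because LIMIT comes last, it sees the duplicates unless DISTINCT was asked for,
which is why the two cases differ.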


The question of implicit DISTINCT remains.

Any opinions on whether we should say anything about implicit DISTINCT for
simple entailment (which is all we define SPARQL for)?  Because DISTINCT comes
after projection, there are several ways to get duplicates, all of which are
well-defined within BGP matching (blank nodes for simple entailment matches),
the algebra (UNION), and projection.  Projection alone suggests to me that we
should not define implicit DISTINCT, and should leave it to implementations to
provide as an extra, but I don't have a strong opinion to that effect.
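
As a rough illustration of two of those duplicate sources (projection and
UNION), here is a small Python sketch.  The helpers project and union and the
sample bindings are hypothetical, chosen only for this example; solutions are
modelled as dicts and a solution sequence as a list.

```python
def project(solutions, variables):
    """Keep only the named variables in each solution (hypothetical helper)."""
    return [{v: s[v] for v in variables if v in s} for s in solutions]


def union(left, right):
    """Concatenate two solution sequences without removing duplicates."""
    return left + right


# Two distinct solutions that differ only in ?y ...
matches = [{"x": "a", "y": 1}, {"x": "a", "y": 2}]
# ... become identical once ?y is projected away.
projected = project(matches, ["x"])

# Both branches of a UNION can match the same binding for ?x.
unioned = union([{"x": "a"}], [{"x": "a"}])

print(projected)  # [{'x': 'a'}, {'x': 'a'}]
print(unioned)    # [{'x': 'a'}, {'x': 'a'}]
```

In both cases the duplicates are a well-defined consequence of the earlier
steps, so removing them silently would change the result.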

	Andy

-------- Original Message --------
Subject: Unexpected DISTINCT?
Resent-Date: Sun, 25 Feb 2007 17:58:17 +0000
Resent-From: public-rdf-dawg-comments@w3.org
Date: Sat, 24 Feb 2007 23:27:53 -0800
From: Richard Newman <rnewman@franz.com>
To: public-rdf-dawg-comments@w3.org


DAWG,

    I have an implementation question for which I cannot find an
answer in the spec.

    Given a SELECT query for which some results are duplicated, and
which does not specify DISTINCT, is it acceptable for an
implementation to return DISTINCT (or partially DISTINCT) results?

    This is exercised by
<http://www.w3.org/2001/sw/DataAccess/tests/#modifer-limit>:

- with no DISTINCT processing, the results are [ 1, 1 ].
- with DISTINCT processing, the results are [ 1, 2 ].

    I seem to recall from informal sources that this is acceptable,
but it would be good to get a firm documented answer, particularly
when I can see that this could be contentious.

    Thanks,

-R

Received on Monday, 26 February 2007 16:03:59 UTC