Re: definition of INDISTINCT from Seaborne, Andy on 2007-03-22 (public-rdf-dawg@w3.org from January to March 2007)

From: Seaborne, Andy <andy.seaborne@hp.com>
Date: Thu, 22 Mar 2007 18:24:11 +0000
To: Eric Prud'hommeaux <eric@w3.org>
Cc: Jeen Broekstra <j.broekstra@tue.nl>, public-rdf-dawg@w3.org
Message-ID: <4602C9CB.9030500@hp.com>
Eric Prud'hommeaux wrote:
> * Seaborne, Andy <andy.seaborne@hp.com> [2007-03-18 18:01+0000]
>>
>>
>> Eric Prud'hommeaux wrote:
>>> * Jeen Broekstra <j.broekstra@tue.nl> [2007-03-16 17:39+0100]
>>>> Alright, nitpicking a bit:
>>>>
>>>> Eric Prud'hommeaux wrote:
>>>>
>>>>> pursuant to
>>>>>  ACTION: ericP to draft text about a LOOSE keyword and run it by w3
>>>>>  folks to see if we're abusing the "at risk" mechanism.
>>>>> I drafted this section. It was slightly more awkward to not have an
>>>>> ALL to lean on, but I think this is pretty well defined:
>>>>>
>>>>> 9.4 INDISTINCT
>>>> When/where was this term introduced?
>>>>
>>>> If we decide to add this, I think I would actually prefer LOOSE:
>>>> INDISTINCT suggests (to me at least) that it is the opposite of DISTINCT
>>>> (which it is not; it would even be acceptable to have the same behavior
>>>> as DISTINCT).
>> I agree with Jeen.  LOOSE is better.
>>
>> INDETERMINATE is a bit long :-)
> 
> and LOOSE it is

In the telecon, we went for "REDUCED" didn't we?

http://www.w3.org/2007/03/20-dawg-minutes.html#action11

> 
>>>>> While the DISTINCT modifier ensures that duplicate solutions are
>>>>> eliminated from the solution set, INDISTINCT simply permits them to be
>>>>> eliminated. The cardinality of any set of variable bindings (solution)
>>>>> in an INDISTINCT solution set at least one and not more than the
>>>> ...*is* at least one...
>>> noted
>>>
>>>>> cardinality of the solution set with no DISTINCT or INDISTINCT
>>>>> modifier.
>>>> Perhaps better formulation would be to refer to the cardinality of the
>>>> solution set as prescribed by the algebra.
>>>>
>>>>> For example, the query
>>>>>
>>>>>  PREFIX foaf:    <http://xmlns.com/foaf/0.1/>
>>>>>  SELECT INDISTINCT ?name WHERE { ?x foaf:name ?name }
>>>>>
>>>>> may have one, two (shown here) or three solutions:
>>>>>  name
>>>>>  "Alice"
>>>>>  "Alice"
>>>> Of course, this only holds for a dataset which holds at least three
>>>> solutions for Alice, you might want to make that more explicit in this
>>>> paragraph (referring back to the example dataset explicitly?).
>>> I think that in context with the DISTINCT proposal, it's clear. See
>>> the attached HTML and tell me if you agree.
>> A thought: as DISTINCT and LOOSE are mutually exclusive, maybe 9.3.1 and 
>> 9.3.2 would be a better arrangement.  That would mean that the first part 
>> of 9.3, which is about queries without DISTINCT or LOOSE can go in 9.3 
>> intro.
>>
>> (Following that through, OFFSET and LIMIT could go together in a "slice" 
>> section - editorial - for CR?)
> 
> good idea. will do after LC publication
> 
>> I suggest adding text to say that queries with LOOSE may behave differently 
>> across different implementations.  I don't know how to say that in the 
>> conformance language which does not use "implementation".
>>
>> And presumably a LOOSE query may differ across requests of the same query 
>> at the same service, given the motivating example we have been using.
> 
> I've gone a small step further and said that LOOSE slicing may behave
> differently across different query executions.
> 
> [[
> Note that queries where LOOSE semantics have changed the cardinality
> may not be consistently sliced by the LIMIT and OFFSET modifiers.
> ]]

Better would be to say "queries using the REDUCED keyword may not be consistently 
sliced by the LIMIT and OFFSET modifiers." and not invoke anything.

Better still would be to say nothing about slicing. There is no *promise* 
about slicing across two different query requests anyway because the data may 
change.
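
To illustrate the slicing point, here is a minimal Python sketch, not spec
text. It assumes solutions are modelled as a plain list of names; the helper
names is_valid_reduced and slice_solutions and the "Bob" row are hypothetical
and not taken from the draft. It checks the drafted cardinality rule (each
solution appears at least once and no more often than without the modifier)
and shows two equally permitted answers giving different OFFSET/LIMIT slices.

```python
from collections import Counter


def is_valid_reduced(full, candidate):
    """Return True if candidate is a permitted REDUCED answer for full.

    Each solution in full must appear at least once in candidate and
    no more often than it appears in full (the unmodified result).
    """
    full_counts = Counter(full)
    cand_counts = Counter(candidate)
    # The candidate may not invent solutions or drop any entirely.
    if set(cand_counts) != set(full_counts):
        return False
    return all(1 <= cand_counts[s] <= full_counts[s] for s in full_counts)


def slice_solutions(solutions, offset, limit):
    """Apply OFFSET then LIMIT to an ordered list of solutions."""
    return solutions[offset:offset + limit]


# Unmodified result: three "Alice" rows as in the draft example,
# plus a hypothetical "Bob" row added for illustration.
full = ["Alice", "Alice", "Alice", "Bob"]

# Two answers that different executions might both legitimately return.
run_a = ["Alice", "Bob"]
run_b = ["Alice", "Alice", "Bob"]

assert is_valid_reduced(full, run_a)
assert is_valid_reduced(full, run_b)

# The same OFFSET 1 LIMIT 1 slice picks a different row in each run.
print(slice_solutions(run_a, 1, 1))  # ['Bob']
print(slice_solutions(run_b, 1, 1))  # ['Alice']
```

The same difference would arise without REDUCED if the data changed between
the two requests, which is why no promise about slicing exists in the first
place.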

> a bit informal, but good enough until an elegant wording, i think
> 
>> It was suggested that there would be "at risk" text.  This should include 
>> mention that queries using LOOSE will not necessarily extend to aggregate 
>> functions.
> 
> pink text in 1.61:
> [[
> The LOOSE feature is at risk. Queries where LOOSE semantics have
> changed the cardinality will not have any counting semantics and will
> therefore not be useful with aggregate functions added to SPARQL or
> performed on SPARQL result sets.
> ]]

Please do not make it conditional on whether REDUCED has changed the 
cardinality or not.  Just on whether REDUCED is used at all.  (It's also not 
true if the reduction is deterministic, like DISTINCT.)

	Andy
Received on Thursday, 22 March 2007 18:24:34 UTC