Re: today's SPARQL agenda from Steve Harris on 2012-03-20 (public-rdf-dawg@w3.org from January to March 2012)

From: Steve Harris <steve.harris@garlik.com>
Date: Tue, 20 Mar 2012 14:42:55 +0000
To: birte.glimm@uni-ulm.de
Cc: Andy Seaborne <andy.seaborne@epimorphics.com>, public-rdf-dawg@w3.org
Message-Id: <41BB95EC-388E-4DDD-A805-4A3C7F802A9B@garlik.com>
That was how I guessed it worked, as I'd not seen any examples with DISTINCT() embedded in a path.

- Steve

On 2012-03-20, at 14:11, Birte Glimm wrote:

> Well, one could also say that DISTINCT can only be put around the
> complete path, e.g.,
> { ?X :a/DISTINCT(:c*)/:b*  ?Y } is invalid, but
> { ?X DISTINCT(:a/:c*/:b*)  ?Y } is ok
> 
> Birte
> 
> On 20 March 2012 15:09, Andy Seaborne <andy.seaborne@epimorphics.com> wrote:
>> 
>> 
>> On 20/03/12 13:48, Steve Harris wrote:
>>> 
>>> Though I haven't implemented it, so there's a lot of guesswork going on, I
>>> don't think I agree.
>>> 
>>> (2) adds extra syntax, and allows for a mixture of approaches in one path.
>> 
>> 
>> So does just DISTINCT(path) if I understand it:
>> e.g.
>> 
>> { ?X :a/DISTINCT(:c*)/:b*  ?Y }
>> 
>>        Andy
>> 
>> 
>> 
>>> My preferred implementation, if both were supported, would be to
>>> have one counting and one non-counting implementation, and hand off to
>>> the appropriate one - this generally makes optimisations easier, cf.
>>> queries with DISTINCT/REDUCED vs. those without.
>>> 
>>> 
>>> - Steve
>>> 
>>> On 2012-03-20, at 10:48, Andy Seaborne wrote:
>>> 
>>>> Lee - Excellent summary.
>>>> 
>>>> A small point:
>>>> 
>>>> The implementation issues of (3) are the same as for (2) because
>>>> DISTINCT(path*) is non-counting path*.  It's currently mentioned in the
>>>> editors' working draft.  We could even define it that way, but I have
>>>> currently singled it out so we can help implementers by giving an algorithm
>>>> which does significantly better than distinct-of-counting-path.
>>>> 
>>>>        Andy
>>>> 
>>>> On 20/03/12 05:01, Lee Feigenbaum wrote:
>>>>> 
>>>>> Please note that the call today is still one hour earlier for folks not
>>>>> in the US.
>>>>> 
>>>>> Sorry for the late agenda.
>>>>> 
>>>>> Let's attack the property paths issue today and see if we have any
>>>>> consensus on how to proceed.
>>>>> 
>>>>> Our options that we need to choose from are (I've tried to summarize
>>>>> pros (+) and cons (-) below, but apologize if I've mis-represented or
>>>>> mis-characterized anything):
>>>>> 
>>>>> 1) Leave as is, no change
>>>>> + Does not require a new last call
>>>>> + Does not open up any potential _new_ issues from aspects of new
>>>>> design(s)
>>>>> - Almost surely results in a formal objection from the various
>>>>> commenters about the counting property path execution complexity issue
>>>>> - Potentially hamstrings some use cases of property paths, depending on
>>>>> whether all non-counting pp instances can be rewritten as SELECT
>>>>> DISTINCT subqueries
>>>>> 
>>>>> 
>>>>> 2) Add DISTINCT(path) and {+}/{*} operators
>>>>> + Addresses the commenters' concerns (as per informal discussions with
>>>>> some of them offlist)
>>>>> + Gives query authors significant expressivity in choosing the path
>>>>> counting semantics vs. performance tradeoff they want
>>>>> - Requires a new last call
>>>>> - Raises the burden to implement property paths
>>>>> - New design may have unknown interactions between counting and
>>>>> non-counting operators in the same path
>>>>> 
>>>>> 
>>>>> 3) Add DISTINCT(path) only
>>>>> + Addresses the commenters' concerns (as per informal discussions with
>>>>> some of them offlist)
>>>>> + Gives query authors some expressivity in choosing the path counting
>>>>> semantics vs. performance tradeoff they want
>>>>> - Requires a new last call
>>>>> - Raises the burden to implement property paths (but not as much as #2)
>>>>> 
>>>>> 
>>>>> 4) Add {+}/{*} counting/not-counting operators only
>>>>> + Gives query authors significant expressivity in choosing the path
>>>>> counting semantics vs. performance tradeoff they want
>>>>> - Requires a new last call
>>>>> - Raises the burden to implement property paths (but not as much as #2)
>>>>> - Likely results in a formal objection from the various commenters about
>>>>> the counting property path execution complexity issue (based on offlist
>>>>> discussion of this option)
>>>>> - New design may have unknown interactions between counting and
>>>>> non-counting operators in the same path
>>>>> 
>>>>> 5) Mark property paths as non-normative
>>>>> +/- Not sure if this requires a new last call
>>>>> + Lowers implementation burden
>>>>> - Removes a significant feature from SPARQL 1.1 Query
>>>>> - May lead to formal objections within the working group
>>>>> 
>>>>> Lee
>>>>> 
>>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Jun. Prof. Dr. Birte Glimm            Tel.:    +49 731 50 24125
> Inst. of Artificial Intelligence         Secr:  +49 731 50 24258
> University of Ulm                         Fax:   +49 731 50 24188
> D-89069 Ulm                               birte.glimm@uni-ulm.de
> Germany
> 

-- 
Steve Harris, CTO
Garlik, a part of Experian
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203  http://www.garlik.com/
Registered in England and Wales 0535 7233 VAT # 849 0517 11
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ
Received on Tuesday, 20 March 2012 14:43:40 UTC
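
To illustrate the counting vs. non-counting distinction discussed in this thread, here is a minimal sketch in Python. It assumes a simplified counting semantics (one solution per simple path, i.e. no node repeated within a path), which is not necessarily the algorithm in the editors' working draft. The graph, the node names, and the function names are all made up for the example.

```python
from collections import Counter, deque

# Hypothetical diamond-shaped graph for a single predicate :c.
# Each pair (s, o) stands for the triple  s :c o .
EDGES = {("a", "b"), ("a", "d"), ("b", "e"), ("d", "e")}


def successors(node):
    """Nodes reachable from `node` in one :c step."""
    return sorted(o for s, o in EDGES if s == node)


def counting_star(start):
    """Evaluate :c* from `start`, one solution per simple path (assumed semantics).

    Returns a Counter mapping each end node to the number of simple paths
    (including the zero-length path) that reach it.
    """
    results = Counter()

    def walk(node, visited):
        results[node] += 1
        for nxt in successors(node):
            if nxt not in visited:  # never revisit a node within one path
                walk(nxt, visited | {nxt})

    walk(start, frozenset({start}))
    return results


def non_counting_star(start):
    """Evaluate :c* from `start` as plain reachability (breadth-first search)."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in successors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


if __name__ == "__main__":
    print(counting_star("a"))      # "e" is reached by two paths
    print(non_counting_star("a"))  # each node appears once
```

Under these assumptions, taking the distinct end nodes of the counting result gives the same answer as the reachability search, but the reachability search visits each node once, whereas the path enumeration can blow up on densely connected graphs. That is the sense in which a dedicated non-counting algorithm can do better than distinct-of-counting-path.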