Re: Comments about the semantics of property paths from Andy Seaborne on 2012-02-06 (public-rdf-dawg@w3.org from January to March 2012)

From: Andy Seaborne <andy.seaborne@epimorphics.com>
Date: Mon, 06 Feb 2012 21:52:34 +0000
To: public-rdf-dawg@w3.org
Message-ID: <4F304BA2.8000107@epimorphics.com>

On 02/02/12 21:01, Axel Polleres wrote:
> To get the discussion going, my personal opinion on that one is as follows:
>
>   a) I also think this is new and relevant information.
>   b) I think it would be important to point to the fact that a naive implementation
>      of property paths may become very inefficient/blowing up on even relatively harmless-looking examples, and that Property PATHs
>      wrapped into DISTINCT subqueries can be evaluated more efficiently ... I'd be even more than happy
>      to point in an informal reference to their work, however I feel honestly very uncomfortable with their
>      title.
>   c) As for their conclusion, proposing a default semantics that uses distinct paths semantics,
>      whereas a separate keyword ALL-PATHS would be indicating the current semantics:
>      it seems (looking at their results in Section 7.1) that this is orthogonally possible with our current semantics
>      by just wrapping any TriplesSameSubjectPath containing a property path into a DISTINCT subquery.
>      It seems their result in section 7.1 indicates something along these lines, but I need some help there:
>      admittedly don't have a formal proof for this equivalence yet (to be cautious, I am not yet 100% clear how/whether there is
>      any interference possible with bnodes within TriplesSameSubjectPath and duplicates coming from those bnodes)
>
> This all said, I unfortunately haven't had the time yet to check all their claims in all detail.
>
> Axel

I agree with your comment about the title.  To me, the important
features SPARQL 1.1 adds are aggregation and grouping, and SPARQL
Update; they matter more than property paths.  At the extreme, even if
property paths don't find acceptance, I think the evidence is already
in that SPARQL 1.1 is being adopted; aggregation and grouping are
drivers (as is SPARQL Update) IMO.  This reflects a difference of
approach: use case driven versus theory driven.  We started with a
features and requirements analysis.

The original property path design involved unique solutions (it had 
various problems) but even then the WG was concerned with the use of 
property paths to meet cases where duplicates matter.  There was also 
the matter of not obstructing path lengths.

In our reply last time, we illustrated how duplicates are important in 
the presence of aggregation.

We also pointed to the use of DISTINCT to remove duplicates by sub-query 
[1].
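
As a rough illustration of the two points above (duplicates feeding an
aggregate, and DISTINCT removing them), here is a minimal Python sketch.
The data, the names "component" and "item", and the helper functions are
all hypothetical; this is not the example from the earlier reply, and it
only mimics an "all paths" reading of component*/item on acyclic data.

```python
from collections import Counter

# Hypothetical triples (subject, predicate, object); not from the original thread.
TRIPLES = [
    ("order1", "component", "frame"),
    ("order1", "component", "wheel"),
    ("frame", "item", "bolt"),
    ("wheel", "item", "bolt"),
    ("wheel", "item", "tyre"),
]


def objects(subject, predicate):
    """Return every object of the given subject/predicate pair."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def component_star(start):
    """Yield one node per route of zero or more 'component' edges.

    Assumes the data is acyclic, so plain recursion terminates.
    """
    yield start
    for nxt in objects(start, "component"):
        yield from component_star(nxt)


def items_all_paths(order):
    """Mimic order component*/item, keeping one solution per path."""
    return [item
            for node in component_star(order)
            for item in objects(node, "item")]


bag = items_all_paths("order1")
per_item_counts = Counter(bag)       # what a COUNT-style aggregate would see
distinct_items = sorted(set(bag))    # what a DISTINCT wrapper would leave

print(per_item_counts)
print(distinct_items)
```

With duplicates kept, the count for "bolt" reflects that it is reached
through two components; after the DISTINCT step that information is
gone, which is why the choice matters for aggregation.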

Let's start this discussion by assuming that there are expensive 
path/data combinations and deciding if that is a major concern.

On their investigation of ARQ - graphs with large cliques are not
something I've put any time into.  Use cases are typically for short
chains, and dealing with variability of connections.
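
To give a feel for why large cliques are the expensive case, here is a
small Python sketch that counts the simple paths between two nodes of a
complete graph.  It assumes an evaluation that treats every simple path
as a separate solution; that is a simplification for illustration, not
the exact procedure in the draft and not how ARQ is implemented.  The
function name is hypothetical.

```python
def count_simple_paths(n, src=0, dst=1):
    """Count simple paths from src to dst in the complete graph on n nodes.

    Nodes are numbered 0..n-1 and every pair of distinct nodes is connected.
    """
    def walk(node, visited):
        if node == dst:
            return 1
        total = 0
        for nxt in range(n):
            if nxt not in visited:
                total += walk(nxt, visited | {nxt})
        return total

    return walk(src, frozenset({src}))


# The count grows very quickly with clique size, while the number of
# distinct (src, dst) answers stays at one.
for n in range(2, 9):
    print(n, count_simple_paths(n))
```

A short chain, by contrast, has a single path between its end points, so
the two semantics give the same number of solutions there.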

The order/component*/item example we gave last time is important - 
having aggregation work intuitively is something I think we need to 
weight highly.

     Andy

[1] 
http://lists.w3.org/Archives/Public/public-rdf-dawg-comments/2011Feb/0005.html

Received on Monday, 6 February 2012 21:55:39 UTC