Re: Property path summary and ways forward from Axel Polleres on 2012-02-27 (public-rdf-dawg@w3.org from January to March 2012)

From: Axel Polleres <axel.polleres@deri.org>
Date: Mon, 27 Feb 2012 23:16:05 +0100
To: Andy Seaborne <andy.seaborne@epimorphics.com>
Cc: "SPARQL Working Group" <public-rdf-dawg@w3.org>
Message-Id: <0D7A5BAC-4049-4BC7-9D14-747AEF8F1027@deri.org>

Hi Andy,

Two questions for clarification:

> A minimal change is a pair of new operators {*} and {+} which are the
> analogous to * and +.
> 
> {*} and {+} are counting
> * and + are not (a change to 2LC).


1) So, would e.g. :p{*}/:q* mix non-counting and counting?
2) Is this meant on top or alternatively to DISTINCT() ?

best,
Axel

On 27 Feb 2012, at 22:49, Andy Seaborne wrote:

> 1/ There are two use cases for arbitrary length paths
> UC1: Finding or testing for connectivity of nodes: e.g. rdfs:subClassOf*
> UC2: Coping with variable structures and aggregation
> 
> 
> 2/ Property paths can be expensive, notably in UC2.
> 
> 3/ We want to leave open the possibility of path lengths
> 
> (foaf:knows* is not a clear cut example - your LinkedIn graph is not
> simple connectivity)
> 
> 4/ PP transforms are used to define property path evaluation
> leaving only ZeroLengthPath, ZeroOrMorePath, OneOrMorePath, and
> NegatedPropertySet to be defined in the evaluation semantics.
> 
> -----
> 
> DISTINCT is one way forward.
> 
> ?x DISTINCT(path) ?y .
> 
> is the pairs of (?x,?y) connected by path.
> 
> ?x DISTINCT(path*) ?y is connectivity.
> ?x path* ?y is counting as per current spec.
> 
> -----
> 
> A minimal change is a pair of new operators {*} and {+} which are the
> analogous to * and +.
> 
> {*} and {+} are counting
> * and + are not (a change to 2LC).
> 
> (This is based on guessing {?length} is a future path length syntax -
> see also discussion of {N,M}
> 
> This is less than ?x DISTINCT(path) ?y
> It can be "as well as".
> 
> Roughly: the distinct * and + are achieved by changing ALP to remove the
> last line "remove x from V".  We need to check the edge cases still.
> 
> It's this state management operation that controls whether the test for
> termination is along the current path (counting) or whether the function
> has visited the node at all.
> 
> -----
> For {*}:
> 
> {N,M} is untouched.
> {0,*} is defined to be equivalent to {*} (i.e. still counting)
> 
> {N,M} can be very expensive (think {0,N} in a clique of N nodes) but it
> is syntax for ways to write the pattern as it would be in SPARQL 1.0
> 
> ?X:p{1,M} ?Y is
> 
>    ?X :p ?Y .
> UNION
>    ?x :p ?a1 . ?a1 :p Y
> UNION
>    ?x :p ?a1 . ?a1 :p ?a2 . ?a2 :p ?Y
> ...
>    ?x :p ?a1 . ?a1 :p ?a2 . ?a(M-1) :p ?Y
> 
>         Andy
> 
>

Received on Monday, 27 February 2012 22:16:38 UTC