Property path summary and ways forward

1/ There are two use cases for arbitrary length paths
UC1: Finding or testing for connectivity of nodes: e.g. rdfs:subClassOf*
UC2: Coping with variable structures and aggregation


2/ Property paths can be expensive, notably in UC2.

3/ We want to leave open the possibility of path lengths

(foaf:knows* is not a clear cut example - your LinkedIn graph is not 
simple connectivity)

4/ PP transforms are used to define property path evaluation
leaving only ZeroLengthPath, ZeroOrMorePath, OneOrMorePath, and 
NegatedPropertySet to be defined in the evaluation semantics.

-----

DISTINCT is one way forward.

?x DISTINCT(path) ?y .

is the pairs of (?x,?y) connected by path.

?x DISTINCT(path*) ?y is connectivity.
?x path* ?y is counting as per current spec.

-----

A minimal change is a pair of new operators {*} and {+} which are the 
analogous to * and +.

{*} and {+} are counting
* and + are not (a change to 2LC).

(This is based on guessing {?length} is a future path length syntax - 
see also discussion of {N,M}

This is less than ?x DISTINCT(path) ?y
It can be "as well as".

Roughly: the distinct * and + are achieved by changing ALP to remove the 
last line "remove x from V".  We need to check the edge cases still.

It's this state management operation that controls whether the test for 
termination is along the current path (counting) or whether the function 
has visited the node at all.

-----
For {*}:

{N,M} is untouched.
{0,*} is defined to be equivalent to {*} (i.e. still counting)

{N,M} can be very expensive (think {0,N} in a clique of N nodes) but it 
is syntax for ways to write the pattern as it would be in SPARQL 1.0

?X:p{1,M} ?Y is

   ?X :p ?Y .
UNION
   ?x :p ?a1 . ?a1 :p Y
UNION
   ?x :p ?a1 . ?a1 :p ?a2 . ?a2 :p ?Y
...
   ?x :p ?a1 . ?a1 :p ?a2 . ?a(M-1) :p ?Y

 Andy

Received on Monday, 27 February 2012 21:50:24 UTC