# Defining property paths

```This message gives the general outline of changes for property paths -
it's not detail-complete.

The next step should be tests for nasty corner cases.

== Defining *, +, {*} and {+}

The algebra operator will be:

ZeroOrMorePath1
OneOrMorePath1
ZeroOrMorePathN
OneOrMorePathN

The current * and + become {*} and {+} which use
1  ALP(x:term, path, R:multiset of RDF terms , V:set of RDF terms) =
2     if ( x in V ) return
5     X = eval(x,path)
6     For n:term in X
7         ALP(n, path, R, V)
8         End
9     remove x from V

V is the set of nodes considered visited.
A node is added to V on entry and removed on exit.

So V is the set of nodes in the current path, not the set of nodes
visited through the recursive application of ALP.

Simply removing line 9 changes the function to track all nodes visited.
You don't need R and V then, so remove V and line 3:

1  ALP1(x:term, path, R:set of RDF terms) =
2     if ( x in R ) return
4     X = eval(x,path)
5     For n:term in X
6         ALP1(n, path, R)
7         End

The entry step to ALP1 for OneOrMorePath1 needs aligning.

The nodes recorded are just those the subpath ends on.  So if the
subpath is :p/:q then the overall path can loop back through the gap
between a :p and :q.  This makes sense to me - its working over a
(non-RDF) graph induced by only considering :p/:q as links.

== Spec'ing DISTINCT(path)

The spec'ing issue is whether defining DISTINCT(path) in terms of a
SELECT DISTINCT sub-query is sufficient or should we explain what
distinct'ness means for each of the graph operators and provide an
algorithm for each.

Some operators are naturally distinct anyway (e.g. a reverse path
something distinct is distinct).

The non-fundamental path operators are:

X ^path Y
X path1 / path2 Y
X path1 | path2 Y
X path{n} Y
X path{n,m} Y
X path{n,} Y
X path{0,} Y
X path{,n} Y

The way to define as a re-write is e.g.
?x :p/DISTINCT(path)/:q ?y
==>
?x :p ?_v1 .
SELECT DISTINCT ?_v1 ?_v2 { ?_v1 path ?_v2 }
?_v2 :q ?y .

This could even be done for * and + based on {*} and {+} but I think
spelling out an algorithm is much more helpful.

The alternative is a table defining DISTINCT(path) for path operator.

Andy
```

Received on Friday, 2 March 2012 14:00:13 UTC