From: Steve Harris <steve.harris@garlik.com>

Date: Fri, 2 Mar 2012 14:56:25 +0000

Cc: SPARQL Working Group <public-rdf-dawg@w3.org>

Message-Id: <F3C9C7FE-D517-4C61-A72F-B1E04D09AAD3@garlik.com>

To: Andy Seaborne <andy.seaborne@epimorphics.com>

Date: Fri, 2 Mar 2012 14:56:25 +0000

Cc: SPARQL Working Group <public-rdf-dawg@w3.org>

Message-Id: <F3C9C7FE-D517-4C61-A72F-B1E04D09AAD3@garlik.com>

To: Andy Seaborne <andy.seaborne@epimorphics.com>

I have a concern that this approach will increase the barrier to implementation by too much. It's already hard enough that a couple of systems (4store and Redland for e.g.) are unlikely to implement PPs, even in the medium term. This may mean even more systems never end up with a complete SPARQL 1.1 implementation. Not necessarily a critical problem, but worth noting. - Steve On 2012-03-02, at 13:59, Andy Seaborne wrote: > This message gives the general outline of changes for property paths - it's not detail-complete. > > The next step should be tests for nasty corner cases. > > == Defining *, +, {*} and {+} > > The algebra operator will be: > > ZeroOrMorePath1 > OneOrMorePath1 > ZeroOrMorePathN > OneOrMorePathN > > The current * and + become {*} and {+} which use > 1 ALP(x:term, path, R:multiset of RDF terms , V:set of RDF terms) = > 2 if ( x in V ) return > 3 add x to V > 4 add x to R > 5 X = eval(x,path) > 6 For n:term in X > 7 ALP(n, path, R, V) > 8 End > 9 remove x from V > > V is the set of nodes considered visited. > A node is added to V on entry and removed on exit. > > So V is the set of nodes in the current path, not the set of nodes visited through the recursive application of ALP. > > Simply removing line 9 changes the function to track all nodes visited. > You don't need R and V then, so remove V and line 3: > > 1 ALP1(x:term, path, R:set of RDF terms) = > 2 if ( x in R ) return > 3 add x to R > 4 X = eval(x,path) > 5 For n:term in X > 6 ALP1(n, path, R) > 7 End > > The entry step to ALP1 for OneOrMorePath1 needs aligning. > > The nodes recorded are just those the subpath ends on. So if the subpath is :p/:q then the overall path can loop back through the gap between a :p and :q. This makes sense to me - its working over a (non-RDF) graph induced by only considering :p/:q as links. > > == Spec'ing DISTINCT(path) > > The spec'ing issue is whether defining DISTINCT(path) in terms of a SELECT DISTINCT sub-query is sufficient or should we explain what distinct'ness means for each of the graph operators and provide an algorithm for each. > > Some operators are naturally distinct anyway (e.g. a reverse path something distinct is distinct). > > The non-fundamental path operators are: > > X ^path Y > X path1 / path2 Y > X path1 | path2 Y > X path{n} Y > X path{n,m} Y > X path{n,} Y > X path{0,} Y > X path{,n} Y > > The way to define as a re-write is e.g. > ?x :p/DISTINCT(path)/:q ?y > ==> > ?x :p ?_v1 . > SELECT DISTINCT ?_v1 ?_v2 { ?_v1 path ?_v2 } > ?_v2 :q ?y . > > This could even be done for * and + based on {*} and {+} but I think spelling out an algorithm is much more helpful. > > The alternative is a table defining DISTINCT(path) for path operator. > > Andy > > -- Steve Harris, CTO, Garlik Limited 1-3 Halford Road, Richmond, TW10 6AW, UK +44 20 8439 8203 http://www.garlik.com/ Registered in England and Wales 0535 7233 VAT # 849 0517 11 Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZReceived on Friday, 2 March 2012 14:57:14 GMT

*
This archive was generated by hypermail 2.3.1
: Tuesday, 26 March 2013 16:15:47 GMT
*