- From: Steve Harris <steve.harris@garlik.com>
- Date: Fri, 2 Mar 2012 14:56:25 +0000
- To: Andy Seaborne <andy.seaborne@epimorphics.com>
- Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
I have a concern that this approach will increase the barrier to implementation by too much.
It's already hard enough that a couple of systems (4store and Redland for e.g.) are unlikely to implement PPs, even in the medium term. This may mean even more systems never end up with a complete SPARQL 1.1 implementation.
Not necessarily a critical problem, but worth noting.
- Steve
On 2012-03-02, at 13:59, Andy Seaborne wrote:
> This message gives the general outline of changes for property paths - it's not detail-complete.
>
> The next step should be tests for nasty corner cases.
>
> == Defining *, +, {*} and {+}
>
> The algebra operator will be:
>
> ZeroOrMorePath1
> OneOrMorePath1
> ZeroOrMorePathN
> OneOrMorePathN
>
> The current * and + become {*} and {+} which use
> 1 ALP(x:term, path, R:multiset of RDF terms , V:set of RDF terms) =
> 2 if ( x in V ) return
> 3 add x to V
> 4 add x to R
> 5 X = eval(x,path)
> 6 For n:term in X
> 7 ALP(n, path, R, V)
> 8 End
> 9 remove x from V
>
> V is the set of nodes considered visited.
> A node is added to V on entry and removed on exit.
>
> So V is the set of nodes in the current path, not the set of nodes visited through the recursive application of ALP.
>
> Simply removing line 9 changes the function to track all nodes visited.
> You don't need R and V then, so remove V and line 3:
>
> 1 ALP1(x:term, path, R:set of RDF terms) =
> 2 if ( x in R ) return
> 3 add x to R
> 4 X = eval(x,path)
> 5 For n:term in X
> 6 ALP1(n, path, R)
> 7 End
>
> The entry step to ALP1 for OneOrMorePath1 needs aligning.
>
> The nodes recorded are just those the subpath ends on. So if the subpath is :p/:q then the overall path can loop back through the gap between a :p and :q. This makes sense to me - its working over a (non-RDF) graph induced by only considering :p/:q as links.
>
> == Spec'ing DISTINCT(path)
>
> The spec'ing issue is whether defining DISTINCT(path) in terms of a SELECT DISTINCT sub-query is sufficient or should we explain what distinct'ness means for each of the graph operators and provide an algorithm for each.
>
> Some operators are naturally distinct anyway (e.g. a reverse path something distinct is distinct).
>
> The non-fundamental path operators are:
>
> X ^path Y
> X path1 / path2 Y
> X path1 | path2 Y
> X path{n} Y
> X path{n,m} Y
> X path{n,} Y
> X path{0,} Y
> X path{,n} Y
>
> The way to define as a re-write is e.g.
> ?x :p/DISTINCT(path)/:q ?y
> ==>
> ?x :p ?_v1 .
> SELECT DISTINCT ?_v1 ?_v2 { ?_v1 path ?_v2 }
> ?_v2 :q ?y .
>
> This could even be done for * and + based on {*} and {+} but I think spelling out an algorithm is much more helpful.
>
> The alternative is a table defining DISTINCT(path) for path operator.
>
> Andy
>
>
--
Steve Harris, CTO, Garlik Limited
1-3 Halford Road, Richmond, TW10 6AW, UK
+44 20 8439 8203 http://www.garlik.com/
Registered in England and Wales 0535 7233 VAT # 849 0517 11
Registered office: Landmark House, Experian Way, Nottingham, Notts, NG80 1ZZ
Received on Friday, 2 March 2012 14:57:14 UTC