PP paths semantics

Hi Andy, others,
here's a suggestion for defining property paths. Why not define the
semantics of the various constructors in terms of standard SPARQL
operations? As far as I can see, only two constructors cannot be
expressed in standard SPARQL: paths of length 0 between two nodes, and
paths of arbitrary length greater than zero. The others that you call
complex translate to union queries, which I thought were legal, but
maybe I am wrong.

Let T be a term (variable, IRI, bnode, or literal), iri an IRI, ?x a
variable not occurring elsewhere in the query, and n, m natural numbers
(>0). All of these may carry a subscript, e.g., T_1 is a term too. A
basic property path is one of iri, ^iri, iri+, iri*, iri{n}, iri{n,m},
iri{n,}, iri{,m}. Property paths are defined recursively as follows: a
basic property path is a property path, and for property paths p, p_1,
and p_2, both p_1/p_2 and ^p are property paths. We define a function
simplify that translates a triple that uses a property path as
predicate into a group graph pattern.

triple              | simplify(triple)
T_1 iri T_2         | T_1 iri T_2
T_1 ^iri T_2        | T_2 iri T_1
*** T_1 iri+ T_2    | { T_1 iri+ T_2 } ***
T_1 iri* T_2        | { T_1 iri{0} T_2 } UNION { T_1 iri+ T_2 }
*** T_1 iri{0} T_2  | {} FILTER(T_1 = T_2) <- not working ***
T_1 iri{n} T_2      | { T_1 iri ?x_1 . ?x_1 iri ?x_2 . ... ?x_{n-1} iri T_2 }
T_1 iri{n,m} T_2    | { { T_1 iri{n} T_2 } UNION { T_1 iri{n+1} T_2 }
                    |   UNION ... UNION { T_1 iri{m} T_2 } }
T_1 iri{n,} T_2     | { T_1 iri{n} ?x . ?x iri* T_2 }
T_1 iri{,n} T_2     | { { T_1 iri{0} T_2 } UNION { T_1 iri{1,n} T_2 } }
T_1 p_1/p_2 T_2     | { simplify(T_1 p_1 ?x) . simplify(?x p_2 T_2) }
T_1 ^p T_2          | { simplify(T_2 p T_1) }

(not sure my ascii table will come across nicely formatted...)
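
In case it helps to see the rewriting spelled out, here is a minimal
Python sketch of simplify. It is only an illustration, not a proposal
for an implementation. The tuple encoding of paths ("iri", "inv",
"seq", "plus", "star", "rep") and the helper names union and exact are
made up for this example. Two readings are assumptions on my part:
iri{n,} is treated as iri{n} followed by iri*, and iri{,m} as the union
of the lengths 0 to m. The two primitive cases, iri+ and iri{0}, are
left in the output untouched.

```python
from itertools import count

def union(parts):
    """Wrap each alternative in braces and join the alternatives with UNION."""
    return " UNION ".join("{ " + p + " }" for p in parts)

def exact(s, iri, n, o, fresh):
    """iri{n}: n triples chained through fresh variables; iri{0} stays primitive."""
    if n == 0:
        return f"{s} {iri}{{0}} {o}"
    nodes = [s] + [f"?x_{next(fresh)}" for _ in range(n - 1)] + [o]
    return " . ".join(f"{a} {iri} {b}" for a, b in zip(nodes, nodes[1:]))

def simplify(s, path, o, fresh=None):
    """Return the body of a group graph pattern for the triple `s path o`."""
    if fresh is None:
        fresh = count(1)  # numbering for the fresh variables ?x_1, ?x_2, ...
    kind = path[0]
    if kind == "iri":    # T_1 iri T_2
        return f"{s} {path[1]} {o}"
    if kind == "inv":    # T_1 ^p T_2: swap subject and object
        return simplify(o, path[1], s, fresh)
    if kind == "seq":    # T_1 p_1/p_2 T_2: join through a fresh variable
        x = f"?x_{next(fresh)}"
        left = simplify(s, path[1], x, fresh)
        right = simplify(x, path[2], o, fresh)
        return f"{left} . {right}"
    if kind == "plus":   # primitive, not rewritten
        return f"{s} {path[1]}+ {o}"
    if kind == "star":   # iri* = iri{0} UNION iri+
        return union([exact(s, path[1], 0, o, fresh), f"{s} {path[1]}+ {o}"])
    if kind == "rep":    # ("rep", iri, lo, hi); hi=None encodes iri{lo,}
        _, iri, lo, hi = path
        if hi is None:   # assumption: iri{lo,} means iri{lo} followed by iri*
            x = f"?x_{next(fresh)}"
            head = exact(s, iri, lo, x, fresh)
            tail = simplify(x, ("star", iri), o, fresh)
            return f"{head} . {tail}"
        return union([exact(s, iri, k, o, fresh) for k in range(lo, hi + 1)])
    raise ValueError(f"unknown path constructor: {kind!r}")
```

For example, simplify("?a", ("seq", ("iri", ":p"), ("inv", ("iri",
":q"))), "?b") returns the string "?a :p ?x_1 . ?b :q ?x_1".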

The two rules marked with *** are, if I am not mistaken, not
expressible in standard SPARQL. My attempted translation of T_1 iri{0}
T_2 does not work, as I understand it: { } leaves T_1 and T_2 unbound
if they are variables, so the filter would not succeed for any
solution, which is not what is intended.
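
To make the problem concrete, here is a small Python model of that
translation. It is a sketch under simplifying assumptions, not engine
code: solutions are plain dicts, terms are strings, variables start
with "?", and the function names eval_empty_group and filter_equal are
invented for the example. It relies on two things I am fairly sure of:
the empty group pattern has exactly one solution with no bindings, and
a filter that hits an unbound variable raises an error, which removes
the solution.

```python
def eval_empty_group():
    """The empty group pattern {} has one solution, with no bindings."""
    return [{}]

def filter_equal(solutions, t1, t2):
    """Model of FILTER(t1 = t2) applied to a list of solutions.

    Variables (strings starting with '?') are looked up in the solution;
    other terms stand for themselves. An unbound variable makes the
    comparison an error, and such solutions are dropped.
    """
    kept = []
    for mu in solutions:
        v1 = mu.get(t1) if t1.startswith("?") else t1
        v2 = mu.get(t2) if t2.startswith("?") else t2
        if v1 is None or v2 is None:
            continue  # error on an unbound variable: solution is eliminated
        if v1 == v2:
            kept.append(mu)
    return kept

# Both ends are variables: nothing survives, although a zero-length
# path should presumably match something.
no_match = filter_equal(eval_empty_group(), "?s", "?o")

# Both ends are the same constant: the translation behaves as hoped.
same_constant = filter_equal(eval_empty_group(), ":a", ":a")
```

Here no_match is the empty list and same_constant is [{}], so the
translation only breaks down once a variable is involved.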

Implementations are of course free to evaluate the expressions as they
see fit, as long as the results are the same. I am not suggesting that
you should actually union all the expanded queries together for T_1
iri{n,m} T_2; this is really just to get the semantics straight. It
would at least leave only two cases to be defined.
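
As an illustration of that freedom, the following Python sketch
evaluates iri{n,m} from a fixed start node in two ways: once by
literally following the union of expansions, and once by walking the
graph a single time and collecting the frontier at every length from n
to m. The representation (a set of (subject, object) pairs for one
predicate) and the names step, reach_by_union and reach_by_frontier are
assumptions made for the example. It compares sets of end nodes only
and deliberately ignores how often a solution is repeated, so it says
nothing about duplicates.

```python
def step(frontier, edges):
    """Nodes reachable from `frontier` by following exactly one edge."""
    return {o for (s, o) in edges if s in frontier}

def reach_by_union(start, edges, n, m):
    """Follow the definition: evaluate iri{k} for each k in n..m, then union."""
    found = set()
    for k in range(n, m + 1):
        frontier = {start}
        for _ in range(k):
            frontier = step(frontier, edges)
        found |= frontier
    return found

def reach_by_frontier(start, edges, n, m):
    """Same end nodes, but with a single walk of at most m steps."""
    frontier, found = {start}, set()
    for length in range(1, m + 1):
        frontier = step(frontier, edges)
        if length >= n:
            found |= frontier
    return found

# A small chain a -> b -> c -> d for a single predicate.
edges = {("a", "b"), ("b", "c"), ("c", "d")}
union_result = reach_by_union("a", edges, 2, 3)
frontier_result = reach_by_frontier("a", edges, 2, 3)
```

Both calls give {"c", "d"} on this chain, which is the sense in which
the union is only a specification and not an evaluation strategy.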

Just a suggestion and maybe my use of unions is violating some
constraints that I am not aware of...
Birte




-- 
Dr. Birte Glimm, Room 306
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283529

Received on Friday, 28 May 2010 16:44:45 UTC