Re: [TF-PP] Scoping the design space from Paul Gearon on 2009-09-28 (public-rdf-dawg@w3.org from July to September 2009)

From: Paul Gearon <gearon@ieee.org>
Date: Mon, 28 Sep 2009 12:15:25 -0400
To: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <a25ac1f0909280915v4767560eh9f0df8675e6d6756@mail.gmail.com>
On Fri, Sep 25, 2009 at 10:35 AM, Seaborne, Andy <andy.seaborne@hp.com> wrote:
<snip/>
> The syntax (which ARQ does not implement) "?s ?p* ?o" could have two meanings : it must the same predicate that is being transitive or it could be a different predicate at each path step (connectivity finding).  I have no thoughts on how results from the latter would be handled practically.
>
> If we restrict to IRIs (no variables in path slot) for now, we haven't precluded adding them later but you can't write something that fins the property first:
>
> { ?p a :transitiveProperty
>  ?s ?p+ ?o
> }
>
>        Andy

This is an important use case, given that OWL lets you describe
transitivity as a property on a property.

Languages, like OWL, that let you describe transitivity on a property
let you write a single entailment rule that handles all of these
properties. Other systems require a rule per transitive property. Rule
systems can have some poor complexity issues that degrade
significantly with the number of rules, so having a single rule for
transitivity is a big bonus for OWL.

That said, you don't need a transitive predicate operation to make a
rule for transitivity work. You can just do something like:
  CONSTRUCT { ?s ?p ?o }
  WHERE { ?s ?p ?x . ?x ?p ?o . ?p a :transitiveProperty }

However, a rule built with the same operations as this query would run
O(log(n)) times.

Alternatively, a rule built around:
  CONSTRUCT { ?s ?p ?o }
  WHERE { ?s ?p+ ?x . ?p a :transitiveProperty }

will run much less often, having a flow-on effect to the other rules
that are triggered by the outcome of this one.

So that's my way of saying that I like variables for this operation.
However, we may need restrictions. Luke had a good example with the
use of an unbound predicate:

> Naively (ignoring complexity problems), allowing ?p to take any value seems to produce more
> interesting use cases, especially in untidy real world data, where people might use different
> predicates interchangably, e.g:
>
> SELECT ?s ?ob1 ?ob2 ?p WHERE {
>      ?s ?p* ?y .
>      FILTER(?p = <:somePredicate> || ?p =<:someSimilarPredicate> ) .
> }

Filtering is not binding, though it may often appear the have the same
effect (ignoring performance). This is a case that shows the
difference though.

Does {?s ?p* ?o} bind every individual predicate transitively, or does
it treat all predicates as a group and search through the group
transitively? As an example, consider the following family:

:fred : hasAncestor :mary
:mary : hasAncestor :anne
:mary :hasSibling :peter
:peter :hasSibling :harry

If we treat each predicate individually, then we'd get the following bindings:
?s    ?p     ?o
:fred :hasAncestor :mary
:mary :hasAncestor :anne
:fred : hasAncestor :anne    <-- new
:mary :hasSibling :peter
:peter :hasSibling :harry
:mary :hasSibling :harry    <-- new

This could be useful for following many predicate chains at once,
though it doesn't offer significantly more functionality than doing a
union between the results of following each predicate individually.
Still, it's useful.

If we treat the predicates as a group that can be transitive, then we
might see something like the following binding:

?s    ?p     ?o
:fred : hasAncestor :mary
:mary : hasAncestor :anne
:mary :hasSibling :peter
:peter :hasSibling :harry
:fred :hasAncestor :anne    <-- new
:mary :hasSibling :harry    <-- new
:fred :hasAncestor/:hasSibling :peter    <-- new
:fred :hasAncestor/:hasSibling :harry    <-- new

(So what does the double binding of ?p to :hasAncestor/:hasSibling
mean? If ?p isn't being selected, then it won't matter. Otherwise it
could be expressed as a pair of statements - one for each option on
the predicate binding.)

This could be useful for following multiple predicates that can be
used for expressing a super predicate. In this case a predicate like
:isRelatedTo can be determined based on its subpredicates of
:hasAncestor and :hasSibling. It would also be useful for traversing
groups of predicates like {:hasMother, :hasFather} to figure out
:hasAncestor.

I see valid use cases for both approaches, and difficulties with each
as well. I suppose the meaning of ?p* needs to be discussed, or else
we just disallow ?p* unless it gets bound elsewhere in the query (such
as with {?p a owl:TransitivePredicate}.

Regards,
Paul

P.S. in an earlier email I mistyped, and said our transitive
predicates costs Mulgara log(n) when I should have said n.log(n). But
you all knew what I meant, right?  :-)
Received on Monday, 28 September 2009 16:16:01 UTC