Re: [TF-PP] Property paths and entailments from Birte Glimm on 2010-05-28 (public-rdf-dawg@w3.org from April to June 2010)

From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
Date: Fri, 28 May 2010 14:57:49 +0100
To: Ivan Herman <ivan@w3.org>
Cc: W3C SPARQL WG <public-rdf-dawg@w3.org>
Message-ID: <AANLkTil8NXHM5mbfOyIN7qkApuml_0hfPpiMYYQM6BFm@mail.gmail.com>
On 28 May 2010 09:13, Ivan Herman <ivan@w3.org> wrote:
> (Lee asked to have different threads to different issues, so here I am)
>
> The question was:
>
> [[[
> + How do property paths interact with entailment? Discussion is needed with the members of the WG most swapped in with the entailment work.
> ]]]
>
> My (mental) model has always been that the entailment regime (in principle) expands all graphs to new graphs that include the original graph plus all possible extra triples that the entailment regime produces, and the query itself is performed on this expanded graph. If this model is right, then the natural way of looking at this is that the property path expansions are performed on the entailed graph.

That is one way of implementing a system that supports, e.g., RDFS
entailment. Another way to implement an RDFS entailment system is to
leave the original data unchanged but rewrite each query, so that the
rewritten query contains regular expressions that navigate over the
queried graph in order to retrieve what would be answers under RDFS
entailment. This is not unreasonable: the closure of the graph can
easily be double the size, so evaluating queries over it takes longer.
BGPs in queries, on the other hand, are usually small, so rewriting is
fast, and although evaluating BGPs with regular expressions is more
expensive, you can do that over the original graph and not over the
blown-up closure graph. The Chileans have, for example, a paper
describing such an approach
(http://www.dcc.uchile.cl/~cgutierr/papers/nSPARQL.pdf).

Now I am not sure whether the suggested property path features are
expressive enough, but suppose they were. Given an implementation that
supports property paths, I could then use that system to implement the
RDFS entailment regime simply by writing a wrapper around it that
takes plain SPARQL queries, rewrites them so that they use the
property path feature to reflect the entailments, and then uses the
PP-aware system to answer the rewritten queries over just the original
graph. What I would get is the answers as under RDFS entailment. E.g.,
query Q
SELECT ?type WHERE { ex:a a ?type }
Graph:
ex:a a ex:C.
ex:C rdfs:subClassOf ex:D.
ex:C rdfs:subClassOf ex:E.
Gives
?type/ex:C under standard SPARQL and additionally
?type/ex:D,?type/ex:E under RDFS entailment.
If you use the closure approach you would extend G so that it contains
ex:a a ex:C.
ex:C rdfs:subClassOf ex:D.
ex:C rdfs:subClassOf ex:E.
ex:a a ex:D.
ex:a a ex:E.
If you use the rewriting technique, you would leave G as it is and
rewrite Q into
SELECT ?type WHERE { ex:a a ?freshVar . ?freshVar rdfs:subClassOf* ?type }
The * for rdfs:subClassOf means that you map ex:a a ?freshVar to G and
then go 0 or more steps along an rdfs:subClassOf path to get the
binding for ?type: 0 steps gives ex:C, and 1 step gives ex:D or ex:E
(with the graph above, both are direct superclasses of ex:C).
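
To make the comparison concrete, here is a rough Python sketch of the
two strategies on the toy graph G from the example. It is only an
illustration under simplifying assumptions: triples are plain tuples,
the function names are made up for this sketch, and the "closure"
applies just two RDFS rules (rdfs9 and rdfs11), not the full RDFS
rule set.

```python
# Toy graph G from the example, as (subject, predicate, object) tuples.
TYPE = "rdf:type"
SUB = "rdfs:subClassOf"
G = {
    ("ex:a", TYPE, "ex:C"),
    ("ex:C", SUB, "ex:D"),
    ("ex:C", SUB, "ex:E"),
}


def closure(graph):
    """Materialise only rdfs11 (subClassOf transitivity) and rdfs9
    (instances of a class are instances of its superclasses)."""
    g = set(graph)
    while True:
        new = set()
        for (s, p, o) in g:
            if p not in (SUB, TYPE):
                continue
            for (s2, p2, o2) in g:
                if p2 == SUB and s2 == o:
                    new.add((s, p, o2))
        if new <= g:  # fixpoint reached, nothing left to add
            return g
        g |= new


def types_by_closure(graph, subject):
    """Closure approach: plain lookup of { subject a ?type } over the expanded graph."""
    return {o for (s, p, o) in closure(graph) if s == subject and p == TYPE}


def types_by_rewriting(graph, subject):
    """Rewriting approach: evaluate
    { subject a ?freshVar . ?freshVar rdfs:subClassOf* ?type }
    over the original, unexpanded graph."""
    result = set()
    frontier = [o for (s, p, o) in graph if s == subject and p == TYPE]
    while frontier:
        cls = frontier.pop()
        if cls in result:
            continue
        result.add(cls)  # zero steps: the asserted class itself
        frontier.extend(o for (s, p, o) in graph if s == cls and p == SUB)
    return result
```

On this graph both functions return the same set of bindings for
?type; only the first one has to build the larger graph.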

The entailment regimes doc leaves it open how the regime is
implemented as long as you get the right answers and both approaches
are possible.

Another question is what happens if the originally asked query already
contains PP expressions. That is not so clear to me, but mostly due to
the * operator. E.g., in the BGP { ?sub rdfs:subClassOf* ?super }, the
* is simply redundant if you use RDFS entailment, since
rdfs:subClassOf is anyway a transitive and reflexive relation. If you
use * on normal properties, it is not redundant of course. For
logics/formalisms with the finite model property (RDF(S) and OWL 2
profiles weaker than DL) it should always be possible to check that,
because I can just check the finite canonical model/closure. For OWL
DL/Full, however, it can be the case that there are only infinite
models, and there is most likely more than one model. Normally, we
know that under OWL Direct Semantics it is safe to just look at finite
representations of the models. I would expect that it is theoretically
possible to also evaluate queries with * over finite representations,
but that is just a guess, and I have no plans to implement that and
would very much hope that PP support is not required for conformance.
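
The redundancy of * on rdfs:subClassOf can be illustrated with a small
Python sketch: once a relation is already reflexive and transitive,
taking its reflexive-transitive closure again adds nothing. This is a
simplification of my own for illustration; the helper name star and
the sample pairs are made up, and reflexivity is taken over the nodes
mentioned in the relation, which glosses over exactly which resources
RDFS treats as classes.

```python
def star(pairs):
    """Reflexive-transitive closure of a binary relation,
    restricted to the nodes that the relation mentions."""
    nodes = {n for pair in pairs for n in pair}
    result = set(pairs) | {(n, n) for n in nodes}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(result):
            for (c, d) in list(result):
                if b == c and (a, d) not in result:
                    result.add((a, d))
                    changed = True
    return result


# Asserted subclass pairs (hypothetical chain, for illustration only).
sub_class_of = {("ex:C", "ex:D"), ("ex:D", "ex:E")}

# What an entailment regime that makes subClassOf reflexive and
# transitive would already expose for the pattern ?sub rdfs:subClassOf ?super.
entailed = star(sub_class_of)

# Applying * on top of that is a no-op ...
star_is_redundant = star(entailed) == entailed

# ... whereas on an ordinary property, * genuinely adds pairs.
knows = {("ex:a", "ex:b"), ("ex:b", "ex:c")}
star_adds_pairs = star(knows) != knows
```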

Looking at the other features (not *) of the document, it seems that
they can all equally be rewritten into standard SPARQL queries. Then,
results under entailment would just differ because you might get more
answers than usual.
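
As a sketch of what such a rewriting could look like, the Python
function below turns a path built from sequence (/), alternative (|)
and inverse (^) into plain triple patterns and UNION. The tuple
encoding of paths, the function name rewrite and the ?_vN naming of
fresh variables are assumptions made for this illustration; they are
not taken from the property paths document, and the sketch ignores
questions such as duplicate counting.

```python
from itertools import count


def rewrite(subj, path, obj, fresh=None):
    """Rewrite a path pattern into standard SPARQL text.

    path is a nested tuple:
      ("iri", name) | ("seq", p, q) | ("alt", p, q) | ("inv", p)
    """
    if fresh is None:
        fresh = count(1)  # shared counter for fresh variable names
    kind = path[0]
    if kind == "iri":
        return f"{subj} {path[1]} {obj} ."
    if kind == "inv":
        # ^p : swap subject and object
        return rewrite(obj, path[1], subj, fresh)
    if kind == "seq":
        # p/q : join the two halves through a fresh variable
        mid = f"?_v{next(fresh)}"
        left = rewrite(subj, path[1], mid, fresh)
        right = rewrite(mid, path[2], obj, fresh)
        return f"{left} {right}"
    if kind == "alt":
        # p|q : UNION of the two alternatives
        left = rewrite(subj, path[1], obj, fresh)
        right = rewrite(subj, path[2], obj, fresh)
        return "{ " + left + " } UNION { " + right + " }"
    raise ValueError(f"unsupported path operator: {kind}")


seq_example = rewrite("?x", ("seq", ("iri", "ex:p"), ("iri", "ex:q")), "?y")
alt_example = rewrite("?x", ("alt", ("iri", "ex:p"), ("iri", "ex:q")), "?y")
inv_example = rewrite("?x", ("inv", ("iri", "ex:p")), "?y")
```

The * operator is deliberately left out, since it is the one case that
has no such fixed-size rewriting.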

Birte



> Do I miss some major issue here?
>
> Ivan
>
> P.S. Sorry I could not join the telco, but I had another WG call (RDFa) exactly at the same time...
>
>
> ----
> Ivan Herman, W3C Semantic Web Activity Lead
> Home: http://www.w3.org/People/Ivan/
> mobile: +31-641044153
> PGP Key: http://www.ivan-herman.net/pgpkey.html
> FOAF: http://www.ivan-herman.net/foaf.rdf
>
>
>
>
>
>



-- 
Dr. Birte Glimm, Room 306
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283529
Received on Friday, 28 May 2010 13:58:24 UTC