Re: entailment review - part 1 from Birte Glimm on 2010-01-05 (public-rdf-dawg@w3.org from January to March 2010)

From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
Date: Tue, 5 Jan 2010 19:43:25 +0000
To: Axel Polleres <axel.polleres@deri.org>
Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <492f2b0b1001051143v6e64ca9cuac58aa4f22543782@mail.gmail.com>
Axel,
I copied your review into this email and comment on the issues you
raise inline below; see also
http://www.w3.org/2009/sparql/docs/entailment/xmlspec.xml.
Thanks for the comments,
Birte

2010/1/5 Axel Polleres <axel.polleres@deri.org>:
> I am still catching up with Entailment and didn't complete the whole review in time...
> ... so, attached is part 1, which essentially goes up to RDFS entailment only; I didn't have
> time yet to take a closer look at the OWL parts.
>
> None of my comments should really hold up publication, I think the document is overall in
> very good shape.
>
> However, some of my questions (comments 6 and 9) refer to the definitions taken from the current
> query spec, which might need further rethinking/discussion.
>
> Change summary:
>
> The first public working draft defined the semantics of SPARQL queries
> under RDF and RDFS entailment. In this public working draft the RDF
> and RDFS entailment regimes have been changed to use a skolemized
> version of the queried RDF triples to limit the possible answers to a
> finite set of answers. This prevents non-local effects that caused
> additional results for existing triples from unrelated newly added
> triples that contain new blank nodes. This draft further adds an
> entailment regime for OWL Direct Semantics, which covers the OWL 2 DL,
> EL, and QL Profiles.

Is that meant to suggest a heading for this part of the abstract? If
so, I have added it.

>
> Detailed Comments:
>
> 1) (non-critical) Suggest to add a note in the Introduction: in
> future/final version of the document, only SPARQL 1.1/Query should be
> mentioned and no longer the "first SPARQL specification". SPARQL
> 1.1/Query also only talks about simple entailment, so all you say here
> about the old version will remain valid if you consistently only refer
> to the new SPARQL 1.1/Query document, I suppose.

I have now changed all SPARQL Query links to point to the W3C Latest
Version link http://www.w3.org/TR/sparql11-query/. As soon as the next
WD is published, that link will point to the then complete 1.1 spec. I
also changed the text where relevant to talk about the SPARQL Query
spec or SPARQL Query 1.1 spec only.

> 2) "two Turtle documents which differ only by re-naming their node
> identifiers will be understood to be equivalent."
>
> -I suggest to rather use "blank node identifier" than "node identifier"->
>
> "two Turtle documents which differ only by re-naming their blank node
> identifiers will be understood to be equivalent."

Done.
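
To illustrate the equivalence that the quoted sentence describes, here
is a minimal sketch in Python. It is not taken from the draft. It
assumes that a parsed Turtle document is represented as a set of
(subject, predicate, object) string tuples and that blank nodes are
exactly the terms starting with "_:". All function names are
hypothetical, and the brute-force search over renamings is only
suitable for tiny examples.

```python
from itertools import permutations


def is_bnode(term):
    # Assumption: blank nodes are written with the Turtle "_:" prefix.
    return term.startswith("_:")


def bnodes(graph):
    # Sorted list of all blank node identifiers used in the graph.
    return sorted({t for triple in graph for t in triple if is_bnode(t)})


def rename(graph, mapping):
    # Apply a blank node renaming; all other terms stay unchanged.
    return {tuple(mapping.get(t, t) for t in triple) for triple in graph}


def equivalent_up_to_bnode_renaming(g1, g2):
    # Try every bijection between the two sets of blank node identifiers.
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    return any(rename(g1, dict(zip(b1, perm))) == g2
               for perm in permutations(b2))


# Hypothetical example data.
doc_a = {("ex:a", "ex:p", "_:x"), ("_:x", "ex:q", "ex:b")}
doc_b = {("ex:a", "ex:p", "_:n1"), ("_:n1", "ex:q", "ex:b")}
doc_c = {("ex:a", "ex:p", "_:n1"), ("_:n2", "ex:q", "ex:b")}

print(equivalent_up_to_bnode_renaming(doc_a, doc_b))
print(equivalent_up_to_bnode_renaming(doc_a, doc_c))
```

Here doc_a and doc_b differ only in the blank node identifier, whereas
doc_c uses two distinct blank nodes and is therefore a different graph.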

> 3)
> "In the case of any difference, the SPARQL Query Language definitions
> are the normative ones."
> -not sure, as I am not native, but I think that should be plural->
> "In the case of any differences, the SPARQL Query Language definitions
> are the normative ones."

Neither am I, but plural sounds correct, so I changed it.

> 4) "The term RDF-L denotes the set of all RDF Literals, RDF-B the set
> of all blank nodes in RDF graphs"
>
> Hmmm, why "in RDF graphs" and in which graphs? BTW, this question also
> applies to Query as well.

That is more of a comment on the Query spec. I just repeat the Query
definitions, as I also state above, to remind readers. If Andy and
Steve want to change that, I will happily use the changed definition.

> 5) "The web ontology language OWL allows for even more inferences."
>
> As in other W3C docs, I suggest to consistently capitalise "Web".

Done.

> * Another remark which is not critical, but maybe we should re-discuss
> it at some point ... BGP extension says that entailment regimes must
> specify:
>  - well-formed graphs
>  - SG must be uniquely specified
>  - entailment relation
>  - finiteness condition for answers
>  - handling of inconsistent graphs
>
> It doesn't *actually* say that it should define or restrict "which queries
> are legal", does it? I anyway don't think that the definition of BGP
> extension
> does preclude such restrictions, but it isn't actually required by the
> original definition.

True. The closest to that is "An entailment regime specifies 1) ... 2)
an entailment relation between subsets of well-formed graphs and
well-formed graphs" and "2 -- For any basic graph pattern BGP and
pattern instance mapping P, P(BGP) is well-formed for E". I am not
sure whether I can interpret that as a possibility of defining what
legal/supported queries are. I think I once discussed that with Andy
and he suggested that all queries are legal, but some queries might
have empty answers. In particular for OWL Direct Semantics, I would
prefer to restrict not only the queried graphs but also the queries
themselves. If a query BGP cannot be parsed into ontology structures
then Direct Semantics entailment is just not defined. In that case I
would prefer to raise an error instead of giving an empty answer.
The other problem is update queries. Here we decided, I think, to put
a note somewhere saying that the entailment regimes document does not
define the behaviour of systems for update queries. Once there is more
implementation experience, one can then specify what implemented
systems do, which is most likely to use standard simple entailment for
update queries. I can add a note to this effect.

> 6) This remark might be overshooting (at least for this WD), but:
>
> "The scoping graph, SG, corresponding to any consistent active graph
> AG is uniquely specified and is E-equivalent to AG."
> [...]
> "All entailment regimes specified here use the same definition of a
> scoping graph as given in SPARQL 1.0. Thus, the required equivalence
> is immediate."
>
> I am a bit worried that *actually* the definition of the scoping graph
> as given in SPARQL 1.0 is *NOT* uniquely specified, since it obviously
> doesn't
> uniquely determine the blank nodes. Not sure whether this is really an
> issue, but it seems a bit awkward.
>
> Maybe the condition should be weakened to something like
>
> "The scoping graph, SG, corresponding to any consistent active graph
> AG is uniquely (except blank node identifiers) specified and is
> E-equivalent to AG."
>
> Not really ideal either, but better than before?
> If we agree on that change, we can include that with a remark to ask
> for comments?

That is again a comment for Query, and I agree it is a valid one.
Several of the given conditions/definitions are not ideal IMO, that
being one of them. I would also prefer to use a skolemized scoping
graph directly, but that is not possible either, so I define this kind
of workaround to meet the Query conditions. Furthermore, we already
violate the condition in the Query spec that the scoping graph must be
consistent, which we cannot guarantee with the current RDFS entailment
regime definition. I would prefer to be more consistent, i.e., either
remove the consistency requirement everywhere or have it throughout.
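
As a rough illustration of what skolemizing a scoping graph means, the
following Python sketch replaces every blank node with a fresh
constant. It is not the definition used in the draft. The triple
representation (string tuples, blank nodes prefixed with "_:"), the
function name, and the "urn:skolem:" prefix are all assumptions made
for the example.

```python
def skolemize(graph, base="urn:skolem:"):
    # Map each blank node to one fixed constant, so that the same blank
    # node is replaced consistently wherever it occurs in the graph.
    mapping = {}

    def sk(term):
        if term.startswith("_:"):
            return mapping.setdefault(term, base + term[2:])
        return term

    skolemized = {tuple(sk(t) for t in triple) for triple in graph}
    return skolemized, mapping


# Hypothetical scoping graph with a single shared blank node.
scoping_graph = {
    ("ex:a1", "ex:b", "_:c1"),
    ("ex:a2", "ex:b", "_:c1"),
}

sk_graph, sk_mapping = skolemize(scoping_graph)
print(sorted(sk_graph))
print(sk_mapping)
```

After this step the graph no longer contains blank nodes, so answers
can only mention the finitely many terms that actually occur in it.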


> 7) "We explain these restrictions in greater detail in the following section."
> better:
> "We explain these restrictions in greater detail in the following sections."
>
> I.e., there are different restrictions for the different entailment
> regimes in the following sections.

Done.

> 8) "Thus, also the following solution mappings are possible solutions:
>
>   &mu;4 : s -> ex:a1, o -> _:c3,"
>
>  Is this solution really possible? Doesn't it violate (at least along
> with &mu;1) condition 3?

It does violate condition 3 of Query, but as I understand it, I have
to make sure that the entailment regimes satisfy the conditions given
in the Query spec. If I instantiate the BGP with that solution
mapping, then the triple is RDF entailed. All solutions that lead to
entailed triples are called possible solutions, but the entailment
relation alone does not guarantee that the conditions of the Query
spec are met. I need extra conditions that make sure that condition 3
holds, which C1 does. At some point, I have to add a proof that the
conditions C1 and C2 guarantee the conditions given in the Query spec.
Simple entailment uses the subgraph criterion to meet this
restriction, but that wouldn't work well with inferences.

> 9) Can we add an intuitive explanation, why for (C2) the restriction
> is made to subjects? I mean, yes, it is sufficient to achieve
> finiteness. Looking back through the mails, I see there was some
> discussion to change this restriction, cf.
> http://lists.w3.org/Archives/Public/public-rdf-dawg/2009JulSep/0395.html
> but what actually speaks against just reformulating even stricter to
> the vocabulary of the scoping graph only, i.e.
>
>  "(C2) Each s,p,o in a triple ( s, p, o ) in P(BGP) occurs in the
> scoping graph."
>
> ?
>
> That would be the first of the alternative choices in the Editorial note:
> "Each subject s of a triple ( s, p, o ) in P(BGP) occurs in the scoping graph."
> right?

I just wanted to be as unrestrictive as possible. I have now added
this to the editorial note, together with an example of what you lose,
because, as you say, it is a possibility.
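
The difference between the two formulations of (C2) quoted above can
be sketched in Python as follows. This is only an illustration under
stated assumptions: triples are string tuples, "occurs in the scoping
graph" is read as "occurs in any position of some triple", and the
function names are hypothetical. The example relies on the RDF
entailment rule that a term used as a predicate is entailed to have
rdf:type rdf:Property.

```python
def terms_of(graph):
    # All terms occurring in any position of any triple.
    return {t for triple in graph for t in triple}


def c2_subjects_only(instantiated_bgp, scoping_graph):
    # Variant from the editorial note: only subjects are restricted.
    vocab = terms_of(scoping_graph)
    return all(s in vocab for (s, p, o) in instantiated_bgp)


def c2_all_positions(instantiated_bgp, scoping_graph):
    # Stricter variant suggested in the review: s, p and o are restricted.
    vocab = terms_of(scoping_graph)
    return all(t in vocab for triple in instantiated_bgp for t in triple)


# Hypothetical scoping graph and an RDF-entailed instantiated BGP.
scoping_graph = {("ex:a", "ex:b", "ex:c")}
instantiated_bgp = {("ex:b", "rdf:type", "rdf:Property")}

print(c2_subjects_only(instantiated_bgp, scoping_graph))
print(c2_all_positions(instantiated_bgp, scoping_graph))
```

In this sketch the subjects-only variant accepts the entailed triple,
while the stricter variant rejects it, because rdf:type and
rdf:Property do not occur in the scoping graph.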

> 10) The previous comment likewise applies to the RDFS Entailment Regime.
>

I changed the first sentence in the editorial note to say that all the
choices also apply to the RDFS regime that is described next. I would
rather not repeat them, and they can all equally be used for RDFS
too.
Received on Tuesday, 5 January 2010 19:44:00 UTC