Re: entailment review - part 2 from Birte Glimm on 2010-01-11 (public-rdf-dawg@w3.org from January to March 2010)

From: Birte Glimm <birte.glimm@comlab.ox.ac.uk>
Date: Mon, 11 Jan 2010 17:02:11 +0000
To: Axel Polleres <axel.polleres@deri.org>
Cc: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-ID: <492f2b0b1001110902g78d01145u90ea908c86932f7c@mail.gmail.com>
see inline comments below and
http://www.w3.org/2009/sparql/docs/entailment/xmlspec.xml for the
up-to-date version...
Birte

2010/1/9 Axel Polleres <axel.polleres@deri.org>:
> Hi Birte, all,
>
> thanks for your efforts and careful consideration of the comments so far!
>
> here comes the second part. None of them would hold up publication, most of them are only to be kept in mind for the next version.
>
> I start with those comments from part one which were still under discussion, I'd suggest to mark them with notes where appropriate.
>
> *)
> On 5 Jan 2010, at 19:43, Birte Glimm wrote:
>> > It doesn't *actually* say that it should define restrict "which queries
>> > are legal", does it? I anyway don't think that the definition of BGP
>> > extension
>> > does preclude such restrictions, but it isn't actually required by the
>> > original definition.
>>
>> True. The closest to that is "An entailment regime specifies 1) ... 2)
>> an entailment relation between subsets of well-formed graphs and
>> well-formed graphs". and "2 -- For any basic graph pattern BGP and
>> pattern instance mapping P, P(BGP) is well-formed for E". I am not
>> sure whether I can interpret that as a possibility of defining what
>> legal/supported queries are. I think I once discussed that with Andy
>> and he suggested that all queries are legal, but some queries might
>> have empty answers. In particular for OWL Direct Semantics, I would
>> prefer to restrict not only the queried graphs but also the queries
>> themselves. If a query BGP cannot be parsed into ontology structures
>> then Direct Semantics entailment is just not defined. In that case I
>> would prefer to raise an error instead of giving an empty answer.
>> The other problem is update queries. Here we decided, I think, that
>> we put a note somewhere that the entailment regimes document does not
>> define the behaviour of systems for update queries. Once there is more
>> implementation experience one can then specify what implemented
>> systems do, which is most likely to use standard simple entailment for
>> update queries. I can add a note in this direction.
>
> I would actually prefer to have notes in for both these aspects.

The note on updates is now in an informative section at the end of the
document. There is now an editorial note before Section 6.2 (after
explaining the current choice of restrictions on answers) that
explains alternatives. Within that note the above issue is addressed.

>> > 6) This remark might be overshooting (at least for this WD), but:
>> >
>> > "The scoping graph, SG, corresponding to any consistent active graph
>> > AG is uniquely specified and is E-equivalent to AG."
>> > [...]
>> > "All entailment regimes specified here use the same definition of a
>> > scoping graph as given in SPARQL 1.0. Thus, the required equivalence
>> > is immediate."
>> >
>> > I am a bit worried that *actually* the definition of the scoping graph
>> > as given in SPARQL 1.0 is *NOT* uniquely specified, since it obviously
>> > doesn't
>> > uniquely determine the blank nodes. Not sure whether this is really an
>> > issue, but it seems a bit awkward.
>> >
>> > Maybe the condition should be weakened to something like
>> >
>> > "The scoping graph, SG, corresponding to any consistent active graph
>> > AG is uniquely (except blank node identifiers) specified and is
>> > E-equivalent to AG."
>>
>
> Hmmm, Andy replied here:
>
>> IIRC It's not supposed to uniquely specify it in the query spec but to give
>> the framework - the entailment regime should specify the scoping graph in a
>> compatible manner.
>
> I have to admit that I don't really understand that. My concern is that the framework
> says - in its current form - that uniqueness is required, but the only instantiation
> does not guarantee such uniqueness, strictly speaking. Anyway...

I added an editorial note now highlighting this issue to Section 1.3,
where it is discussed how the entailment regimes satisfy the
conditions from the Query spec. Even though the Query spec defined the
framework, the problem is that this is very restrictive and we cannot
see how the given conditions can be satisfied. The current solution is
as good as it gets (i.e., specifying up to bnode renaming) and the
effect is what I assume the authors of that framework had in mind.
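
To make "specified up to bnode renaming" concrete, here is a minimal
sketch of such a comparison. It is not taken from the Query spec or the
entailment regimes document: the triple representation (plain tuples of
strings), the "_:" prefix convention for blank nodes, and all function
names are assumptions for illustration, and the brute-force search is
only suitable for tiny graphs.

```python
from itertools import permutations


def is_bnode(term):
    # Assumption for this sketch: blank nodes are strings starting with "_:".
    return isinstance(term, str) and term.startswith("_:")


def bnodes(graph):
    # Sorted list of all blank node labels occurring anywhere in the graph.
    return sorted({t for triple in graph for t in triple if is_bnode(t)})


def equal_up_to_bnode_renaming(g1, g2):
    """Return True if some bijective renaming of blank nodes maps g1 onto g2."""
    g1, g2 = set(g1), set(g2)
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    # Try every bijection between the two sets of blank node labels.
    for perm in permutations(b2):
        rename = dict(zip(b1, perm))
        renamed = {tuple(rename.get(t, t) for t in triple) for triple in g1}
        if renamed == g2:
            return True
    return False


# Hypothetical active graph and two candidate scoping graphs.
ag = {("ex:a", "ex:p", "_:b1"), ("_:b1", "ex:q", "ex:c")}
sg_ok = {("ex:a", "ex:p", "_:x"), ("_:x", "ex:q", "ex:c")}
sg_bad = {("ex:a", "ex:p", "_:x"), ("_:y", "ex:q", "ex:c")}

print(equal_up_to_bnode_renaming(ag, sg_ok))   # same graph, fresh bnode label
print(equal_up_to_bnode_renaming(ag, sg_bad))  # splits one bnode into two
```

Under this reading, any two scoping graphs for the same active graph
differ only in their blank node labels, which is the weaker form of
uniqueness discussed above.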

> [...]
>> That is again a comment for Query and I agree it is a valid comment.
>> Several of the given conditions/definitions are not ideal IMO, that
>> being one of them. I would also prefer to use a skolemized scoping
>> graph directly, but that is also not possible, so I define this kind
>> of workaround to meet the Query conditions. We also already violate
>> the condition that the scoping graph must be
>> consistent according to the conditions in the Query spec,
>
> I am not sure, actually, condition 1. doesn't require consistency of SG, it only says:
> "The scoping graph, SG, corresponding to any consistent active graph AG is uniquely specified and is E-equivalent to AG."
>
> So, hmmmm, *actually*, this wording actually doesn't limit at all what the scoping graph to an
> inconsistent graph is: In fact, this even seems to let open that the SG for an inconsistent
> graph is e.g. empty, implementation dependent, etc.

Well, if the active graph is inconsistent and the scoping graph is
E-equivalent to an inconsistent graph, then also the scoping graph
must be inconsistent. I don't see how a consistent graph can be
E-equivalent to an inconsistent one otherwise.

> Still, the issue remains how to proceed with inconsistent graphs, since the behavior
> has to be specified for each extension:
>
>  "The effect of a query on an inconsistent graph is not covered by this specification,
>  but must be specified by the particular SPARQL extension."
>
> My (Axel, chair hat off) interpretation of this is that I don't see that this implies that an
> extension has to *uniquely* define the behavior on
> inconsistent graphs, actually it could leave several options implementation-dependent (i.e. different
> for implementations that do or don't perform consistency checking.) So, the currently seemingly
> suggested path seems to be fine:
>  - implementations that don't do inconsistency checking can construct SG as for a consistent graph
>  - consistency-checking implementations should throw an error
>
> That is, I think the wording in the editors note is too strong:
>
> "[...] explicitly mentions that the scoping graph must be E-equivalent (RDFS equivalent in this case)
>  to the active graph and that AG must be consistent. "

Hm, I disagree. See my arguments above.

>> which we
>> cannot guarantee with the current RDFS entailment regime definition. I
>> would prefer to be more consistent, i.e., either remove the
>> consistency requirement everywhere or have it throughout.
>
> ... indeed probably we have to collect these issues which look rather like errata
> to the query spec in opening issues. I don't intend to hold up the current drafts,
> please decide whether you want to add a note here. I will try to summarise the
> critical points in a separate mail.
>
> *)
>>
>> > 8) "Thus, also the following solution mappings are possible solutions:
>> >
>> >   μ4 : s -> ex:a1, o -> _:c3,"
>> >
>> >  Is this solution really possible? doesn't it violate (at least along
>> > with μ1) condition 3?
>>
>> It does violate condition 3 of query, but as I understand it, I have
>> to make sure that the entailment regimes satisfy the conditions given
>> in the Query spec. If I instantiate the BGP with that solution
>> mapping, then the triple is RDF entailed. All solutions that lead to
>> entailed triples are called possible solutions, but, the entailment
>> relation alone does not guarantee that the conditions of the Query
>> spec are met. I have to have extra conditions that make sure that 3
>> holds, which C1 does. At some point, I have to add a proof that the
>> conditions C1 and C2 guarantee the conditions given in the Query spec.
>> Simple entailment uses the subgraph criterion to meet this
>> restriction, but that wouldn't work well with inferences.
>
> My "concern" would be remedied, if you'd replace:
>
> "Clearly, the set of possible solutions is infinite in this case, but for a possible
> solution to actually be a solution, the two conditions C1 and C2 have to be met:"
> -->
> "Observe, however, that for instance solution μ4 violates condition 3.
> from above and in fact clearly, the set of possible solutions is infinite in
> this case which is problematic with respect to condition 4. So, for a possible
> solution to actually be a solution, the two additional conditions C1 and C2
> have to be met:"

Done.


> Now for the OWL part:
>
>
> 1)
> "In the latter case, the system can be incomplete."
>
> I am not sure what "the system" is... Do you mean:
> "In the latter case, the answers can be incomplete."
>
> I think this remark is anyways redundant. In fact, it has to be handled with care:
> Note that, by the nonmonotonicity of NEGATION and OPTIONAL, such "incompleteness"
> can easily accumulate to "overcompleteness" that intuitively would be understood
> as unsound answers. I'd suggest to drop that sentence...

Good point. Dropped the sentence.

>
> 2)
> "If the queried ontology is inconsistent under OWL 2 Direct Semantics, the system must raise an error."
>
> It might appear strange that we do require consistency checking for OWL, but not for RDFS, at least in the final doc,
> that might need some explaining remarks.

Yes. All systems I am aware of do such consistency checks anyway.
Without them, you wouldn't pass the OWL WG tests, and most OWL Direct
Semantics systems implement model building procedures that
automatically notice inconsistencies. In that case you just can't
build a model. If systems do the check anyway, then I don't have to
worry about not satisfying the Query spec conditions and just require
it. I agree that it is not consistent and either needs an explanation
or we need a consistent solution.

I added an editorial note that points out this inconsistency between the
regimes and gives this short explanation. At least this might
encourage readers to give us some feedback, so that we can see what
the community thinks about this.

> 3)
> the link behind "skolemization" points to:
> http://www.w3.org/TR/rdf-mt/#prf
> shouldn't it point rather to:
> http://www.w3.org/TR/rdf-mt/#glossSkolemization

Changed.

> BTW, Skolemization should be capitalized.
Done.

>
> 4)
> "(C3) no variable occurs in the object position of a triple with the predicate owl:minCardinality, owl:maxCardinality, owl:cardinality, owl:minQualifiedCardinality, owl:maxQualifiedCardinality, or owl:qualifiedCardinality."
>
> This is rather a condition on valid queries, than on the solution mapping... so this condition actually seems to restrict the allowed queries
> rather than the solution mappings, or no?

Well, such queries are not per se ill-formed for the regime. I can see
that we could exclude queries with BGPs that cannot be parsed into OWL
DL ontologies as we exclude RDF graphs that cannot be mapped to OWL DL
ontologies. Such BGPs can, however, be parsed into OWL 2 objects, but
they can cause infinite solutions.
One can, in some cases, move the restrictions on solutions to
restrictions on queries. The result is different though. If the query
is deemed ill-formed/illegal for the regime, then the system should
raise an error. If the conditions on answers are not satisfied, the
query will just have an empty answer. At the moment, all SPARQL
Queries are valid for all regimes, but if it does not violate the
conditions on extensions for BGP matching, I might move some
conditions to define legal queries rather than legal solutions in the
next WD.
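
The difference between the two designs can be sketched as follows. This
is only an illustration, not how the document defines either option:
the BGP representation (a list of string triples with "?"-prefixed
variables), the IllegalQueryError class, and the evaluate callback are
all hypothetical names introduced here. Only the list of cardinality
predicates comes from condition (C3) as quoted above.

```python
CARDINALITY_PREDICATES = {
    "owl:minCardinality", "owl:maxCardinality", "owl:cardinality",
    "owl:minQualifiedCardinality", "owl:maxQualifiedCardinality",
    "owl:qualifiedCardinality",
}


class IllegalQueryError(Exception):
    """Hypothetical error for a BGP that the regime rejects outright."""


def is_variable(term):
    # Assumption for this sketch: variables are strings starting with "?".
    return term.startswith("?")


def violates_c3(bgp):
    # (C3): no variable in the object position of a cardinality triple.
    return any(p in CARDINALITY_PREDICATES and is_variable(o)
               for _s, p, o in bgp)


def answer_restricting_solutions(bgp, evaluate):
    # Every query is legal; the condition only rules out solutions,
    # so an offending BGP simply has an empty answer.
    if violates_c3(bgp):
        return []
    return evaluate(bgp)


def answer_restricting_queries(bgp, evaluate):
    # Alternative: the condition defines legal queries, so an offending
    # BGP is reported as an error instead of being answered.
    if violates_c3(bgp):
        raise IllegalQueryError("variable in object position of a cardinality triple")
    return evaluate(bgp)
```

With a BGP such as [("?r", "owl:minCardinality", "?n")], the first
function returns an empty list while the second raises; for a BGP that
satisfies (C3), both behave identically.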

> 5)
> For (C5) similar to the restrictions for RDFS axiomatic triples another alternative could be to restrict to only those (literal) values appearing in the
> vocabulary of SG.

That is now discussed in an ed note before Section 6.2 as an
alternative design choice, highlighting the possible non-local
consequences that this has under OWL Direct Semantics.

> 6)
> I am not sure about Section 6.2 ... "The defined semantics allows for certain forms of higher order queries."
> What defined semantics do you mean? Actually the example says that such queries do not give results.

I rephrased that paragraph. The "defined semantics" refers to the
entailment regimes. Variables can bind not only to elements of the
domain (data values or individuals), but also to sets of the domain,
i.e., to classes or properties. This is beyond standard conjunctive
queries. What is still not allowed is variables in the position of
quantifiers. I hope the rephrased paragraph is clearer.


> 7)
> Section 6.3.1
>
> OWL 2 Dl fragment
> -->
> OWL 2 DL fragment

Done.

> 8)
> Section 6.3.2
>
> Endpoints that use the OWL 2 Direct Semantics entailment regime and that support the only the OWL 2 EL
> -->
> Endpoints that use the OWL 2 Direct Semantics entailment regime and that support only the OWL 2 EL

Done.

> 9) OWL specific comment to section 8:
>
> Wouldn't for the datasets a semantics that would take into account owl:imports make sense?
> i.e if the named graph owl:imports an ontology (graph) that is also in the named graphs, I
> would intuitively expect that the semantics of the imported ontology is taken into account.
> This does not seem to be foreseen in the current design, correct?

As I understand it, imports are taken into account. Import statements
in OWL are like instructions to the parser (they are non-logical),
which require that during loading also the imported ontologies are
loaded. Reasoning then happens with respect to the axiom closure, see
http://www.w3.org/TR/owl2-syntax/#Canonical_Parsing_of_OWL_2_Ontologies.
Thus, if my data set has an empty default graph, a named graph
<http://example.org/a> (I assume that's also the ontology IRI) and the
graph contains the triple <http://example.org/a> owl:imports
<http://example.org/b.rdf>, then the endpoint has to also load the
axioms from b.rdf. How exactly the system stores the triples is not
relevant. Reasoning must, however, happen also w.r.t. triples from
b.rdf.

Here's a complete example:

My data set is:

# Default graph
empty

# Named graph: http://example.org/a
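  @prefix ex:   <http://example.org/> .
  @prefix owl:  <http://www.w3.org/2002/07/owl#> .
  @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
  ex:a a owl:Ontology .
  ex:a owl:imports ex:b.rdf .
  ex:p rdfs:domain ex:A .

Ontology document with IRI <http://example.org/b.rdf> contains:
  @prefix ex: <http://example.org/> .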
  ex:x ex:p ex:y .

If we ask the following query under OWL Direct Semantics entailment:
  PREFIX ex: <http://example.org/>
  SELECT ?i FROM NAMED ex:a WHERE { GRAPH ex:a { ?i a ex:A } }

we get { (i, ex:x) }.

This is prescribed by the OWL 2 spec that defines how imports have to
be handled and that entailments are computed w.r.t. axiom closure of
the queried ontology/graph.
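
As a rough illustration of why the imported triples matter, the sketch
below computes the import closure over an in-memory stand-in for the
web and then applies only the rdfs:domain inference needed for this
example. It is not an OWL reasoner and not how any particular endpoint
works; the DOCUMENTS table, the function names, and the use of prefixed
names as plain strings are assumptions made for the sketch.

```python
OWL_IMPORTS = "owl:imports"
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"

# Hypothetical stand-in for dereferencing ontology IRIs.
DOCUMENTS = {
    "ex:a": {
        ("ex:a", RDF_TYPE, "owl:Ontology"),
        ("ex:a", OWL_IMPORTS, "ex:b.rdf"),
        ("ex:p", RDFS_DOMAIN, "ex:A"),
    },
    "ex:b.rdf": {
        ("ex:x", "ex:p", "ex:y"),
    },
}


def import_closure(iri, documents):
    """Collect the triples of an ontology and of everything it imports."""
    seen, todo, triples = set(), [iri], set()
    while todo:
        current = todo.pop()
        if current in seen:
            continue  # guards against cyclic imports
        seen.add(current)
        doc = documents.get(current, set())
        triples |= doc
        todo.extend(o for _s, p, o in doc if p == OWL_IMPORTS)
    return triples


def instances_of(cls, triples):
    """Asserted instances plus those inferred from rdfs:domain only."""
    props = {s for s, p, o in triples if p == RDFS_DOMAIN and o == cls}
    asserted = {s for s, p, o in triples if p == RDF_TYPE and o == cls}
    inferred = {s for s, p, _o in triples if p in props}
    return asserted | inferred


print(instances_of("ex:A", DOCUMENTS["ex:a"]))                  # named graph alone
print(instances_of("ex:A", import_closure("ex:a", DOCUMENTS)))  # with imports
```

Looking at the named graph alone finds no instance of ex:A; only after
the triples from b.rdf are included does ex:x become an answer, which
matches the result given above.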

Birte


-- 
Dr. Birte Glimm, Room 306
Computing Laboratory
Parks Road
Oxford
OX1 3QD
United Kingdom
+44 (0)1865 283529
Received on Monday, 11 January 2010 17:02:47 UTC