Re: Proposed change to the OWL-2 Direct Semantics entailment regime from Enrico Franconi on 2010-11-30 (public-rdf-dawg@w3.org from October to December 2010)

From: Enrico Franconi <franconi@inf.unibz.it>
Date: Tue, 30 Nov 2010 15:20:07 +0100
To: SPARQL Working Group <public-rdf-dawg@w3.org>
Message-Id: <D08B66A2-CC14-4DB2-9F6E-42C28698378C@inf.unibz.it>
On 30 Nov 2010, at 13:16, Bijan Parsia wrote:

> On 30 Nov 2010, at 09:57, Enrico Franconi wrote:
> 
>> Currently, bnodes in a BGP bind only to individuals or other bnodes. According to the formal semantics (and the spirit) of RDF, bnodes represent existential (but unknown) information. If the information is encoded in RDF (or in other simple languages where existential information can always be uniquely and exactly materialised explicitly as bnodes, like RDFS and OWL-RL) then the existential meaning of bnodes in queries is correctly captured by binding them just to individuals or other bnodes. However, in the OWL-2 Direct Semantics entailment regime a lot of existential information remains necessarily implicit, and we would weaken the notion of answer (wrt the standard OWL-2 Direct Semantics entailment regime in 25 years of description logics literature) if we do not treat bnodes in BGPs properly.
> 
> I'd prefer a more neutral term, such as "treat bnodes in BGPs according to a straightforward extrapolation from the RDF semantics". I would argue, myself, that that isn't the proper way to treat them. Indeed, I would argue (and have a lot) that it's not the proper way to treat them in RDF graphs :)

uh? If you look at the model theory of RDF, being existentially bound is exactly the semantics of bnodes. Implementation-wise, we can bind bnodes to individuals or other bnodes just because RDF, RDFS, OWL-RL are "simple" languages and this implementation would be correct (and simple! and elegant!).

As a theoretician, if you think of classical entailment as inclusion between models, then you have no other choice than to agree with me. Consider a boolean query as a BGP (i.e., an RDF graph without variables at all, but just individuals and bnodes), and an RDF graph as data; the answer of the BGP query over the RDF data graph is true if and only if all the models of the RDF data graph are among the models of the BGP query -- and this with the *standard RDF* model theory (as in the PHayes document). This is the way queries are defined everywhere.
In my example:

DATA GRAPH G (the same as before):
A = R some Thing
B = R max 0 Thing

:john :friend :mary
:john :friend :andrea
:mary :loves :andrea
:andrea :loves :paul
:mary rdf:type A
:paul rdf:type B

QUERY Q (now it is a boolean query without query variables):
{:john :friend _:y, _:y rdf:type A, _:y :loves _:z, _:z rdf:type B}

ANSWER: true,
because models(G) \subseteq models(Q)
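
The case analysis behind this answer can be sketched in a few lines of Python. This is only a toy illustration, not a reasoner: it assumes that B is the complement of A (since A = R some Thing and B = R max 0 Thing), so every named individual is in exactly one of the two classes in any model, and it ignores everything else about the models. All names in the code (completions, query_holds, and so on) are made up for the illustration.

```python
from itertools import product

# Asserted role facts from the data graph G.
FRIEND = {("john", "mary"), ("john", "andrea")}
LOVES = {("mary", "andrea"), ("andrea", "paul")}
INDIVIDUALS = ["john", "mary", "andrea", "paul"]

# Asserted class memberships; the classes of john and andrea are unknown.
ASSERTED = {"mary": "A", "paul": "B"}


def completions():
    """Yield every way to place the unclassified individuals in A or B.

    Assumption: B is the complement of A, so each individual belongs to
    exactly one of them in any model of G.
    """
    free = [i for i in INDIVIDUALS if i not in ASSERTED]
    for choice in product("AB", repeat=len(free)):
        yield {**ASSERTED, **dict(zip(free, choice))}


def query_holds(types):
    """Check Q: john has a friend y of type A who loves some z of type B."""
    return any(
        types.get(y) == "A" and types.get(z) == "B"
        for (x, y) in FRIEND if x == "john"
        for (y2, z) in LOVES if y2 == y
    )


# Bnodes as proper existentials: Q must hold in every case.
existential_answer = all(query_holds(c) for c in completions())

# Bnodes bound only to individuals whose types are known in all cases.
syntactic_answer = query_holds(ASSERTED)

print(existential_answer, syntactic_answer)
```

If andrea is in A, the match is y = andrea, z = paul; if andrea is in B, the match is y = mary, z = andrea. No single binding works in both cases, which is why the second check fails in this sketch.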

>> So, my proposal would be - only in the case of the OWL Direct Semantics entailment regime - to allow (as an option -- say, with a flag -- or as an additional definition) for an extended meaning of bnodes in the query as proper existential nodes. In this case the answer to a query - when bnodes in the answer are filtered out - is monotonically richer (that is, any non-bnode element of the answer set in the original semantics is also in the answer set of this stronger semantics, but not vice versa).
> 
> Yes, but filtering BNodes removes answers.

That's what I said.

> And most users experience and expect the BNodes.

They can get them by using the OWL-DS entailment regime as defined in the current semantics.

>> When treating bnodes as proper existential nodes in the case of the OWL Direct Semantics entailment regime, bnodes should be filtered out from the answer since nothing is known in the research literature about what happens when we leave them in (e.g., they may be infinitely many, since they are not restricted anymore to be bound to existing nodes or blank nodes in the graph).
> 
> This is at least one reason to not standardize this.

Well, this is exactly what has been done in 25+ years of research in description logics. This was already standardised in syntax and semantics at the beginning of the 90s as KRSS.

> I don't think this is a good idea. Please note that this is a reversal of a long-held position (indeed, I championed non-distinguished variables in the first SPARQL WG).
> On the one hand, adding some sort of (even more?) optional variant that is well known and defined is relatively harmless.

I'd say also that it would be extremely useful to anybody using OWL-2 DS (see below).

> OTOH, a standards body should do some picking and choosing. The practical pain induced by having non-distinguished variables is far too high for any known benefits (see the above).

This group is standardising something that has never existed in the literature. I agree that the outcome is nice and sound (and it follows your "story"), but it is utterly useless to anybody willing to work with *proper* BGPs in OWL-2 DS (see my very simple example) -- and I'm sure it cost Birte quite a lot of pain :-)
On the other hand, having *in addition* the semantics I'm proposing here is simple enough, it relies on 25+ years of research, and it would eventually provide the OWL-2 DS community with a valuable, interesting and very useful query language. Birte claims that she can work on it and propose something very reasonable to the group. Without this addition, SPARQL2 can never be adopted by any of the DB-centric applications based on ontology-based access technology, since all of these technologies do assume BGPs with proper existential variables. We would lose the entire ontology-based DB integration market based on OWL-DL.

> So, I think it's a disservice to introduce them into these documents.

To whom is this a disservice? I guess that this is the heart of the discussion with you: I'm proposing to add something simple and clear to the current status of the standard, while retaining the current functionalities. I don't see who's going to suffer here if we do so.
Most importantly, note that my proposal is fully compatible with the RDF and SPARQL1 standards, so nothing is broken here. I acknowledge of course that part of the RDF community has some expectations about the semantics, and that's why I'm proposing to keep a way to fulfill these expectations.

> In over 6(!) years of teaching and evangelizing non-distinguished variables, I've not been able to come up or have anyone come up with a compelling example (or even a naturally occurring example) that wasn't tree-like and then only by backporting an existential restriction. That suggests that, at the very least, it's premature to standardize.

You may be a very bad teacher, I guess :-)
I don't want to argue over who's the better teacher.

> Interpreting BNodes the way we do matches exactly just about every user's expectations (so doesn't surprise them). The BNode patterns we know how to implement are easily captured by class expressions and it's *much* easier for users to understand that distinction than to understand a rather strange synonym for existential restrictions that is also a homonym for something else.

Clearly I understand that you and your customers are not interested in working with *proper* BGPs in OWL-2 DS.
This is fine, and you (and they) get what you want anyway, since I'm proposing to keep the semantics provided in the current document.

> Also, I'd be very surprised if implementors wanted to support this. I personally think Pellet should rip out that bit of implementation.

Be surprised: the academic, industry, and system people working on OWL-QL-based systems are already very upset by the limitation of the current version of the standard, and asked me to discuss the matter with the group.

> Finally, since it's so well defined and understood, it's not as if, should it suddenly become known to be useful, there'd be any barrier to implementations picking it up.

I fail to understand this argument. Why are we standardising something, if it is already well known? Maybe to facilitate interoperability of acknowledged technologies? :-)

--e.
Received on Tuesday, 30 November 2010 14:20:42 UTC