Re: "Summary for informed outsiders" - Re: Proposed change to the OWL-2 Direct Semantics entailment regime from Enrico Franconi on 2010-12-21 (public-rdf-dawg@w3.org from October to December 2010)

From: Enrico Franconi <franconi@inf.unibz.it>
Date: Tue, 21 Dec 2010 09:23:13 +0100
To: Axel Polleres <Axel.Polleres@deri.org>
Cc: SPARQL Working Group <public-rdf-dawg@w3.org>, Maurizio Lenzerini <lenzerini@dis.uniroma1.it>, Guido Vetere <gvetere@it.ibm.com>, Kendall Clark <kendall@clarkparsia.com>
Message-Id: <33C30F7D-CED5-438C-99C8-DFBD880C5E08@inf.unibz.it>
On 21 Dec 2010, at 01:18, Axel Polleres wrote:

> 1) This is about whether bnodes in the data should be visible in answers.

(...)

> Enrico's "*any* OWL-QL or OWL-EL implementation by design incorporates BGPs with OWL Direct Semantics in the manner I'm proposing. Not having BGPs in the manner I'm proposing would force them not to adopt SPARQL for their systems." is in fact not an argument against 1) ... given that the data in such systems is by definition bnode-free; *general* bnodes as in RDF in the *data* are not expressible in DLs, nor in RDBMS (which OWL-QL addresses), are they?

Still, if you want to have the label "SPARQL1.1 compliant" your algorithm *has* to deal with bnodes when present, right? 

> If this observation is right, I don't fully understand this last one: "I would be surprised that anybody would want to hack their systems to actually return bnodes." since it seems no particular "hack" is necessary... i.e. if you want to deal with RDF data, just treat bnodes from that data as special constants when you load them into your DL reasoner, and - so it seems - you're done. 

This is simply not correct in the case of systems employing UNA, like 100% of OWL-QL based systems. Note that the fact that OWL2 currently does not have UNA is not an argument, IMHO. In fact, it may well have a variant with UNA, and this should not affect the choices made in SPARQL.
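
One way to picture the UNA objection is the minimal Python sketch below. Everything in it is an assumption made for illustration: the triples, the hypothetical functional property `:mother`, and the helper `functional_clashes` come from no system or message in this thread. Under UNA, two distinct constants denote two distinct individuals, so loading bnodes as constants forces them apart, whereas bnodes read as existentials may still denote the same individual.

```python
# Hypothetical data: one subject with two bnode fillers for the same property.
data = [
    (":x", ":mother", "_:a"),
    (":x", ":mother", "_:b"),
]

def functional_clashes(triples, functional_props, bnodes_as_constants):
    """Return (subject, property) pairs whose fillers UNA forces to be distinct.

    With bnodes_as_constants=True every bnode label counts as a constant.
    With False, bnodes are existentials that may be equated with any other
    term, so only distinct non-bnode fillers can clash.
    """
    def is_constant(term):
        return bnodes_as_constants or not term.startswith("_:")

    fillers = {}
    for s, p, o in triples:
        if p in functional_props and is_constant(o):
            fillers.setdefault((s, p), set()).add(o)
    return sorted(key for key, values in fillers.items() if len(values) > 1)

print(functional_clashes(data, {":mother"}, bnodes_as_constants=True))
print(functional_clashes(data, {":mother"}, bnodes_as_constants=False))
```

With bnodes loaded as constants the sketch reports a clash for `:x`; with bnodes read as existentials it reports none, since `_:a` and `_:b` may be the same individual.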

Also, if we treat bnodes as proper existential variables (this is point (2) below), then it does not make sense to have them in the answer set: there would be no unique way to answer, and an answer might contain infinitely many bnodes.

> if you don't like that, don't accept data with bnodes, i.e. don't accept general RDF (any implementation is free to do so).

Indeed, we would like to have the "SPARQL1.1 compliant" label, even in the case where OWL deals with pure relational data, and not RDF with genuine bnodes. Please note that in 100% of the large RDF KBs I've seen around, bnodes come from OIDs reifying primary keys from pure relational data, so they are artificial and simply not needed. I have yet to see a real, non-crafted case of bnodes in the current practice of RDF. So, dealing with such bnodes does not give you anything you couldn't get by just having the proper OIDs as individuals.

> So, I am not 100% sure, but I don't see a big problem or conflict here, really, and would hope we can at least resolve 1) 

Again, note that I am asking for an *additional* regime (call it "pure OWL DS" or something), whose purpose is very clear, and anybody willing to deal with bnodes can resort to the regime you have currently in the proposal.

> 2) nondistinguished variables, i.e. should bnodes in the query be treated special in SPARQL/OWL, as real existential variables.
> 
> One overall problem here is that such non-distinguished variables in the general case of OWL2 DL may lead to decidability issues, and - at least for OWL2 DL - decidability is still a research issue

On the other hand, OWL-QL without nd-vars makes little sense, since entailment reduces to plain inheritance, and this is not interesting.

> *Tree-shaped* nd-vars though are possible, at least that's what I understood from our initial call with Birte and Enrico. In fact, also the bnodes in the Iocaste-example that was used in the discussions are tree-shaped, aren't they?

In that example yes, but note that in the OWL-QL case tree-shaped and non-tree-shaped nd-vars would be treated in the same way; that is, tree-shapedness gives no computational advantage.

> Side remark: I frankly don't understand why the discussion revolves around the
> indeed abstract Iocaste example, when there are much simpler ones:

(snip)
The example is compelling since it cannot be solved by grounding bnodes at the beginning: you really need to reason by case. In your example below, by contrast, a simple grounding with bnodes would solve the problem.

> Ont:
>  "Any Person has a parent who's a Person"
> 
> :Person rdfs:subClassOf 
>     [ a owl:Restriction ; owl:onProperty :parent ; owl:someValuesFrom :Person ] .
> 
> Data: 
> 
> :enrico a :Person . 
> :birte a :Person .
> :bijan a :Person .
> 
> Query: "Give me anyone who has a parent?" 
> 
>  SELECT ?P
>  WHERE { ?P :parent [] }
> 
> would be an example which needs special treatment of nd-variables to return any answers.
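
The "simple grounding with bnodes" for this example can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper names `ground_once` and `who_has_parent` and the `_:pN` labels are hypothetical, and the axiom is applied for a single round only (applying it exhaustively would never terminate, since every fresh parent is again a Person).

```python
from itertools import count

PERSON, PARENT, TYPE = ":Person", ":parent", "rdf:type"

# The three asserted individuals from the example data.
data = {
    (":enrico", TYPE, PERSON),
    (":birte", TYPE, PERSON),
    (":bijan", TYPE, PERSON),
}

def ground_once(triples):
    """Apply 'every Person has a parent who is a Person' one time,
    introducing a fresh bnode label for each asserted Person."""
    fresh = count(1)
    grounded = set(triples)
    for s, p, o in sorted(triples):
        if p == TYPE and o == PERSON:
            b = f"_:p{next(fresh)}"
            grounded.add((s, PARENT, b))
            grounded.add((b, TYPE, PERSON))
    return grounded

def who_has_parent(triples, named_only=False):
    """Evaluate SELECT ?P WHERE { ?P :parent [] }.

    The [] is matched but never returned. With named_only=True it is
    additionally required to match a non-bnode term.
    """
    return sorted({
        s for s, p, o in triples
        if p == PARENT and not (named_only and o.startswith("_:"))
    })

print(who_has_parent(data))
print(who_has_parent(ground_once(data)))
print(who_has_parent(ground_once(data), named_only=True))
```

Only the second call returns the three persons: the raw data contains no `:parent` triples at all, and after grounding the only parents are bnodes, so insisting on a named match for `[]` again yields nothing.
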
> 
> Now, one main argument against bnodes in the query seems to boil down to efficiency of implementations towards the end of the thread?
> (which I don't really understand again, since at the same time you argue that any tree-shaped bnodes in the query can simply be "hidden" in the TBox, which would make them equally expensive?)
> 
> None of you seem to have argued for tree-shaped bnodes *only*, have you? 
> Would that at all be an acceptable compromise for anyone involved?

Not really, since for OWL-QL having the full power of CQs is the interesting bit.

> I *think* here's where Bijan argues the way I asked above, that actually those can be pushed into the TBox: "I know both Birte and I would be very happy to find substantial, much less, compelling use cases that wouldn't be better handled by explicit class expressions."
> 
> So, asked the other way around, why not allow them and force people to write them into the query?
> 
> Bijan later answers this as follows:
> "The argument against this from a design perspective is that it introduces more irregularity in what queries can be handled by what systems and that the same functionality can be gotten and (IMHO) more clearly expressed with class expressions."
> 
> My impression - without wanting to actually propose this syntactic restriction - is that
> anonymous, i.e. []-expressible bnodes could be allowed, named bnodes can't
> (because with [] bnodes, you *always* remain tree-shaped ... whereas with named bnodes, you can actually form cyclic queries)
> however, as another side remark, forbidding named bnodes in SPARQL/OWL might seem a bit awkward wrt. 
> upwards-compatiblity of SPARQL, since SPARQL *does* allow arbitrary bnodes so far... anyways, it seems an option to just 
> disallow named bnodes in SPARQL/OWL queries and treat [] bnodes as nd-variables, which would leave at least a (large?) part 
> of the use cases for nd-variables without introducing (too?) much irregularity in what queries can be handled.

This would be a non-backward-compatible proposal, which I don't like.

--e.

> What might be interesting to know for making an informed decision is whether Guido and 
> Maurizio, for their practical applications, need non-tree-shaped nd-variables for their queries?
> 
> Any comments welcome whether I got anything wrong in this "summary for/from the informed outsiders", 
> I tried to keep it as neutral as possible (if you disagree, also let me know)
> 
> Axel
Received on Tuesday, 21 December 2010 08:24:02 UTC