"Summary for informed outsiders" - Re: Proposed change to the OWL-2 Direct Semantics entailment regime

Dear all,

In preparation for tomorrow's call, I am trying to sum up the discussion here,
and I hope you won't blame me for some possibly naive conclusions (on which comments from either side are certainly very welcome).

It seems that *two* things are being discussed in parallel here, and they have somehow become mixed up:

 1) bnodes from the data in results, i.e. treating bnodes in the data
 as constants

 2) bnodes as non-distinguished variables in the query

Let me attempt to analyse the discussions revolving around those two points separately
(please forgive me and let me know if you feel quoted out of context):

1) This is about whether bnodes in the data should be visible in answers.

Bluntly put:

Data:

 _:a :p :o .

SELECT * {?S :p :o}

Should that return an answer in SPARQL/OWL or not?

 Birte/Bijan: yes
 Enrico:      no 

Birte's "a more expressive one should just give you more answers I think."
seems to be a very strong argument for "yes" particularly for the people approaching SPARQL-OWL from the RDF side.

In turn, I haven't seen a real argument against this, or I missed it.
Enrico's 
"*any* OWL-QL or OWL-EL implementation by design incorporates BGPs with OWL Direct Semantics in the manner I'm
proposing. Not having BGPs in the manner I'm proposing would force them not to adopt SPARQL for their systems."
is in fact not an argument against 1), given that the data in such systems is by definition bnode-free;
*general* bnodes as in RDF in the *data* are not expressible in DLs, nor in RDBMS (which OWL-QL addresses), are they?
If this observation is right, I don't fully understand this last one:
"I would be surprised that anybody would want to hack their systems to actually return bnodes."
since it seems no particular "hack" is necessary: if you want to deal with RDF data, just treat bnodes
from that data as special constants when you load them into your DL reasoner, and - so it seems - you're done.
If you don't like that, don't accept data with bnodes, i.e. don't accept general RDF (any implementation is free to do so).
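
A minimal sketch of what "treating bnodes from the data as constants" could look like, for illustration only. It assumes triples are plain tuples of strings, bnode labels start with "_:" and variables with "?"; the helper names is_var and match_pattern are hypothetical and not taken from any SPARQL or OWL implementation. The matcher simply makes no distinction between bnodes and IRIs in the data:

```python
# Illustrative sketch only: the data bnode is handled like any other constant.

def is_var(term):
    """Query variables are written with a leading '?' in this sketch."""
    return term.startswith("?")

def match_pattern(pattern, triples):
    """Return one binding dict per data triple matching a single triple pattern."""
    answers = []
    for triple in triples:
        binding = {}
        for p_term, d_term in zip(pattern, triple):
            if is_var(p_term):
                # A repeated variable must bind to the same term everywhere.
                if binding.setdefault(p_term, d_term) != d_term:
                    break
            elif p_term != d_term:
                break
        else:
            answers.append(binding)
    return answers

# Data:  _:a :p :o .
data = [("_:a", ":p", ":o")]

# Query: SELECT * { ?S :p :o }
answers = match_pattern(("?S", ":p", ":o"), data)
print(answers)
```

Under these assumptions the query yields one answer, with ?S bound to the bnode label from the data, which is the "yes" reading of question 1).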

So, I am not 100% sure, but I don't see a big problem or conflict here, really, and would hope we can at least resolve 1).

2) Non-distinguished variables, i.e. should bnodes in the query be treated specially in SPARQL/OWL, as real existential variables?

One overall problem here is that such non-distinguished variables may lead to
decidability issues: at least for the general case of OWL2 DL, decidability is still a research issue.

*Tree-shaped* nd-vars though are possible, at least that's what I
understood from our initial call with Birte and Enrico. In fact, also the bnodes in the 
Iocaste-example that was used in the discussions are tree-shaped, aren't they?

But: Birte/Bijan are arguing they aren't necessary and that this example is artificial, yes?

Side remark: I frankly don't understand why the discussion revolves around the
indeed abstract Iocaste example, when there are much simpler ones:

 
Ont:
  "Any Person has a parent who's a Person"

 :Person rdfs:subClassOf
     [ a owl:Restriction ; owl:onProperty :parent ; owl:someValuesFrom :Person ] .

Data: 
  
 :enrico a :Person . 
 :birte a :Person .
 :bijan a :Person .

Query: "Give me anyone who has a parent?" 

  SELECT ?P
  WHERE { ?P :parent [] }

would be an example which needs special treatment of nd-variables to return any answers.
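
A toy sketch of the difference, hard-coded to this one ontology and query. It is not a reasoner; the names PERSONS, PARENT_EDGES, has_parent_syntactic and has_parent_existential are hypothetical and exist only for this illustration:

```python
# Toy illustration only: the single axiom "Person subClassOf (parent some Person)"
# is hard-coded; a real system would derive this from the ontology.

# Asserted instances of :Person in the data.
PERSONS = {":enrico", ":birte", ":bijan"}

# Explicit (subject, object) pairs for :parent; the data contains none.
PARENT_EDGES = set()

def has_parent_syntactic():
    """[] must be mapped to a term that actually occurs in the data."""
    return {s for (s, _o) in PARENT_EDGES}

def has_parent_existential():
    """[] is a non-distinguished variable: a parent only has to exist in every model.
    The axiom guarantees one for each :Person, even though none is named."""
    return has_parent_syntactic() | PERSONS

print(sorted(has_parent_syntactic()))    # no answers
print(sorted(has_parent_existential()))  # all three persons
```

With the first reading the query has no answers, because no :parent triple with a nameable object exists; with the second reading all three persons are returned.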

Now, towards the end of the thread, one main argument against bnodes in the query
seems to boil down to efficiency of implementations?
(which I don't really understand again, since at the same time you argue that 
any tree-shaped bnodes in the query can simply be "hidden" in the TBox, 
which would make them equally expensive?)

None of you seems to have argued for tree-shaped bnodes *only*, or have you?
Would that at all be an acceptable compromise for anyone involved?

I *think* here's where Bijan argues the way I asked above, that
actually those can be pushed into the TBox:
"I know both Birte and I would be very happy to find substantial, 
 much less, compelling use cases that wouldn't be better handled by explicit class expressions."

So, asked the other way around, why not allow them and force people to write them into the query?

Bijan later answers this as follows:
"The argument against this from a design perspective is that it
introduces more irregularity in what queries can be handled by what
systems and that the same functionality can be gotten and (IMHO) more
clearly expressed with class expressions."

My impression - without wanting to actually propose this syntactic restriction - is that
 anonymous, i.e. []-expressible, bnodes could be allowed, whereas named bnodes could not
(because with [] bnodes you *always* remain tree-shaped, whereas with named bnodes you can actually form cyclic queries).
However, as another side remark, forbidding named bnodes in SPARQL/OWL might seem a bit awkward wrt.
upwards compatibility with SPARQL, since SPARQL *does* allow arbitrary bnodes so far. Anyway, it seems an option to just
disallow named bnodes in SPARQL/OWL queries and treat [] bnodes as nd-variables, which would retain at least a (large?) part
of the use cases for nd-variables without introducing (too?) much irregularity in what queries can be handled.
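
To make the tree-shaped vs. cyclic distinction concrete, here is a rough sketch of a check a system could run on the triple patterns of a query. It assumes a deliberately simplified and conservative notion of "tree-shaped" (no cycle in the undirected graph formed by the patterns that touch a bnode), which is an assumption of this sketch and not a definition from the thread. The function name bnodes_are_tree_shaped and the generated labels _:g1, _:g2 are hypothetical:

```python
def bnodes_are_tree_shaped(patterns):
    """Return True if the patterns touching bnodes form no undirected cycle.

    Uses union-find: an edge whose two ends are already connected closes a cycle.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for s, _p, o in patterns:
        # Only edges with at least one bnode end matter for this check.
        if not (s.startswith("_:") or o.startswith("_:")):
            continue
        root_s, root_o = find(s), find(o)
        if root_s == root_o:
            return False
        parent[root_s] = root_o
    return True

# { ?P :parent [ :parent [] ] }  with the [] bnodes given fresh labels
nested = [("?P", ":parent", "_:g1"), ("_:g1", ":parent", "_:g2")]

# { _:x :p _:y . _:y :p _:z . _:z :p _:x }  named bnodes forming a cycle
cyclic = [("_:x", ":p", "_:y"), ("_:y", ":p", "_:z"), ("_:z", ":p", "_:x")]

print(bnodes_are_tree_shaped(nested), bnodes_are_tree_shaped(cyclic))
```

Since each [] introduces a fresh bnode that cannot be referred to again elsewhere in the query, queries written only with [] pass such a check, while named bnodes as in the second query can fail it.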

What might be interesting to know for making an informed decision is whether Guido and 
Maurizio, for their practical applications, need non-tree-shaped nd-variables for their queries?


Any comments are welcome on whether I got anything wrong in this "summary for/from the informed outsiders".
I tried to keep it as neutral as possible (if you disagree, also let me know).

Axel

Received on Tuesday, 21 December 2010 00:18:47 UTC