Re: Blank node identifiers in FILTER clauses from Eric Prud'hommeaux on 2006-07-24 (public-rdf-dawg@w3.org from July to September 2006)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Mon, 24 Jul 2006 14:37:32 -0400
To: Bijan Parsia <bparsia@cs.man.ac.uk>
Cc: Enrico Franconi <franconi@inf.unibz.it>, RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-ID: <20060724183732.GA9332@w3.org>
On Mon, Jul 17, 2006 at 06:46:39PM +0100, Bijan Parsia wrote:
> 
> Eric, you seem to have a number of confusions that have gotten all  
> rather severely tangled up. Some are based on simple ignorance (e.g.,  
> of the distinction between distinguished and non-distinguished  
> variables, which explains the behavior of Pellet), but then others  
> compound on those errors. I'm just going to try to hit the highlights  
> tersely, because I don't see that trying to untangle everything  
> inline will actually help. Plus it's too much work :)
> 
> I shall also prune rather ruthlessly.
> 
> On Jul 16, 2006, at 6:13 PM, Eric Prud'hommeaux wrote:
> 
> >
> >On Fri, Jul 14, 2006 at 05:37:16PM +0200, Eric Prud'hommeaux wrote:
> >>
> >>On Fri, Jul 14, 2006 at 02:48:10AM +0100, Enrico Franconi wrote:
> >>>
> >>>On 13 Jul 2006, at 17:23, Eric Prud'hommeaux wrote:
> [snip]
> >>>>SELECT ?Meal ?WineColor
> >>>>WHERE {
> >>>>  ?Meal food:hasDrink _:Wine .
> >>>>  _:Wine wine:hasColor ?WineColor }
> 
> _:Wine is treated as a nondistinguished variable. That is, it does
> not need to have a named entity as a binding and it cannot return  
> bindings.
> 
> >>>>Pellet <http://www.mindswap.org/2003/pellet/demo> gives these
> >>>>results:
> >>>>
> >>>>| Meal          | WineColor |
> >>>>+---------------+-----------+
> >>>>| test:MyLunch  | :White    |
> >>>>| test:MyDinner | :Red      |
> 
> Which explains why it returns these values for this. There is no  
> named Wine in the KB that can be bound to this variable, but it is  
> provable that there must be *some* such wine in all models.
> 
> >>>>When querying Pellet for:
> >>>>  ?Meal food:hasDrink ?Wine .
> >>>>  ?Wine wine:hasColor ?WineColor
> >>>>
> >>>>it gives no results because it has no bindings for the Wine.
> 
> ?Wine is treated as a distinguished variable. In this sense it  
> doesn't quite conform to the SPARQL semantics, but since there
> are no SPARQL semantics for OWL DL, and quasi-distinguished variables  
> are a new phenomenon, I think this is acceptable. Distinguished
> variables bind *only* to named entities, thus their bindings may not  
> be BNodes, which are *not* names, but existential variables.

SPARQL Query 2.7 Blank Nodes in Query Results gives examples of
bindings to blank nodes, as does XML Query Results 2.3.1. Variable
Binding Results (per the RDF Semantics definition of bNodes [BN]).


[BN] http://www.w3.org/TR/2004/REC-rdf-mt-20040210/#unlabel
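
To make the two readings concrete, here is a minimal Python sketch of
what a "distinguished only" rule does to candidate solutions. The
blank node labels _:w1 and _:w2 and the function names are
hypothetical, chosen for illustration; this is not how Pellet or any
SPARQL engine is implemented.

```python
# Terms are plain strings; a term starting with "_:" is a blank node label.
def is_blank(term):
    return term.startswith("_:")

def distinguished_only(solutions):
    """Keep only solutions in which no variable is bound to a blank node."""
    return [s for s in solutions if not any(is_blank(v) for v in s.values())]

# Candidate solutions for the ?Wine query, assuming (hypothetically) that
# the wines were available as blank nodes in the graph being matched.
candidates = [
    {"?Meal": "test:MyLunch", "?Wine": "_:w1", "?WineColor": ":White"},
    {"?Meal": "test:MyDinner", "?Wine": "_:w2", "?WineColor": ":Red"},
]

strict = distinguished_only(candidates)  # distinguished reading of ?Wine
print(len(candidates), len(strict))
```

Under the distinguished reading both rows are dropped, which is
consistent with the empty result reported above. Under the reading in
the SPARQL examples cited here, both rows would be kept.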

> >>>>But why
> >>>>not?
> 
> Because of the semantics of the variables.
> 
> >>>>Certainly, it has deduced that there is something there, but it's
> >>>>opaque to RDF. Why doesn't it infer the triples:
> >>>> food:MyLunch hasDrink [ hasColor :White ] .
> >>>> food:MyDinner hasDrink [ hasColor :Red ] .
> 
> Well, it does infer something similar, but all this is beside the
> point.
> 
> If the scoped set for the OWL DL entailment regime contains BNodes  
> (making ?vars quasi-distinguished), then we should expect the above  
> two queries to return with bindings. If not, then the current  
> behavior is correct. Since it's not yet defined, I think we're ok :)

I would expect that the entailment for RDF would be sufficient
here. That is, the triple pattern
  food:MyLunch hasDrink ?x .
  ?x hasColor :White .
should match 
  food:MyLunch hasDrink _:q .
  _:q hasColor :White .
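
A minimal Python sketch of the matching I have in mind, assuming plain
subgraph matching over explicitly stated triples (no OWL reasoning).
The names is_var and match_pattern are illustrative only.

```python
# "?x" is a query variable, "_:q" a blank node label in the data,
# anything else is treated as a name.
def is_var(term):
    return term.startswith("?")

def match_pattern(pattern, graph, binding=None):
    """Yield every binding (dict) that maps all pattern triples into graph."""
    binding = binding or {}
    if not pattern:
        yield dict(binding)
        return
    first, rest = pattern[0], pattern[1:]
    for triple in graph:
        candidate = dict(binding)
        ok = True
        for p_term, g_term in zip(first, triple):
            if is_var(p_term):
                # Bind the variable, or check it against its earlier binding.
                if candidate.setdefault(p_term, g_term) != g_term:
                    ok = False
                    break
            elif p_term != g_term:
                ok = False
                break
        if ok:
            yield from match_pattern(rest, graph, candidate)

graph = [
    ("food:MyLunch", "hasDrink", "_:q"),
    ("_:q", "hasColor", ":White"),
]
pattern = [
    ("food:MyLunch", "hasDrink", "?x"),
    ("?x", "hasColor", ":White"),
]
results = list(match_pattern(pattern, graph))
print(results)  # [{'?x': '_:q'}]
```

The only solution binds ?x to the blank node, which is the kind of
binding shown in the SPARQL Query section cited earlier.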

> >>>You can't do that since you can not represent all the possible
> >>>answers. You couldn't represent the case when there is a cyclic
> >>>coreference between bnodes.
> >>>E.g., _:a R _:b.
> >>>      _:b S _:a.
> >>
> >>Perhaps I misinterpret here, but I thought that would be an argument
> >>against materializing a closure and then querying it, as opposed to
> >>proving the existence of certain triples.
> 
> The above KB is not a legal OWL DL KB, so it cannot be entailed. (See  
> the transformation to triples:
> 	<http://www.w3.org/TR/owl-semantics/mapping.html#4.1>
> """Bnode identifiers here must be taken as local to each  
> transformation, i.e., different identifiers should be used for each  
> invocation of a transformation rule. Ontologies without a name are  
> given a bnode as their main node; ontologies with a name use that  
> name as their main node; in both cases this node is referred to as O  
> below.""")

Hmm, I read this as a directive about the notation in the mapping
table. That is, I expect that 

> >>What query pattern could one match with bNodes and SPARQL's current
> >>entailment rules for matching that one couldn't match with variables
> >>against the inferred graph? (Note, showing me why Pellet should not
> >>infer
> >>  food:MyLunch hasDrink [ hasColor :White ] .
> >>  food:MyDinner hasDrink [ hasColor :Red ] .
> >>would be one way to answer this question.)
> >
> >In order to give you more to argue with, I've worked out some triples
> >that I think are implied by your little house example.
> >
> >We were trying to match:
> >  Paul :hasFriend _:Y .
> >  _:Y rdf:type Employee .
> >  _:Y hasFriend _:Z .
> >  _:Z rdf:type Manager .
> 
> This isn't cyclic.

Agreed. Perhaps I should have worked harder on the segue.

We seem to be focusing our discussion of OWL semantics around two
examples, the wine color example in the Pellet demo, and Enrico's
"little house". I was just stating how I think some piece of knowledge
in the little house example manifests in RDF.

> [snip]
> >I think you should be able to infer it, and then query it:
> >SELECT ?X
> >WHERE { ?X rdf:type Worker .
> >        ?X :hasFriend ?Y .
> >        ?Y rdf:type Employee .
> >        ?Y :hasFriend ?Z .
> >        ?Z rdf:type Manager } # note ?vars
> 
> This is irrelevant if you interpret the variables as distinguished.

Are you saying that one can query the little house inference without
the current semantics of non-distinguished variables? Or are you
agreeing that, if the inference manifests in an RDF graph, that the
triples can be matched by either variables or bNodes?
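
To illustrate the second reading, here is a small Python sketch using
the wine example rather than the little house, since its triples appear
in this thread. It assumes, hypothetically, that the inferred wines
have been materialized as the blank nodes _:w1 and _:w2; Pellet does
not do this, and the helper names are mine.

```python
def is_open(term):
    # In a query pattern, both ?vars and _:labels may take any value.
    return term.startswith("?") or term.startswith("_:")

def solutions(pattern, graph, binding=None):
    """Yield bindings that map every pattern triple onto a graph triple."""
    binding = {} if binding is None else binding
    if not pattern:
        yield binding
        return
    head, *rest = pattern
    for triple in graph:
        extended = dict(binding)
        if all(
            extended.setdefault(p, g) == g if is_open(p) else p == g
            for p, g in zip(head, triple)
        ):
            yield from solutions(rest, graph, extended)

def select(projection, pattern, graph):
    # Only the projected ?vars are reported; _:labels are never returned.
    return sorted(tuple(s[v] for v in projection) for s in solutions(pattern, graph))

# Assumed materialization of the inference discussed above.
graph = [
    ("test:MyLunch", "food:hasDrink", "_:w1"),
    ("_:w1", "wine:hasColor", ":White"),
    ("test:MyDinner", "food:hasDrink", "_:w2"),
    ("_:w2", "wine:hasColor", ":Red"),
]
with_bnode = [("?Meal", "food:hasDrink", "_:Wine"),
              ("_:Wine", "wine:hasColor", "?WineColor")]
with_var = [("?Meal", "food:hasDrink", "?Wine"),
            ("?Wine", "wine:hasColor", "?WineColor")]

rows_bnode = select(["?Meal", "?WineColor"], with_bnode, graph)
rows_var = select(["?Meal", "?WineColor"], with_var, graph)
print(rows_bnode == rows_var, rows_var)
```

Over such a graph the two patterns give the same Meal and WineColor
rows. Whether an OWL DL engine can or should expose such a graph is
exactly the open question.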

> >I'm really surprised this is not how people currently handle OWL
> >semantics.
> 
> I suspect it's because you don't understand OWL semantics, and have  
> not studied how to implement a sound and complete reasoner. Or  
> rather, you don't recognize the distinction between an anonymous or  
> generated individual in a model, and a BNode.

You're right. Until we get to counting, I don't see how the use of
bNodes is insufficient for representing the graph pattern that we are
currently matching with bNodes in the SPARQL pattern.

> >bNodes are perfect for this.
> 
> Nope.
> 
> >If anyone wrote a backward-
> >chaining OWL engine, it seems they'd have to do this.

oops, temporary taxonomy failure...
s/backward-chaining/forward-chaining/g

> Well, it really doesn't make sense to talk of "backward chaining"  
> since you just don't chain at all, really. Pellet does do some data  
> driven inference (e.g., classification and realization) and some on  
> demand inference (e.g., arbitrary entailments, and queries at the  
> moment), but it's not really a helpful way to think about it.

Right, I was just trying to understand why cyclicity came up at all.

> >(Come to think
> >of it, I think cwm is basically backward-chaining.)
> 
> Nope.
> <http://www.w3.org/2000/10/swap/doc/cwm.html>
> 
> """Cwm (pronounced coom) is a general-purpose data processor for the  
> semantic web, somewhat like sed, awk, etc. for text files or XSLT for  
> XML. It is a forward chaining reasoner..."""
> 
> >Or maybe I just don't get something...
> [snip]
> 
> Yep. Looks like the cyclicity.

Could you give a concrete example?

> Cheers,
> Bijan.

-- 
-eric

home-office: +1.617.395.1213 (usually 900-2300 CET)
	    +33.1.45.35.62.14
cell:       +33.6.73.84.87.26

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.
Received on Monday, 24 July 2006 18:36:16 UTC