Re: Blank node identifiers in FILTER clauses (ooooops 2) from Enrico Franconi on 2006-07-17 (public-rdf-dawg@w3.org from July to September 2006)

From: Enrico Franconi <franconi@inf.unibz.it>
Date: Mon, 17 Jul 2006 22:59:22 +0200
To: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <33DF93AA-8AFE-4348-8572-3921102E3E56@inf.unibz.it>

(bad hair day, today)

On 17 Jul 2006, at 22:41, Enrico Franconi wrote:

> On 16 Jul 2006, at 19:13, Eric Prud'hommeaux wrote:
>> In order to give you more to argue with, I've worked out some triples
>> that I think are implied by your little house example.
>>
>> We were trying to match:
>>
>>   Paul :hasFriend _:Y .
>>   _:Y rdf:type Employee .
>>   _:Y :hasFriend _:Z .
>>   _:Z rdf:type Manager .
>>
>> There are two possible interpretations relevant to proving this.
>> Either the left or right of:
>>
>>   Paul rdf:type Worker .        Paul rdf:type Worker .
>>   Paul :hasFriend Andrea .      Paul :hasFriend Simon .
>>   Andrea rdf:type Employee .    Simon rdf:type Employee .
>>   Andrea :hasFriend Caroline .  Simon :hasFriend Andrea .
>>   Caroline rdf:type Manager .   Andrea rdf:type Manager .
>>
>> Since all sides of the disjunction entail
>>
>>   Paul :hasFriend _:Y .
>>   _:Y rdf:type Employee .
>>   _:Y :hasFriend _:Z .
>>   _:Z rdf:type Manager .
>>
>> I think you should be able to infer it, and then query it:
>>
>> SELECT ?X
>> WHERE { ?X rdf:type Worker .
>>         ?X :hasFriend ?Y .
>>         ?Y rdf:type Employee .
>>         ?Y :hasFriend ?Z .
>>         ?Z rdf:type Manager } # note ?vars
>
> As mentioned by Bijan, if E-entailment in SPARQL is standard OWL-DL  
> ABox entailment (which is at the heart of Pellet), then the well- 
> formedness condition for "(G' union S(BGP'))" (section 5.1 in rq24)  
> says that only individuals can be used in the answer set - no  
> bnodes. In this context, the above query will return the empty set
> (as it correctly does in Pellet). This has already been discussed
> at length in the past (see, e.g.,
> <http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0305>). Non-distinguished  
> variables *do* provide more expressive power when querying RDF  
> graphs whose semantics imply the existence of implicit bnodes, like  
> in the little house example. If your query language does not allow  
> bnodes in the answers, then you can see how it is impossible to  
> query the little house.

My answer above makes sense only if all the variables in the query
are distinguished (SELECT *). However, in your example only ?X is a
distinguished variable (SELECT ?X), and therefore YES, the correct
answer should be, and is, {?X/Paul}, since the non-distinguished
variables ?Y and ?Z are interpreted existentially.
However, note that in order to write the above query you should
already have in mind the answer you expect. In other words, you
cannot expect the user, while writing a query, to think in advance
about what the answer could look like and to write the query
accordingly. It can easily be seen that if you want to reproduce with
non-distinguished variables the behaviour of bnodes in the answer set
(like in the above example), then the query with non-distinguished
variables may be exponentially larger than the answer set with
bnodes. If you think about it, this is due to all the possible
different coreferences between bnodes in the answer that you have to
represent explicitly in the query (this also has to do with the
cycles in the answer set I was mentioning a couple of emails ago).
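
To make the SELECT ?X versus SELECT * difference concrete, here is a
rough sketch in Python. It is only an illustration under strong
assumptions: the disjunctive knowledge base is approximated by just
the two alternative graphs from the quoted message, an answer counts
only if it holds in both of them, and rdf:type is abbreviated to
"type". The names LEFT, RIGHT, PATTERN, solutions and certain_answers
are made up for this example and are not part of SPARQL or Pellet.

```python
from itertools import product

# The two alternative interpretations from the quoted message, as triples.
LEFT = {
    ("Paul", "type", "Worker"),
    ("Paul", "hasFriend", "Andrea"),
    ("Andrea", "type", "Employee"),
    ("Andrea", "hasFriend", "Caroline"),
    ("Caroline", "type", "Manager"),
}
RIGHT = {
    ("Paul", "type", "Worker"),
    ("Paul", "hasFriend", "Simon"),
    ("Simon", "type", "Employee"),
    ("Simon", "hasFriend", "Andrea"),
    ("Andrea", "type", "Manager"),
}

# The basic graph pattern of the quoted query.
PATTERN = [
    ("?X", "type", "Worker"),
    ("?X", "hasFriend", "?Y"),
    ("?Y", "type", "Employee"),
    ("?Y", "hasFriend", "?Z"),
    ("?Z", "type", "Manager"),
]


def solutions(graph, pattern):
    """Yield every variable binding that maps the pattern into the graph."""
    variables = sorted({t for tp in pattern for t in tp if t.startswith("?")})
    terms = sorted({t for s, _, o in graph for t in (s, o)})
    # Brute force: try every assignment of graph terms to variables.
    for values in product(terms, repeat=len(variables)):
        binding = dict(zip(variables, values))
        if all(tuple(binding.get(t, t) for t in tp) in graph for tp in pattern):
            yield binding


def certain_answers(models, pattern, distinguished):
    """Keep only the projected answers that hold in every model.

    Variables left out of `distinguished` are projected away, i.e. they
    are treated existentially within each model.
    """
    per_model = [
        {tuple(b[v] for v in distinguished) for b in solutions(g, pattern)}
        for g in models
    ]
    return set.intersection(*per_model)


# SELECT ?X : only ?X is distinguished
print(certain_answers([LEFT, RIGHT], PATTERN, ["?X"]))
# SELECT *  : all variables are distinguished
print(certain_answers([LEFT, RIGHT], PATTERN, ["?X", "?Y", "?Z"]))
```

With only ?X distinguished, Paul survives because each graph has some
witness for ?Y and ?Z, even though the witnesses differ. With all
three variables distinguished, no single binding holds in both graphs,
so the result is empty.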

cheers
--e.

Received on Monday, 17 July 2006 20:59:38 UTC