Re: bnodeification from Souripriya Das on 2006-01-12 (public-rdf-dawg@w3.org from January to March 2006)

From: Souripriya Das <souripriya.das@oracle.com>
Date: Thu, 12 Jan 2006 11:07:20 -0500
To: Sergio Tessaris <tessaris@inf.unibz.it>, Enrico Franconi <franconi@inf.unibz.it>
CC: Pat Hayes <phayes@ihmc.us>, Bijan Parsia <bparsia@isr.umd.edu>, public-rdf-dawg@w3.org
Message-ID: <43C67EB8.6050206@oracle.com>
Enrico, Sergio,

I have two basic questions. If you could explain very clearly (in a 
tutorial kind of manner), if possible, it will be very helpful. Also, 
any specific pointers relevant to these questions.

1) What are 'implicit existentials' and how OWL-Lite has it, but RDFS 
does not? [Ref: Enrico's e-mail: 
http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0080.html]

2) You are using SPARQL blank nodes to specify existential variables. a) 
What are the conceptual similarities (and any differences) between 
SPARQL blank nodes (existential variables) and RDF blank nodes? b) What 
are the differences between distinguished variables and existential 
variables? [Ref. Sergio's e-mail: 
http://lists.w3.org/Archives/Public/public-rdf-dawg/2006JanMar/0081.html]

Thanks,
- Souri.

Sergio Tessaris wrote:

>
>
> On 11 Jan 2006, at 23:32, Pat Hayes wrote:
>
>>> Hi Pat,
>>>
>>> Pat Hayes wrote:
>>>
>>>>  -----------
>>>>
>>>>  A binding on a pattern Q is a map from the variables and bnodes  
>>>> in Q to
>>>>  terms. The result B(Q) of applying the binding B to Q is the RDF  
>>>> graph
>>>>  obtained by replacing each variable or bnode in Q by its value  
>>>> under B.
>>>>
>>>>  An answer binding with respect to a scoping graph G' is a  binding 
>>>> B such
>>>>  that:
>>>
>>>  > (1)  any bnodeIDs introduced by B occur in G'
>>>  > (2)  G entails (G' union B(Q)).
>>>
>>> It works for RDF(S), but I'm uneasy with the idea of extending  
>>> mappings
>>> with bnode names. In this way all the the variables in the query  
>>> became
>>> distinguished variables and no more true existentials.
>>
>>
>> Logically, there is no distinction between variables and  
>> existentials in a query (i.e. on the RHS of a Gentzen sequent),  
>> surely? Variables are just existentials that we are asking to see  
>> bindings for.
>
>
> The distinction is on the scope of distinguished variables w.r.t.  
> existential variables. In  KR systems (or incomplete databases)  
> distinguished variables range over the known constants (objects) in  
> the language. The reason is that they're the only element for which  
> you can actually prove that they have satisfy the requested  
> properties across all the possible models.
>
> With RDF(S) and SPARQL we're going into a middle way, by taking into  
> account the bnodes. As you remember we were contrary to allow them in  
> query answers, since in that way we're moving into uncharted  
> territories. However, as it turned out we can handle them in a fairly  
> satisfactory way by considering them a sort of skolem constants with  
> some juggling. In the  traditional KR (including DLs) framework I'd  
> say that they're sort of 'individual' without the unique name  
> assumption (there are some subtleties, but we can live with them). In  
> this way we can handle them in the OWL setting as well.
>
> Going back with the entailment business, I was to quick in embracing  
> the G' idea. The problem is that bnode renaming (that's what G' is  
> all about) breaks the SPARQ algebra.
>
> Pat definition is:
>
> """
> An answer binding with respect to a scoping graph G' is a binding B  
> such that:
> (1)  any bnodeIDs introduced by B occur in G'
> (2)  G entails (G' union B(Q)).
> """
>
> which means that you don't guarantee that bnode names in B are  
> exactly the ones in G, but only that there is an one to one mapping  
> between those names and the one in G (renaming is allowed).
>
> Unfortunately, this works only for basic graph patterns. When we  
> combine the results with the SPARQL operators we obtain unexpected  
> results. Consider the dataset:
>
> G = {
> _:a foaf:nick "Bijan"
> _:a :email "badguy@moxie.com"
> }
>
> and the query
>
> SELECT * { {?x :email ?e } UNION {?x foaf:nick ?n } }
>
> to answer the query we should take the union of answers to P1={? x 
> :email ?e} and to P2={?x :email ?e}. By allowing arbitrary renaming  
> of bnodes we can obtain
>
> B1 = [?x/_:b12, ?e/"badguy@moxie.com"]
>
> because G' entails (G' union B1(P1)), where G' is
>
> G' = {
> _:b12 foaf:nick "Bijan"
> _:b12 :email "badguy@moxie.com"
> }
>
> and
>
> B2 = [?x/_:b36, ?n/"Bijan"]
>
> because G'' entails (G'' union B1(P1)), where G'' is
>
> G'' = {
> _:b36 foaf:nick "Bijan"
> _:b36 :email "badguy@moxie.com"
> }
>
> Therefore, the solution to the overall query is { B1 } union { B2 }:
>
> { [?x/_:b12, ?e/"badguy@moxie.com"], [?x/_:b36, ?n/"Bijan"] }
>
> Bottom line is that we have to deal with the same name across  
> "evaluation" of distinct BGP, and this means having a single graph G.
>
> --sergio
>
>>
>>> It doesn't make any difference with RDF and RDF(S), but with OWL for
>>> example you don't get a true conjunctive query language.
>>>
>>> E.g. given the KB [:a rdf:type (Some :r :c)], the query { ?x :r _:b }
>>> has no answers.
>>
>>
>> It seems to me that SELECT ?x {?x :r ?b} should have an answer just  
>> when SELECT ?x {?x :r _:b} does, since we allow bnodes as binding  
>> values to variables. So this query should have an answer, which  
>> would simply utilize a binding of _:b to another bnodeID  
>> (conceptually, in the 'OWL-RDF comprehension closure' of the KB).
>>
>> This is exactly why I am worried about imposing condition (1) on  OWL 
>> entailment, which I tried to explain in the telecon. Seems to  me 
>> that this strict condition (1) (which isn't, to repeat,  logically 
>> necessary, but is there only to keep redundancy to a  minimum) is not 
>> going to be reasonable past RDFS, since OWL/RDF has  built-in 
>> semantic conditions stated as comprehension principles  which 
>> effectively assert the existence of an infinite amount of OWL  stuff 
>> which is not mentioned explicitly in, but logically required  to 
>> exist by, an OWL/RDF graph. These include such things as the  lists 
>> used to encode OWL syntax, and also in this case, the :r  value from 
>> :a that must exist in order that the 'Some' be  satisfied; and then a 
>> bnode representing this known-to-exist thing  is a legitimate binding 
>> for the variable/bnode in the query.   Certainly any engine which is 
>> capable of inferring that _:b exists  would surely be capable of 
>> generating an appropriate bnodeID for  that thing and returning that 
>> as a binding, if required. But of  course, we do not require it for 
>> query bnodes since they cannot be  selected.
>>
>> Once we have semantic conditions in the language which guarantee  
>> that things exist which are not mentioned by name in a KB, we will  
>> need to relax condition (1) to allow these 'necessary' existents to  
>> be referred to in answer bindings. The case seems to me to be  
>> exactly similar to the case of rdf: and rdfs: IRIs in axiomatic  
>> triples in RDF and RDFS, and even more similar to function symbols  
>> in FOL, where answers bindings can be terms in the Herbrand  universe 
>> generated from skolem functions. If one forbids functions,  the same 
>> condition becomes a requirement like this on the  generation of 
>> existential 'names' in relational argument positions.  All this 
>> amounts to in practice is that we allow KBs to have a  'stock' of 
>> spare bnodeIDs they can use as answer bindings to denote  things 
>> known to exist, and that are considered to occur in the  dataset.
>>
>> I agree this whole topic is potentially debateable, but it seems to  
>> me that we can (must?) leave any debates about exactly how to  handle 
>> OWL queries to a later date. I have no confidence that ANY  of our 
>> proposed definitions will work EXACTLY as stated for OWL  queries 
>> without some changes, if only minor ones. OWL querying is  still an 
>> open issue. In the meantime, let us try to keep the  definitions as 
>> simple and clear and robust as we can. I am still  very nervous of 
>> relying on the 'ordered merge' trick to keep bnode  scopes aligned 
>> properly: it is too much like an algorithmic  specification to be 
>> robust enough to seem comfortable.
>>
>>> Moreover, the algebra definition gets more complicate since you  
>>> need to
>>> project over the true variables all the times.
>>
>>
>> True, but this can be easily handled by a suitable definition of  
>> 'variable binding' or some such which contains the projection, if  
>> required.
>>
>> Pat
>>
>> -- 
>> ---------------------------------------------------------------------
>> IHMC        (850)434 8903 or (650)494 3973   home
>> 40 South Alcaniz St.    (850)202 4416   office
>> Pensacola            (850)202 4440   fax
>> FL 32502            (850)291 0667    cell
>> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>>
>
> Sergio Tessaris, Assistant Professor        Faculty of Computer Science
> e-mail: tessaris@inf.unibz.it          Free University of Bozen-Bolzano
> http://www.inf.unibz.it/~tessaris                   Piazza Domenicani 3
> tel: +39 0471 016 125 (fax 009)                    39100 Bolzano, Italy
>
>
>
>
Received on Thursday, 12 January 2006 16:08:02 UTC