Re: bnodeification from Pat Hayes on 2006-01-12 (public-rdf-dawg@w3.org from January to March 2006)

From: Pat Hayes <phayes@ihmc.us>
Date: Thu, 12 Jan 2006 10:02:56 -0600
To: Sergio Tessaris <tessaris@inf.unibz.it>
Cc: Enrico Franconi <franconi@inf.unibz.it>, Bijan Parsia <bparsia@isr.umd.edu>, souripriya.das@oracle.com, public-rdf-dawg@w3.org
Message-Id: <p06230901bfec29e871db@[10.100.0.5]>
>On 11 Jan 2006, at 23:32, Pat Hayes wrote:
>
>>>Hi Pat,
>>>
>>>Pat Hayes wrote:
>>>
>>>>  -----------
>>>>
>>>>  A binding on a pattern Q is a map from the variables and bnodes in Q to
>>>>  terms. The result B(Q) of applying the binding B to Q is the RDF graph
>>>>  obtained by replacing each variable or bnode in Q by its value under B.
>>>>
>>>>  An answer binding with respect to a scoping graph G' is a binding B such
>>>>  that:
>>>  > (1)  any bnodeIDs introduced by B occur in G'
>>>  > (2)  G entails (G' union B(Q)).
>>>
>>>It works for RDF(S), but I'm uneasy with the idea of extending mappings
>>>with bnode names. In this way all the the variables in the query became
>>>distinguished variables and no more true existentials.
>>
>>Logically, there is no distinction between 
>>variables and existentials in a query (i.e. on 
>>the RHS of a Gentzen sequent), surely? 
>>Variables are just existentials that we are 
>>asking to see bindings for.
>
>The distinction is on the scope of distinguished 
>variables w.r.t. existential variables. In  KR 
>systems (or incomplete databases) distinguished 
>variables range over the known constants 
>(objects) in the language.

That sounds like a use/mention confusion to me, 
but perhaps Im reading too much into 'object'. 
But what you say here does not apply to any KR 
system which uses function symbols, for example 
Prolog.

>The reason is that they're the only element for 
>which you can actually prove that they have 
>satisfy the requested properties across all the 
>possible models.

I do not understand your point, or maybe I 
disagree with it. Bnodes have a model-theoretic 
semantics, and valid entailments (valid = in all 
possible models) hold between graphs containing 
bnodes.

>With RDF(S) and SPARQL we're going into a middle 
>way, by taking into account the bnodes. As you 
>remember we were contrary to allow them in query 
>answers, since in that way we're moving into 
>uncharted territories. However, as it turned out 
>we can handle them in a fairly satisfactory way 
>by considering them a sort of skolem constants 
>with some juggling. In the  traditional KR 
>(including DLs) framework I'd say that they're 
>sort of 'individual' without the unique name 
>assumption (there are some subtleties, but we 
>can live with them).

I think you are making this needlessly 
complicated. There is a very simple, clear, 
logical picture which underlies all these 
languages (RDF through OWL and beyond) and in 
which bnodes are simply existentially bound 
names, treated as 'blank' because, essentially, 
they are understood to be bound. All the 
complications we are dealing with here are 
entirely due to the need to determine the scopes 
of different existential bindings, in effect. 
None of the languages incorporate a unique name 
assumption, so using this as a criterion to 
distinguish bnodes from names must be spurious.

>  In this way we can handle them in the OWL setting as well.
>
>Going back with the entailment business, I was 
>to quick in embracing the G' idea. The problem 
>is that bnode renaming (that's what G' is all 
>about) breaks the SPARQ algebra.

No, I don't think so. The scoping graph is a 
parameter of the definitions, so we hav some 
flexibility. We just have to say that all the 
answer sets in a UNION use the same scoping 
graph. None of this requires that it actually be 
G itself, and I don't think we should require 
that.

>
>Pat definition is:
>
>"""
>An answer binding with respect to a scoping graph G' is a binding B such that:
>(1)  any bnodeIDs introduced by B occur in G'
>(2)  G entails (G' union B(Q)).
>"""
>
>which means that you don't guarantee that bnode 
>names in B are exactly the ones in G, but only 
>that there is an one to one mapping between 
>those names and the one in G (renaming is 
>allowed).
>
>Unfortunately, this works only for basic graph 
>patterns. When we combine the results with the 
>SPARQL operators we obtain unexpected results.

See above. I agreee we have to specify scopes and 
cannot leave them undefined or presume they must 
be disjoint inappropriately.

>Consider the dataset:
>
>G = {
>_:a foaf:nick "Bijan"
>_:a :email "badguy@moxie.com"
>}
>
>and the query
>
>SELECT * { {?x :email ?e } UNION {?x foaf:nick ?n } }
>
>to answer the query we should take the union of 
>answers to P1={?x :email ?e} and to P2={?x 
>:email ?e}. By allowing arbitrary renaming of 
>bnodes we can obtain
>
>B1 = [?x/_:b12, ?e/"badguy@moxie.com"]
>
>because G' entails (G' union B1(P1)), where G' is
>
>G' = {
>_:b12 foaf:nick "Bijan"
>_:b12 :email "badguy@moxie.com"
>}
>
>and
>
>B2 = [?x/_:b36, ?n/"Bijan"]
>
>because G'' entails (G'' union B1(P1)), where G'' is
>
>G'' = {
>_:b36 foaf:nick "Bijan"
>_:b36 :email "badguy@moxie.com"
>}

Well, we should require that G' and G'' are the 
same graph in such a case, clearly. And it is 
clear WHY we should, because the role of the 
scoping graph is to define bnode scopes, and we 
want these to be the same scope.

>Therefore, the solution to the overall query is { B1 } union { B2 }:
>
>{ [?x/_:b12, ?e/"badguy@moxie.com"], [?x/_:b36, ?n/"Bijan"] }
>
>Bottom line is that we have to deal with the 
>same name across "evaluation" of distinct BGP, 
>and this means having a single graph G.

Single graph, yes, but not necessarily G itself.

Pat

>
>--sergio
>
>>
>>>It doesn't make any difference with RDF and RDF(S), but with OWL for
>>>example you don't get a true conjunctive query language.
>>>
>>>E.g. given the KB [:a rdf:type (Some :r :c)], the query { ?x :r _:b }
>>>has no answers.
>>
>>It seems to me that SELECT ?x {?x :r ?b} should 
>>have an answer just when SELECT ?x {?x :r _:b} 
>>does, since we allow bnodes as binding values 
>>to variables. So this query should have an 
>>answer, which would simply utilize a binding of 
>>_:b to another bnodeID (conceptually, in the 
>>'OWL-RDF comprehension closure' of the KB).
>>
>>This is exactly why I am worried about imposing 
>>condition (1) on OWL entailment, which I tried 
>>to explain in the telecon. Seems to me that 
>>this strict condition (1) (which isn't, to 
>>repeat, logically necessary, but is there only 
>>to keep redundancy to a minimum) is not going 
>>to be reasonable past RDFS, since OWL/RDF has 
>>built-in semantic conditions stated as 
>>comprehension principles which effectively 
>>assert the existence of an infinite amount of 
>>OWL stuff which is not mentioned explicitly in, 
>>but logically required to exist by, an OWL/RDF 
>>graph. These include such things as the lists 
>>used to encode OWL syntax, and also in this 
>>case, the :r value from :a that must exist in 
>>order that the 'Some' be satisfied; and then a 
>>bnode representing this known-to-exist thing is 
>>a legitimate binding for the variable/bnode in 
>>the query.  
>>Certainly any engine which is capable of 
>>inferring that _:b exists would surely be 
>>capable of generating an appropriate bnodeID 
>>for that thing and returning that as a binding, 
>>if required. But of course, we do not require 
>>it for query bnodes since they cannot be 
>>selected.
>>
>>Once we have semantic conditions in the 
>>language which guarantee that things exist 
>>which are not mentioned by name in a KB, we 
>>will need to relax condition (1) to allow these 
>>'necessary' existents to be referred to in 
>>answer bindings. The case seems to me to be 
>>exactly similar to the case of rdf: and rdfs: 
>>IRIs in axiomatic triples in RDF and RDFS, and 
>>even more similar to function symbols in FOL, 
>>where answers bindings can be terms in the 
>>Herbrand universe generated from skolem 
>>functions. If one forbids functions, the same 
>>condition becomes a requirement like this on 
>>the generation of existential 'names' in 
>>relational argument positions. All this amounts 
>>to in practice is that we allow KBs to have a 
>>'stock' of spare bnodeIDs they can use as 
>>answer bindings to denote things known to 
>>exist, and that are considered to occur in the 
>>dataset.
>>
>>I agree this whole topic is potentially 
>>debateable, but it seems to me that we can 
>>(must?) leave any debates about exactly how to 
>>handle OWL queries to a later date. I have no 
>>confidence that ANY of our proposed definitions 
>>will work EXACTLY as stated for OWL queries 
>>without some changes, if only minor ones. OWL 
>>querying is still an open issue. In the 
>>meantime, let us try to keep the definitions as 
>>simple and clear and robust as we can. I am 
>>still very nervous of relying on the 'ordered 
>>merge' trick to keep bnode scopes aligned 
>>properly: it is too much like an algorithmic 
>>specification to be robust enough to seem 
>>comfortable.
>>
>>>Moreover, the algebra definition gets more complicate since you need to
>>>project over the true variables all the times.
>>
>>True, but this can be easily handled by a 
>>suitable definition of 'variable binding' or 
>>some such which contains the projection, if 
>>required.
>>
>>Pat
>>
>>--
>>---------------------------------------------------------------------
>>IHMC		(850)434 8903 or (650)494 3973   home
>>40 South Alcaniz St.	(850)202 4416   office
>>Pensacola			(850)202 4440   fax
>>FL 32502			(850)291 0667    cell
>>phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
>>
>
>Sergio Tessaris, Assistant Professor        Faculty of Computer Science
>e-mail: tessaris@inf.unibz.it          Free University of Bozen-Bolzano
>http://www.inf.unibz.it/~tessaris                   Piazza Domenicani 3
>tel: +39 0471 016 125 (fax 009)                    39100 Bolzano, Italy


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Thursday, 12 January 2006 16:04:34 UTC