Re: union (Re: Review RDF 1.1 Semantics (ED 3rd June 2013)) with an addition from Pat Hayes on 2013-06-19 (public-rdf-wg@w3.org from June 2013)

From: Pat Hayes <phayes@ihmc.us>
Date: Wed, 19 Jun 2013 12:35:40 -0500
To: Peter Patel-Schneider <pfpschneider@gmail.com>, Antoine Zimmermann <antoine.zimmermann@emse.fr>
Cc: RDF WG <public-rdf-wg@w3.org>
Message-Id: <0601DA6F-2EFF-4367-993A-0EF5F801577C@ihmc.us>
Further to this, a new draft is now posted using this insight in a technical note just after bnode truth conditions are defined. It also introduces the notion of a unionizing of a set of graphs and uses that for simple interpretation. Antoine, this is the conventional definition applied to real quantifier scopes.

Pat

On Jun 19, 2013, at 11:52 AM, Pat Hayes wrote:

> Peter, just before the call ended today, you came up with an elegant observation that deserves being spread more widely. I will try redrafting the document to include this observation. 
> 
> =======
> There are no quantifiers in RDF, and therefore no quantifier scopes. Blank nodes in RDF are like *free* variables in FOL, with the blanket understanding that free variables are understood to be existentially quantified. We do not specify exactly where those understood quantifiers are located, so we simply use that blanket assumption on each graph that we need to use it on.
> =======
> 
> This is an uncommon syntax convention for FOL, although it has been used before. In fact it is one of the oldest (it was the one used by Peirce in 1885).
> 
> That means that taking the union of two graphs is exactly the conjunction operation in FOL, in this syntax; that is, it is the operation of writing the two graphs with an "&" sign between them. It does however mean that a conjoined graph might mean rather more than the truth-functional operation of conjunction applied to the truths of the two graphs, because ****conjoining expressions with free variables changes the implied quantifier scopes****. All of this is perfectly understandable in conventional FOL syntax terms, and indeed corresponds exactly to how one would expect such a syntax to behave even if it were written out in linear predicate-logic form. Using our three-person family example, the conjunction of
> 
> Child(John, x)
> 
> and
> 
> Child(Mary, x)
> 
> (same x, the same shared blank node) is, obviously, 
> 
> Child(John, x) & Child(Mary, x)
> 
> which is of course not adequately reflected by the entailments when we put the quantifiers in explicitly:
> 
> (exists (x)(Child(John, x)))  &  (exists (x)(Child(Mary, x)))  |=/=  (exists (x)(Child(John, x) & Child(Mary, x)))
> 
> because by doing that we effectively split the blank node into two distinct blank nodes, one in each quantifier scope. 
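
As an illustration of the non-entailment above, here is a minimal Python sketch that checks both readings against one small interpretation by brute force. The domain, the individuals "alice" and "bob", and the helper `exists` are assumptions made up for this example; they do not come from the RDF Semantics draft.

```python
# A tiny interpretation: a domain of individuals and the extension of Child.
# "alice" and "bob" are hypothetical individuals chosen for the example.
DOMAIN = {"John", "Mary", "alice", "bob"}
CHILD = {("John", "alice"), ("Mary", "bob")}  # John and Mary have different children


def exists(predicate):
    """Return True if some individual in DOMAIN satisfies the predicate."""
    return any(predicate(x) for x in DOMAIN)


# Each formula read on its own: one implied quantifier per formula.
separate = exists(lambda x: ("John", x) in CHILD) and exists(
    lambda x: ("Mary", x) in CHILD
)

# The conjunction read as one expression: a single implied quantifier
# whose scope covers both atoms, so x must be the same individual in each.
joint = exists(lambda x: ("John", x) in CHILD and ("Mary", x) in CHILD)

print(separate, joint)  # prints: True False
```

This interpretation makes the two separately quantified formulas true and the jointly quantified one false, which is enough to show that the former do not entail the latter.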
> 
> (At one point back in 2004, I recall saying that blank nodes were like existential variables bound by a quantifier at infinity, ie that the implied quantifier scopes were world-wide, like the Web. But that wording was abandoned because people found it too confusing. I wish now I had stuck to it.)
> 
> RDF 2004 effectively denied the validity of sharing blank nodes between graphs. It treated that phenomenon, if it ever occurred, as a mistake to be corrected by merging rather than unioning. But now we have explicitly allowed blank node sharing to occur, this sharing is presumably intended to be meaningful, and this way of making it meaningful is simple, obvious, and exactly in accord with the original RDF graph model intuitions. 
> 
> It also corresponds to how graphs are processed right now. If we transmit a graph by sending triples down a wire, we expect the graph to be reassembled as their union, not as their merge. But if Antoine's argument were right, this would be a semantic error. All shared blank nodes should be immediately un-shared, as the truth conditions on single-literal subgraphs would require this. Of course we realize that this would be silly, but a strict application of the 2004 truth conditions does require it. 
> 
> Pat
> 
> 
> 
> On Jun 18, 2013, at 10:36 PM, Peter F. Patel-Schneider wrote:
> 
>> 
>> On 06/18/2013 09:35 AM, Antoine Zimmermann wrote:
>>> Peter:
>>> 
>>> 
>>> On 15/06/2013 20:30, Peter F. Patel-Schneider wrote:
>>>> Another way of looking at this issue is as follows.
>>>> 
>>>> Take an RDF graph G, and then divide the triples in it into two subgraphs, G1 and G2.   The meaning of G can be stronger than combining the separate meanings of G1 and G2.  This was true in 2004 and remains true now.
>>>> 
>>>> The simplest example of this (which is also in Pat's response) is:
>>>> 
>>>> Let b be a blank node,
>>>> let G be the graph with two triples ex:John ex:child b and ex:Mary ex:child b
>>>> let G1 be the graph with one triple ex:John ex:child b
>>>> let G2 be the graph with one triple ex:Mary ex:child b
>>> 
>>> 
>>> What you are saying is that bnodes denote. When you reuse a bnode, it must denote the same thing wherever it appears. That's not the semantics of bnodes.
>> 
>> I'm not saying that *at all*.  I'm just saying what was in the 2004 semantics.
>> 
>> When a graph includes a bnode, all occurrences of that bnode must denote the same thing in any extended interpretation (I+E).  No change from 2004.  When two different graphs include the same bnode, there is no such extended interpretation between the two different graphs.  No change from 2004.  If the two graphs are unioned, then, and only then, are the bnode occurrences put under the same "scope". Again no change from the 2004 semantics.
>>> 
>>> In FOL, if I use the same variable in two formulas, and these formulas have an existential quantifier before the variable, then the set of two formulas is equivalent to a formula with two different variables.
>> 
>> If you mean, Ex Px and Ex Qx for the two formulas, then sure.
>> 
>> The situation in RDF is no different from a reading in FOL where free variables are implicitly existentially quantified.
>> 
>>> You could argue that you are losing information by turning a set of formulas into a single formula, since you have lost the fact that the two were using the same variable, but that fact has no semantic value. It's knowledge about syntax that truth preserving operations do not have to preserve.
>>> 
>>> the triple:
>>> 
>>> ex:John ex:child b
>>> 
>>> is saying that John has a child.
>>> 
>>> the triple:
>>> 
>>> ex:Mary ex:child b
>>> 
>>> is saying that Mary has a child.
>> 
>> Sure.
>>> 
>>> How can these two pieces of information lead to the conclusion that Mary and John have a child together?  Especially since the information would be strictly the same if I had:
>>> 
>>> ex:John ex:child []
>>> 
>>> and
>>> 
>>> ex:Mary ex:child []
>> 
>> What's the [] in the above notation?
>>> 
>>> On the contrary, you pretend that bnodes have an identity, they are not exactly equivalent to any other bnode.
>> 
>> I'm not pretending anything.  Bnodes do have some sort of identity and some sort of distinctness.  To say otherwise scrambles just about everything about RDF.
>>> In any case, it's not possible to have a compliant implementation of union. As soon as you realise union, you produce a serialisation that you can't guarantee to be more than isomorphic to the union.
>> Why not? There are claims that implementations have been doing precisely this for years.  Triple and quad stores build up and break down RDF graphs while maintaining a firm grip on the identity of the blank nodes in these graphs.
>> 
>>> However, it is very much possible to implement merge, with a slight and minimal modification to the 2004 definition.
>> 
>> The current version of semantics has a definition of merge.  Is there anything wrong with that?
>>> 
>>> 
>>> Anyway, if you do not see what's wrong, I'll stick to my objection.
>> 
>> I don't see anything wrong.  I don't see any change in this part of the formalism between 2004 and now.  The only change is to describe an operation on graphs that share blank nodes, one that is being used in practice.
>> 
>>> 
>>> 
>>> I will make a concrete proposal indicating the kind of text I would like to see.
>>> 
>>> 
>>> 
>>> 
>>> AZ.
>>> 
>> 
>> peter
>> 
>>>> 
>>>> In the separate meanings of  G1 and G2  John and Mary need not have the same child, so combining these separate meanings doesn't get you back to the meaning of G.
>>>> 
>>>> peter
>>>> 
>>>> 
>>>> On Jun 15, 2013, at 11:20 AM, Peter F. Patel-Schneider <pfpschneider@gmail.com> wrote:
>>>> 
>>>>> Here is my response on unions.  I have deliberately not included the previous discussion in this response.
>>>>> 
>>>>> 
>>>>> In 2004 it was assumed that having two RDF graphs sharing a blank node was a mistake.   One way to go would have been to say that it was an error to combine RDF graphs that shared blank nodes.  However, the notion of a merge was defined, I think mostly so that there was something to say about what to do in surface syntaxes.
>>>>> 
>>>>> It is already the case that RDF graphs share blank nodes, and the new version of RDF allows for this fact of life.
>>>>> 
>>>>> Now what to do when combining two RDF graphs that share blank nodes?  Well, what should happen?  It seems ludicrous to say that if you take part of an RDF graph and then combine it back with the graph itself that you get something different from the original graph, so merge doesn't seem to be a viable option.  So we are left with simple union.
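
The difference between the two operations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a graph is modelled as a set of (subject, predicate, object) tuples, and the `BNode` class and the `union` and `merge` functions are hypothetical names invented here, not any RDF library's API. The `merge` below renames every blank node of its second argument, which is a simplification of "rename shared blank nodes apart" that should give an isomorphic result.

```python
import itertools


class BNode:
    """A blank node; two BNode objects are the same node only if they are the same object."""

    _ids = itertools.count()

    def __init__(self):
        self.label = f"_:b{next(BNode._ids)}"

    def __repr__(self):
        return self.label


def union(g1, g2):
    """Plain set union: shared blank nodes stay shared."""
    return g1 | g2


def merge(g1, g2):
    """Union after replacing every blank node of g2 with a fresh one."""
    renaming = {}

    def fresh(term):
        if not isinstance(term, BNode):
            return term
        if term not in renaming:
            renaming[term] = BNode()
        return renaming[term]

    return g1 | {tuple(fresh(t) for t in triple) for triple in g2}


b = BNode()
g1 = {("ex:John", "ex:child", b)}       # part of the graph
g = g1 | {("ex:Mary", "ex:child", b)}   # the whole graph

print(union(g1, g) == g)  # True: the union gives back the original graph
print(len(merge(g1, g)))  # 3: the merge adds a renamed copy of both triples
```

Combining a part of the graph with the graph itself by union returns the original two triples, while the merge yields three triples in which John's child in `g1` is no longer tied to the child shared by John and Mary.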
>>>>> 
>>>>> So combining two RDF graphs that share blank nodes is no longer logically equivalent to conjunction.  So what?
>>>>> 
>>>>> So nothing!  If the two RDF graphs don't have some inherent connection then they really can't share blank nodes, so combination is conjunction.  If they do share blank nodes then they have some inherent connection and it should not be much of a surprise that their combination might not be conjunction.
>>>>> 
>>>>> peter
>>>>> 
>>>>> PS:  It should be possible to come up with a more-complex semantics that captures some stronger intuition about blank nodes, but there is then the distinct possibility of ruling out some existing or potential use of blank nodes.  Of course the way around this is to expand the expressive power of RDF (to, for example, include explicit existential quantification), but I'm pretty sure that no one wants to go there at this time.
>>>> 
>>>> 
>>>> 
>>> 
>> 
>> 
>> 
> 
> ------------------------------------------------------------
> IHMC                                     (850)434 8903 or (650)494 3973   
> 40 South Alcaniz St.           (850)202 4416   office
> Pensacola                            (850)202 4440   fax
> FL 32502                              (850)291 0667   mobile
> phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes

Received on Wednesday, 19 June 2013 17:36:19 UTC