Re: Editorial changes in Section 2.5 from Pat Hayes on 2006-01-30 (public-rdf-dawg@w3.org from January to March 2006)

From: Pat Hayes <phayes@ihmc.us>
Date: Mon, 30 Jan 2006 12:29:31 -0600
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>
Message-Id: <p06230900c003fc35a3d4@[192.168.2.2]>
>On 28 Jan 2006, at 07:09, Pat Hayes wrote:
>>>I take it that G is
>>>
>>>:a :p _:bbb .
>>>:a :q :e .
>>
>>OK, though G does not play any role here.
>>
>>>and that G' is
>>>
>>>:a :p _:b .
>>>:a :q :e .
>>
>>Yes
>>
>>>
>>>and that BGP is
>>>
>>>{?x :q _:b}
>>
>>Yes. Notice that they share a blank node.
>>
>>>
>>>and that BGP' is
>>>
>>>{?x :q _:bb}
>>
>>Yes
>>
>>>
>>>and that the instance S(BGP') is
>>>
>>>:a :q _:bb.
>>
>>Yes
>>
>>>
>>>and anyhow find that G simply entails (G' union S(BGP'))
>>>i.e.
>>>
>>>:a :p _:bbb.
>>>:a :q :e.
>>>
>>>simply entails
>>>
>>>:a :p _:b.
>>>:a :q :e.
>>>:a :q _:bb.
>>>
>>>no?
>>
>>Yes, exactly. But it should not, because in this case G does NOT 
>>simply entail (G' union S(BGP)).
>
>And why it should?

So that G should entail the CONSTRUCT graph, which is built out of 
copies of Si(BGP)

>Where is it written? Remember that in the old characterisation G' is 
>required not to share any node with BGP, but in this example this is 
>not true since they share _:b.

Perhaps I missed this point. The new definition explicitly allows G' 
to share bnodes with BGP, but ensures that this sharing cannot 
influence the answer set. Is that right? So this extra freedom - 
which is illusory, since relaxing a constraint on a purely formal 
parameter like G' has no detectable effect - cannot be utilized by 
some hypothetical told-bnode-extension; so it seems to have no actual 
operational utility whatsoever, even to someone whose agenda might be 
to undermine a WG decision by stealth. So why make that change? It 
certainly makes the definition more complicated, it achieves nothing 
by way of defining anything better, and I am sure that it will at the 
least make the definition of CONSTRUCT more difficult, since that 
will now need to ensure explicitly that any accidental bnode clashes 
between BGP and the bnodes introduced into answer bindings from G' 
are eliminated. These were impossible with the previous definition, 
but (as my example was intending to show) are now possible once again.

>What is important is that {?x/:a} *is* a solution in both the old 
>and the latest characterisation - this is what we are characterising 
>in the definition of BGP E-matching. See my example below.
>
>>So in this case, that basic graph pattern should NOT match with 
>>that solution. This answer is correct for BGP' , but it is not 
>>correct for BGP. Which is what matters, since the definition is 
>>supposed to be defining a match for BGP.
>
>This is nonsense.

You are missing my point. No doubt this is my fault for not 
explaining it better. Of course in any one case, this definition will 
give the same answer bindings. But it allows scope leaks between G', 
the source of potential answer bnodes in answer bindings, and BGP 
itself. The example was supposed to indicate such a case. These are 
irrelevant for a single answer, but they become relevant when answers 
are combined.

<snip>

>
>>The point is that we introduced G' in order, partly, to be able to 
>>ensure that there were no 'accidental' bnode clashes between the G 
>>and the BGP (and in part to ensure that all the answer bindings 
>>used bnodes consistently with one another and with their pattern of 
>>usage in G.)
>
>Unluckily enough, in explaining your example you didn't take care of 
>this important aspect.
>
>>The original phrasing, in which we simply said that (G entails (G 
>>union S(BGP))), was wrong, as Enrico noted, because G and BGP might 
>>accidentally share bnodes, so there is a need to standardize them 
>>apart.
>>FUB suggested the directed merge trick to fix this; and then we 
>>introduced the scoping graph G', and noticed that since G' could be 
>>stipulated by definition to be standardized apart from all the 
>>BGPs, there was no need to use a directed merge, because now a 
>>simple union would do.
>
>Exactly. And also with our latest characterisation, we make sure 
>that there are no clashes by renaming the clashing existential 
>variables (bnodes) in BGP - and we call BGP' this renamed BGP.

The point of my example was to show that this does not make sure that 
there are no clashes between G' and BGP itself; and this will be important 
when one tries to put together several answers into one document, as 
with CONSTRUCT.
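
The renaming step under discussion can be sketched roughly as follows. This is only an illustration in Python, assuming graphs and patterns are sets of string triples and that bnodes are the strings beginning with "_:"; the helper names bnodes and rename_apart and the fresh labels "_:r0", "_:r1", ... are hypothetical and not taken from any draft.

```python
def bnodes(graph):
    """Collect the blank-node labels (strings starting with "_:") in a set of triples."""
    return {t for triple in graph for t in triple if t.startswith("_:")}

def rename_apart(bgp, scoping_graph):
    """Return a variant BGP' of bgp in which every bnode that also occurs in
    scoping_graph is replaced by a fresh label used in neither."""
    used = bnodes(bgp) | bnodes(scoping_graph)
    renaming = {}
    counter = 0
    for b in sorted(bnodes(bgp) & bnodes(scoping_graph)):
        fresh = f"_:r{counter}"
        while fresh in used:
            counter += 1
            fresh = f"_:r{counter}"
        used.add(fresh)
        renaming[b] = fresh
    return {tuple(renaming.get(t, t) for t in triple) for triple in bgp}

g_prime = {(":a", ":p", "_:b"), (":a", ":q", ":e")}   # scoping graph G'
bgp = {("?x", ":q", "_:b")}                           # BGP, sharing _:b with G'
bgp_prime = rename_apart(bgp, g_prime)                # BGP', standardized apart

print(bnodes(bgp_prime) & bnodes(g_prime))  # no shared bnodes
print(bnodes(bgp) & bnodes(g_prime))        # BGP itself still shares _:b
```

The renamed pattern is separated from G', but nothing in this step changes BGP itself, which is the pattern that CONSTRUCT later instantiates.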

But perhaps the more important point is that there was absolutely 
no need to introduce BGP' to ensure that there are no bnode clashes 
between BGP and G': that is already guaranteed by the definition 
of G' itself. By imposing the separation condition between G' and 
BGP' rather than between G' and BGP itself, you have, ironically, 
removed the useful force of this condition, by permitting G' to have 
real bnode clashes with BGP itself. There was no problem there to be 
solved, but this solution to a nonexistent problem has broken the 
mechanism that supported a useful definition of CONSTRUCT, which of 
course refers to the result of applying an answer binding to BGP.

>We can legally do so since bnodes in BGP are scoped only locally to 
>BGP, and since BGP' differs from BGP only because there may be 
>different bnode names. BGP and BGP' are SPARQL-equivalent since they 
>give rise always to the same answers (in the old characterisation as 
>well). So, once I take an appropriate BGP', and I make sure that 
>BGP' does not share any bnode name with G', I satisfy the same 
>conditions as in the old characterisation.

That is true for this local answer definition. It doesn't show that 
the complication was necessary, and indeed it isn't, but I've already 
made that point. But now look at how the CONSTRUCT graph is defined. 
At this point, we have to take the answer bindings and apply them to 
variants of BGP. What prevents these from sharing bnodes with answer 
bindings from G'? After all, BGP can. If A is a legal answer binding, 
on this definition, it does not even follow that G entails (G' union 
A(BGP)). But in a simple case, A(BGP) might actually be the entire 
CONSTRUCT graph.
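
To make the example from earlier in this thread concrete, here is a minimal brute-force sketch in Python. It assumes graphs are sets of string triples, that bnodes are the strings beginning with "_:", and that simple entailment can be tested via the interpolation lemma (G simply entails H when some instance of H is a subgraph of G). The function name simply_entails is hypothetical, and the exhaustive search is only practical for toy graphs like these.

```python
from itertools import product

def simply_entails(g, h):
    """Brute-force test that g simply entails h: try every mapping from the
    bnodes of h to terms of g and check whether the instance is a subgraph of g."""
    h_bnodes = sorted({t for triple in h for t in triple if t.startswith("_:")})
    g_terms = sorted({t for triple in g for t in triple})
    for values in product(g_terms, repeat=len(h_bnodes)):
        mapping = dict(zip(h_bnodes, values))
        instance = {tuple(mapping.get(t, t) for t in triple) for triple in h}
        if instance <= g:
            return True
    return False

g = {(":a", ":p", "_:bbb"), (":a", ":q", ":e")}      # G
g_prime = {(":a", ":p", "_:b"), (":a", ":q", ":e")}  # G'
a_bgp_renamed = {(":a", ":q", "_:bb")}               # {?x/:a} applied to BGP'
a_bgp = {(":a", ":q", "_:b")}                        # {?x/:a} applied to BGP

# A plain set union, not a merge: a shared label denotes the same bnode.
print(simply_entails(g, g_prime | a_bgp_renamed))
print(simply_entails(g, g_prime | a_bgp))
```

In the second union the shared _:b would have to be both the object of :p and the object of :q in G, and G has no such node, so the entailment fails for BGP even though it holds for BGP'.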

>By the way, this is an informal proof of the correctness of the 
>latest characterisation.
>
>>But by requiring the bnode-separation condition to hold not between 
>>G' and BGP (as it should), but between G' and some variant BGP' of 
>>BGP, we have rendered this condition vacuous, since BGP itself can 
>>now share bnodes with G', which is exactly the situation we set up 
>>all this machinery to avoid in the first place.
>>
>>I did not misread the definition: I suggest that y'all read through 
>>the above carefully, and think about it.
>
>We did it carefully. It is easy to see that the latest 
>characterisation is correct, see above.
>Convinced?

No. But I am getting tired of arguing with you. Go ahead and insert 
in the SPARQL spec what is likely to be the most inelegant, 
incomprehensible and irrelevant definition yet included in any W3C 
document. I am fairly sure it is also, in its present form, subtly 
wrong; but frankly, at this stage, whether it is wrong or right is of 
relatively minor importance.

<owl:Class rdf:ID="implementers">
    <owl:Restriction>
      <owl:onProperty rdf:resource=":hasOrWillImplement" />
      <owl:minCardinality rdf:datatype="number">1</owl:minCardinality>
      <owl:allValuesFrom rdf:resource=":SPARQLimplementations" />
    </owl:Restriction>
    <owl:disjointWith>
        <owl:Restriction>
          <owl:onProperty rdf:resource=":reads" />
          <owl:hasValue rdf:resource=":theDefinition" />
        </owl:Restriction>
    </owl:disjointWith>
</owl:Class>

Pat


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 30 January 2006 18:29:49 UTC