Re: Wording change (was : Re: Final text for Basic Graph Patterns)

From: Pat Hayes <phayes@ihmc.us>
Date: Thu, 19 Jan 2006 12:32:05 -0600
Message-Id: <p06230902bff57e0966c1@[10.100.0.23]>
To: Enrico Franconi <franconi@inf.unibz.it>
Cc: RDF Data Access Working Group <public-rdf-dawg@w3.org>

>On 19 Jan 2006, at 00:15, Pat Hayes wrote:
>>Anyway, see what y'all think of that. We have until next Tuesday to 
>>argue about it :-)
>
>As it is, it does not work. Now, please, allow me the following. I am serious.
>
><rant>
>To be honest, you can not pretend to change your mind every few 
>days, throw at us super-verbose sloppy definitions, have me read 
>them, enter in your head, understand, and then I have to say that 
>they do not work. It's going on this way since three months, and I 
>simply can't spend too much time with you, OK?

Yes, I sympathize, and apologize again for not thinking of this 
simplification sooner. It isn't a change of mind, but arose from an 
attempt to dream up an explanation of why your definition was couched 
in the strange way that it is couched. (BTW, I have the greatest 
admiration for the ingenuity of that definition. But it's hard to 
grasp its full implications and operational requirements at a first 
reading, or even at about the fifth reading.)

>It's almost 3 in the night, I spent almost one hour to answer, and I 
>am tired, I already fixed your yesterday's version this morning.

And I responded to your suggestions.

>Do you want to see me dead?

I had better not reply to THAT question :-). But seriously, WG 
work is like this. One has to chew at the details over and over again 
to get things as tidy as possible in the final document. And for my 
part, I have made repeated attempts (also using up evenings when I 
have to earn my living in the daytime :-) to draft some prose 
explaining to a potential reader WHY your definition works, and why 
more apparently intuitive versions of it don't work, and the result 
is never very clear or persuasive, since there is no easy way to 
explain why one needs the 'inner' graph or why the mapping to the 
variables in BGP needs to be applied to the entire graph (and of 
course it doesn't: this is just an artifact of the mathematical 
notation.)

></rant>
>
>The rant is to let you know about my inability to work in this way, 
>and it is not intending to judge your behaviour.

Sorry I'm causing you grief. Think of me as the chrome on the front of 
the fender, and bear in mind that it's the truck that you really need 
to worry about. You might finish up spending the next 3 years giving 
long email explanations to implementors of exactly why their code 
doesn't quite conform to your mathematics.

>>that we have some leeway with specifying the scoping graph which we 
>>can utilize, in particular we can require that the scoping graph be 
>>standardized apart from the BGPs. So suppose it is. Then we can 
>>replace
>>
>>S(G' OrderedMerge BGP)
>>
>>by
>>
>>(G' union S(BGP))
>>
>>in the definitions, which is clearer and more intuitive.
>
>You can't have the told-bnode case anymore with your wording.

We can without changing the actual formal definition, see below; but 
in any case, the WG has taken a clear decision to reject told bnodes. 
That decision does indeed simplify the necessary mechanisms for 
handling bnode scopes, and I think it makes sense to utilize this 
where we can. But read on.

>Since it's so late in the night, I leave the easy proof of this as 
>an exercise to the students.
>Hint: the told-bnode case is about coreferencing bnodes in the query 
>with bnodes in the graph; without a formal function relating the two 
>(in our case, the OrderedMerge, where you can fix some of the 
>renamings - see our 2nd November document) you can not express 
>formally this relationship. I am sure that you understand the 
>scoping graph to avoid the merge, but not having the merge anymore 
>does not allow you to define which bnodes should not be renamed, 
>when they play the role of told bnodes in the query.

If you look carefully, you will see that no merging takes place at 
all: there is a simple union. We understand that S does not apply to 
bnodes, only to query variables. Hence, no bnodes are renamed by any 
part of this definition. Issues of keeping bnode scopes straight are 
handled by the surrounding text and ancillary definitions, on 
co-occurrences of bnodes between G' and BGP, which we require to be 
standardized apart. Right now, that could be done either by keeping 
G' clear of the bnode vocabulary from Q, or by modifying the query 
bnodes. But of course if one wanted to keep G' fixed, then it would 
have to be done by tweaking Q. Now, we could weaken this condition to 
a [should] rather than a [must], and further remark that *if* the 
same G' were used for a number of queries - for example, if G' were G 
- then any bnodeID from G' which was re-used in a later query BGP 
*which was not standardized apart from G'* would be in the scope of 
G' (because of the union), and so would be identified with the 
original bnode in the scoping graph. This is exactly a told bnode. We 
could say that servers [should not] do this and [must] declare it if 
they do, or some such cautionary wording. We could call it a 'scoping 
leak'. If we use the [RFC2119] terminology carefully this would allow 
folk to use SPARQL with all-told-bnodes in a close-knit transaction 
without actually being illegal. It doesn't provide for distinguishing 
told from plain bnodes in the language or protocol, but we have never 
seriously considered that option.
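
The mechanics described above can be illustrated with a small sketch. 
This is not text from any WG draft; it assumes a toy encoding in which 
a graph is a Python set of string triples, bnode labels start with 
"_:", query variables start with "?", and S is a dict from variables 
to terms. The helper names (apply_solution, standardize_apart, 
consequent) and the fresh-label scheme "_:qN" are hypothetical, chosen 
only for illustration.

```python
from itertools import count

def is_bnode(term):
    return term.startswith("_:")

def is_var(term):
    return term.startswith("?")

def bnodes(graph):
    """All bnode labels occurring anywhere in a set of triples."""
    return {t for triple in graph for t in triple if is_bnode(t)}

def apply_solution(S, bgp):
    """S(BGP): substitute query variables only; bnodes are never renamed."""
    return {tuple(S.get(t, t) if is_var(t) else t for t in triple)
            for triple in bgp}

def standardize_apart(bgp, g_prime):
    """Keep G' fixed and tweak the query: rename clashing BGP bnodes."""
    used = bnodes(g_prime) | bnodes(bgp)
    fresh = (f"_:q{i}" for i in count())  # hypothetical fresh-label scheme
    renaming = {}
    for label in sorted(bnodes(bgp) & bnodes(g_prime)):
        new = next(l for l in fresh if l not in used)
        used.add(new)
        renaming[label] = new
    return {tuple(renaming.get(t, t) for t in triple) for triple in bgp}

def consequent(g_prime, S, bgp):
    """(G' union S(BGP)): a plain set union, no merge and no renaming."""
    return set(g_prime) | apply_solution(S, bgp)

g_prime = {("_:a", "ex:knows", "ex:bob")}
bgp = {("_:a", "ex:knows", "?x")}        # re-uses the label _:a from G'
S = {"?x": "ex:bob"}

# Not standardized apart: _:a falls in the scope of G' (a 'scoping leak'),
# so it is identified with the bnode in G' and behaves as a told bnode.
leaky = consequent(g_prime, S, bgp)

# Standardized apart: the query bnode gets a fresh label, so no sharing.
apart_bgp = standardize_apart(bgp, g_prime)
clean = consequent(g_prime, S, apart_bgp)
```

In the leaky case the union collapses to the single triple already in 
G', because the query's _:a is the very same node; in the standardized 
case the union holds two triples with distinct bnodes.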

>On the other hand, the only role for which I find the scoping graph 
>useful is because you want to distinguish the told-bnode case with 
>the one where bnodes can be arbitrary.

Its chief utility for me is that it keeps the definitions clearer and 
easier to understand, by drawing attention to the reason for 
including a G-clone in the consequent, and by separating out 
entailment from scoping issues.

>Of course, the equivalence theorem with the subgraph matching 
>implementation (done by most systems, and standard in the LC design) 
>holds only in the case of the told-bnode (i.e., when G=G'). Ah.
>
>Is it fixable? Who knows, probably yes, with some other lengthy 
>proposal you may throw at us.

See above. I honestly do not think this is either lengthy or hard to 
state precisely, or to understand. It comes up naturally as a kind of 
side observation in the text that justifies why BGP and G' should be 
standardized apart. It is entirely about allowing bnodes in G' and 
BGP to be in a shared scope, or not; and this is exactly how it would 
be stated. It isn't done by an elegant mathematical conjuring trick 
(which, to repeat, as one definition-hacker to another, I actually 
quite admire, but find hard to explain.)

>And by the way, I wouldn't call this only a wording change ;-)

Well, if you count mathematics as wording it is :-). But it is not 
what Dan C. would call a 'substantive' change: it doesn't affect any 
test cases.

Pat

-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Thursday, 19 January 2006 18:32:24 GMT
