Bnodes redux. (was: Re: [TF:DbE] The easiest keys there are) from Pat Hayes on 2007-10-03 (public-owl-dev@w3.org from October to December 2007)

From: Pat Hayes <phayes@ihmc.us>
Date: Wed, 3 Oct 2007 16:12:39 -0500
To: Bijan Parsia <bparsia@cs.man.ac.uk>
Cc: Owl Dev <public-owl-dev@w3.org>, "Obrst, Leo J." <lobrst@mitre.org>
Message-Id: <p0623090dc329a8e5658d@[10.100.0.30]>

>Leo,
>
>If people want to pursue this, I suggest relocating under a different thread.

Done, above.

>I'll tie this back to keys as a wrap up.
>
>On 3 Oct 2007, at 03:17, Obrst, Leo J. wrote:
>
>>Ah, I never realized this, but I think you are quite right:
>>
>>"... then all bnodes are existentially bound by quantifiers "at
>>infinity",
>>i.e. with a scope which effectively extends over the entire Web.
>>Which makes them indistinguishable from names, semantically."
>
>If this is true (and relevant), then there is no harm in changing 
>the semantics to one I favor (i.e., that they *are* names). If this 
>were true (*and* relevant), why not simply accommodate my quirkiness?

Well, I'm happy, but you might not be. Semantically they are like 
names, but syntactically they are very odd names indeed. You can't 
take a bnode in one graph and put it, or copy it, into a different 
graph; and you can't, strictly speaking, bind to it: you can only 
bind to a BNodeID, which really only tells you that an appropriate 
Bnode is there in the graph somewhere. Even if I agree to let your 
BnodeID scope extend into my graph descriptions (which I probably 
won't, if I have any sense), this amounts to giving you a pointer to 
the bnode in my graph. The thing it points to, the actual node, is in 
my graph, and it stays there. You can't copy it, or do anything to 
it, other than say it exists (which amounts to using YOUR own 
existential) or maybe, if I am feeling very generous, keep a pointer 
into my graph that lets you look at that node and explore around it.
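To make the node/BnodeID distinction concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the `BNode` and `Graph` classes, the `bnode()` method, and the `ex:` names are all hypothetical and not part of any RDF library. A blank node is modelled as a bare object with identity and nothing else; a BnodeID is just a graph-local label that points at one.

```python
class BNode:
    """A blank node: no name, no content, only identity (hypothetical class)."""
    __slots__ = ()


class Graph:
    """A toy graph holding triples plus a graph-local BnodeID table."""

    def __init__(self):
        self.triples = set()
        self._ids = {}  # BnodeID (str) -> BNode; the scope is this graph only

    def bnode(self, local_id):
        """Return the node that local_id labels here, creating it on first use."""
        return self._ids.setdefault(local_id, BNode())

    def add(self, s, p, o):
        self.triples.add((s, p, o))


mine = Graph()
yours = Graph()

# Both graphs happen to use the BnodeID "x".
x_mine = mine.bnode("x")
x_yours = yours.bnode("x")
mine.add("ex:s", "ex:p", x_mine)
yours.add("ex:s", "ex:p", x_yours)

# Inside one graph, the ID reliably points at the same node...
same_id_same_graph = mine.bnode("x") is x_mine

# ...but the same ID in another graph labels a different node entirely.
same_id_other_graph = x_mine is x_yours
```

Under these assumptions, holding `x_mine` from outside `mine` is exactly the "pointer into my graph" situation: the reference can be followed, but the node itself has no portable name that could be written into another graph.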

>In point of fact, we don't work with the quantifiers at infinity. If 
>you want to evaluate a query, for example, the variables in the 
>query are distinct from the variables in the dataset, and you often 
>(always?) scope the dataset.

The query variables have local scopes, of course. I was only 
referring to bnodes.

>If this were relevant, then I would have no worries. BNodes wouldn't 
>make reasoning (and specification) so much harder. No one would or 
>should care about using skolem constants, for example, since it 
>wouldn't be detectable.

The only relevant difference is that URIs are understood to actually 
be identifiers, whereas Bnodes are, well, *blank*. The only question 
you can ask about a bnode is, where is it located? You never ask of 
two bnodes, are these the same bnode? They can't possibly be, if 
there are two of them. Contrast names, where you can take two 
occurrences (tokens) of names and sensibly ask, are these the same 
name?

>There are always circumstances wherein you can detect a feature or a 
>feature of a logic has impact and circumstances where it does not. 
>In point of fact, BNode semantics have distorted such simple logics 
>as RDFS (i.e., certain obvious entailments don't hold because of 
>accommodations made for Bnodes...domain and range inheritance come to 
>mind

? What exactly are you referring to here? I'm not aware of any 
bnode/domain+range interactions. Domains and ranges are messy 
because the languages are monotonic (and RDFS has no negation or 
complement).

>; plus, bnodes can make query answering much harder esp. in RDFS 
>circumstances).

Indeed.

>
>If no one minds that we treat:
>
>	s p _:x.
>
>radically differently from:
>	s rdf:type [a owl:Restriction;
>		owl:onProperty p;
>		owl:someValuesFrom owl:Thing].

Of course they are radically DIFFERENT. Just LOOK at them. Did you 
mean, as not meaning the same thing? Because that is a very different 
thing to say.

>(And, to be clear, we're treating them differently when considered 
>as separate, closed graphs and testing for logical equivalence.

Ah, well, if you insist on having logical entailment incorporated 
into querying, then yes you will have problems. But this isn't 
particularly a bnode issue: it arises whenever your logic is 
nontrivial (pretty much anything past positive propositional logic 
without disjunction.) Bnodes get the bad rap in these forums because 
they are the first place in the RDF world where this happens. My own 
view is that we should treat querying as basically a dirty pragmatic 
syntactic pattern-matching business, and add the logical power we 
want as a separate layer. This doesn't solve any real hard problems, 
of course, but at least it keeps the issues separate, and it allows 
people who want to do quick hacks to get on with their lives in the 
real real world.
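A minimal sketch of what "syntactic pattern-matching" means for the two graphs quoted earlier, written in Python. Everything here is an assumption made for illustration: triples are plain tuples of strings, blank nodes are stood in for by BnodeID strings such as `_:x`, variables start with `?`, and `match` is a hypothetical helper, not the API of any query engine.

```python
def is_var(term):
    """Query variables are written ?name in this sketch."""
    return isinstance(term, str) and term.startswith("?")


def match(pattern, graph):
    """Yield one binding dict per triple that matches the pattern syntactically."""
    for triple in graph:
        binding = {}
        for p_term, t_term in zip(pattern, triple):
            if is_var(p_term):
                # A variable binds to whatever term occupies that position;
                # a repeated variable must bind to the same term each time.
                if binding.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                break
        else:
            yield binding


# Graph 1: s p _:x .
g1 = [("ex:s", "ex:p", "_:x")]

# Graph 2: s rdf:type [a owl:Restriction; owl:onProperty p; owl:someValuesFrom owl:Thing] .
g2 = [
    ("ex:s", "rdf:type", "_:r"),
    ("_:r", "rdf:type", "owl:Restriction"),
    ("_:r", "owl:onProperty", "ex:p"),
    ("_:r", "owl:someValuesFrom", "owl:Thing"),
]

pattern = ("ex:s", "ex:p", "?x")
bindings_g1 = list(match(pattern, g1))
bindings_g2 = list(match(pattern, g2))
```

With no entailment layer, the pattern finds a term to bind in the first graph and nothing at all in the second, simply because the second graph contains no triple of that shape. Any logical power would be added on top of this as a separate layer.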

>  The fact that I have to make such qualifications speaks *volumes* 
>about this debate.) Then, frankly, I don't care *what* you call the 
>bnode semantics.
>
>Similarly, if we allow bnode to be bound to the variables of DL Safe 
>rules, we need to explain why there is *no* such binding in certain 
>equivalent cases (see above).

Because there is nothing to bind to in the second case. (Duh.) IMO 
what we all need to do is, stop pretending that logically equivalent 
expressions should be functionally indistinguishable. Treating 
logical equivalence as little more than a surface normalization is 
fine as long as your logic is so trivial (e.g. RDF without bnodes) that 
it's true. But it's emphatically not true for most logics; and then, 
two expressions (or queries) can be logically equivalent yet 
functionally very, very non-equivalent. Put another way, it really 
does matter how you express yourself. Put yet another way, logical 
equivalence is nowhere near the free ride that calling it "... 
indistinguishable" seems to imply it ought to be.

>In order for the DL Safe rule semantics for keys to work for the 
>very common foaf case, we need to explain why we can bind to _:x 
>(or to []) but not to anything coming from owl existential 
>quantifiers

If OWL provides an actual name/variable/term/Bnode to bind TO, then 
we would have something to explain. Until then, however, the 
explanation is trivial (and purely syntactic.)

>*and* to explain why this doesn't affect the various desirable 
>properties of DL Safe rules. If the answer is "Because they are 
>indistinguishable from 'local' names"

I still don't know what you mean by a "local name". Local implies a 
scope: but that only makes sense (*) if there is a binder to define 
that scope, so why aren't you calling this a "local variable"?

I must be going wrong at (*) above, but I don't see where.

>then we must show that this is *in fact* the case in the 
>circumstances that we face. Frankly, it's a non-trivial story given 
>the current semantics since the above two cases are semantically 
>indistinguishable.

So what? OF COURSE there will be "semantically indistinguishable" 
(misleading terminology) cases which are syntactically and 
functionally distinguishable. There are logical truths that are 
"logically indistinguishable" from T but have arbitrarily high 
computational complexity.

>So this ripples throughout everything we do, making us have to do 
>lots of work that is really unnecessary. What's the benefit? If they 
>act like local names, let's make that *clear* by using a semantics 
>that aligns exactly with how they act.

You are looking under the wrong lamppost, I think. What makes bnodes 
"bound at infinity" isn't their semantics, but their (lack of) 
syntax. They don't have any provision for having multiple copies of 
themselves: they can't co-occur; they are unique and singular, each 
distinct from all the others. There is no type/token distinction for 
bnodes. If two RDF graphs are disconnected from one another, then 
they cannot possibly both contain the same bnode. So the notion of 
'scope' that is so important for conventional bound names, and which 
goes along with distinctions like local/global, simply does not arise 
for bnodes. They are like names that can only ever be used once, and 
are architecturally guaranteed to be distinct from one another. The 
ultimate gensym, if you like, which refuses to allow itself to be 
reproduced or copied.

None of this applies to BnodeIDs, of course, but rather to the bnodes 
themselves. But the RDF semantics doesn't mention BnodeIDs.
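One place where the bnode/BnodeID difference shows up in practice is when two graph documents are combined. The following is a rough Python sketch, assuming triples are tuples of strings and blank nodes appear only as BnodeID strings starting with `_:`; the `merge` function, the fresh-ID scheme, and the example names are hypothetical. Because the IDs are only document-local labels, a merge has to rename them apart so that two distinct nodes are not accidentally identified.

```python
from itertools import count


def merge(*graphs):
    """Union of graphs in which BnodeIDs from different graphs are renamed apart."""
    fresh = count()
    merged = set()
    for graph in graphs:
        renaming = {}  # per-graph table: old BnodeID -> fresh BnodeID

        def rename(term):
            if term.startswith("_:"):
                if term not in renaming:
                    renaming[term] = f"_:b{next(fresh)}"
                return renaming[term]
            return term  # URIs are real identifiers and are left untouched

        for s, p, o in graph:
            merged.add((rename(s), p, rename(o)))
    return merged


# Two separate documents that both happen to use the label _:x.
g1 = [("ex:alice", "foaf:knows", "_:x")]
g2 = [("ex:bob", "foaf:knows", "_:x")]

merged = merge(g1, g2)
objects = {o for _, _, o in merged}
```

In this sketch the merged graph ends up with two different blank-node labels, reflecting that the two source graphs could not have shared a bnode in the first place. A URI appearing in both inputs would, by contrast, stay the same term.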

Pat

>If you use a semantics that is *stronger* than how they act in some 
>(paradigm cases) then you are telling people who are building 
>extensions that they should respect that semantics in circumstances 
>where they are distinguishable.
>
>Cheers,
>Bijan.


-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Wednesday, 3 October 2007 21:13:01 UTC