Blank nodes, leaning, and the OWA from Gregg Reynolds on 2011-03-27 (semantic-web@w3.org from March 2011)

From: Gregg Reynolds <dev@mobileink.com>
Date: Sun, 27 Mar 2011 00:13:31 -0500
To: SW-forum Web <semantic-web@w3.org>
Message-ID: <AANLkTik1g5fjJchxUuGofjo_mh5mfK6FV1x_L3cK=khF@mail.gmail.com>

I'm having trouble reconciling RDF's handling of blank nodes and the Open
World Assumption.  I suppose I'm still not entirely grokking leaning and/or
OWA.  I searched the archives and didn't find anything addressing my
questions.  My reasoning follows; where are the flaws?

As I understand the OWA, if we have a node, all we know is that we have a
node; we do not know what properties it may have.  If we also know that it
has property A, we cannot infer that it does not also have property B for
any B.  Followed to its logical conclusion, this line of reasoning leads to
the conclusion that there can only ever be one blank node in any graph.

For example, suppose

[1] a)  <ex:Pedro ex:owns _:x>, <_:x rdf:type ex:Donkey>, <_:x ex:name ex:Daisy>
    b)  <ex:Pedro ex:owns _:y>, <_:y rdf:type ex:Donkey>, <_:y ex:name ex:Maisy>

then we've made two assertions (ok, six), but we have not necessarily
asserted that Pedro owns two donkeys.  By the OWA we cannot infer _:x !=
_:y.  It could be that Pedro owns two donkeys, named Daisy and Maisy,
respectively, but it also could be that Pedro owns one donkey with two
names.

Is the graph of [1] lean?  It seems to me that under the OWA the answer must
be that we do not know, just as we don't know if Pedro owns one or two
donkeys.  Under RDF semantics the answer could be yes or no, depending on
which model we choose.  But the principle of leaning as written (if I
understand it) compels us to treat it as non-lean, since a model with a
single node named both Daisy and Maisy works for both [1] a) and [1] b); the
leaned version would look like:

[2]  <ex:Pedro ex:owns _:x>, <_:x rdf:type ex:Donkey>,
     <_:x ex:name ex:Daisy>, <_:x ex:name ex:Maisy>

Similar considerations would apply to all blank node IDs in a graph: a model
with a single (blank) node (with appropriate properties) would work for each
such blank node ID.
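
The collapse from [1] to [2] can be written out mechanically. The sketch
below is only an illustration: it assumes triples are encoded as tuples of
strings, that anything starting with "_:" is a blank node, and that
"subgraph" means "subset of the graph's triples" (which is how RDF Semantics
defines it). The helper name apply_mapping is made up for this example and
does not come from any RDF library.

```python
# Graph [1]: triples as (subject, predicate, object) tuples of strings.
G1 = {
    ("ex:Pedro", "ex:owns", "_:x"),
    ("_:x", "rdf:type", "ex:Donkey"),
    ("_:x", "ex:name", "ex:Daisy"),
    ("ex:Pedro", "ex:owns", "_:y"),
    ("_:y", "rdf:type", "ex:Donkey"),
    ("_:y", "ex:name", "ex:Maisy"),
}

# Graph [2]: the proposed "leaned" version.
G2 = {
    ("ex:Pedro", "ex:owns", "_:x"),
    ("_:x", "rdf:type", "ex:Donkey"),
    ("_:x", "ex:name", "ex:Daisy"),
    ("_:x", "ex:name", "ex:Maisy"),
}


def apply_mapping(graph, mapping):
    """Replace blank nodes according to mapping; other terms are kept."""
    return {tuple(mapping.get(term, term) for term in triple) for triple in graph}


# Collapse _:y onto _:x, as in the single-donkey reading.
merged = apply_mapping(G1, {"_:y": "_:x"})

print(merged == G2)         # True: the collapsed graph is exactly [2]
print(merged <= G1)         # False: [2] is not a subset of [1]'s triples
print(sorted(merged - G1))  # the triple in [2] that [1] does not contain
```

Under these assumptions [2] is an instance of [1] but not a subgraph of it,
because <_:x ex:name ex:Maisy> does not occur in [1].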

If this is not the case -- if graphs like [1] are construed as lean, with
_:x != _:y -- then it looks to me like leaning involves an implicit Closed
World Assumption.  I.e. if _:x is named Daisy it is not also named Maisy.

Now consider

[3] <ex:Pedro ex:owns _:x>, <ex:Pedro ex:owns _:y>

According to the RDF semantics [3] is not lean, so it can be reduced to a
single triple <ex:Pedro ex:owns _:z>.  That's because the model theoretic
semantics mean that a single node can satisfy both clauses of [3].  But
under OWA, we cannot infer that no properties have been asserted of _:x and
_:y, which must mean that the node satisfying [3] can have any properties or
no properties.  The same principle must apply wherever blank node IDs occur,
which again leads to the conclusion that all blank node IDs in a graph can
be collapsed, as it were, to a single node.  In other words, it looks to me
like the principle of leaning, if valid, implies a maximum of one blank node
per graph.
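
For comparing [1] and [3], here is a minimal brute-force sketch of the
leanness test as RDF Semantics words it: a graph is lean if no instance of
it is a proper subgraph of it. The tuple encoding, the "_:" prefix test and
the function names are assumptions made for this example, not an existing
API. The search only tries terms already in the graph as replacement
targets, since an instance that introduces a new term cannot be a subgraph.

```python
from itertools import product


def is_blank(term):
    """Assumed convention: blank node IDs start with '_:'."""
    return term.startswith("_:")


def proper_subgraph_instances(graph):
    """Yield every instance of graph that is a proper subset of its triples."""
    blanks = sorted({t for s, _, o in graph for t in (s, o) if is_blank(t)})
    terms = sorted({t for s, _, o in graph for t in (s, o)})
    for images in product(terms, repeat=len(blanks)):
        mapping = dict(zip(blanks, images))
        instance = {(mapping.get(s, s), p, mapping.get(o, o)) for s, p, o in graph}
        if instance < graph:  # strict subset, i.e. proper subgraph
            yield instance


def is_lean(graph):
    """True if no instance of graph is a proper subgraph of graph."""
    return next(proper_subgraph_instances(graph), None) is None


G1 = {
    ("ex:Pedro", "ex:owns", "_:x"),
    ("_:x", "rdf:type", "ex:Donkey"),
    ("_:x", "ex:name", "ex:Daisy"),
    ("ex:Pedro", "ex:owns", "_:y"),
    ("_:y", "rdf:type", "ex:Donkey"),
    ("_:y", "ex:name", "ex:Maisy"),
}
G3 = {
    ("ex:Pedro", "ex:owns", "_:x"),
    ("ex:Pedro", "ex:owns", "_:y"),
}

print(is_lean(G3))  # False: mapping _:y to _:x leaves one triple already in [3]
print(is_lean(G1))  # True: every non-identity mapping adds a triple not in [1]
```

With this reading, [3] collapses because the result is still made of triples
that [3] already contains. In [1] the two ex:name triples block any collapse.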

I'm afraid my language is a little awkward but I hope you can see what I
mean.

A related question:  RDF Semantics says this is lean:

[4]  <ex:a> <ex:p> _:x .
     _:x <ex:p> _:x .

But <ex:a ex:p ex:a> seems to fit the definition of proper subgraph as used
to define "lean", and semantically to satisfy [4], so [4] would not be lean.
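
The same check can be spelled out for [4]. This is a small sketch under the
assumption that a subgraph is a subset of the graph's triples, as RDF
Semantics defines it. The tuple encoding and the helper name substitute are
illustrative only.

```python
# Graph [4] and the candidate triple <ex:a ex:p ex:a>.
G4 = {
    ("ex:a", "ex:p", "_:x"),
    ("_:x", "ex:p", "_:x"),
}
candidate = {("ex:a", "ex:p", "ex:a")}


def substitute(graph, blank, target):
    """Replace one blank node by target in subject and object positions."""
    return {
        (target if s == blank else s, p, target if o == blank else o)
        for s, p, o in graph
    }


# Mapping _:x to ex:a does produce the candidate, so it is an instance of [4].
instance = substitute(G4, "_:x", "ex:a")
print(instance == candidate)  # True

# The candidate's only triple is not one of [4]'s triples.
print(candidate <= G4)        # False

# Try each term of [4] as a replacement for _:x, and keep those whose
# result is a proper subset of [4]'s triples.
witnesses = [t for t in ("_:x", "ex:a") if substitute(G4, "_:x", t) < G4]
print(witnesses)              # []
```

On that reading <ex:a ex:p ex:a> is an instance of [4] but not a subgraph of
it, since that triple does not itself appear in [4].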

Thanks,

Gregg
Received on Sunday, 27 March 2011 05:14:05 UTC