Re: Pragmatics of Blank Nodes Re: Toward easier RDF: a proposal from David Booth on 2018-12-05 (semantic-web@w3.org from December 2018)

From: David Booth <david@dbooth.org>
Date: Tue, 4 Dec 2018 22:55:13 -0500
To: semantic-web@w3.org, Pat Hayes <phayes@ihmc.us>
Message-ID: <1333a64c-bb55-6be1-afdd-47dce74dd79c@dbooth.org>
Hi Pat,

On 12/4/18 7:31 PM, Patrick J Hayes wrote:
> 
> 
>> On Dec 4, 2018, at 2:30 PM, David Booth <david@dbooth.org> wrote:
>>
>> On 12/3/18 8:29 AM, Henry Story wrote:
>>> . . .  So what are the advantages of blank nodes
>>> pragmatically? They make a description local to the graph
>>> in which they appear and this locality is maintained
>>> across merges. The meaning of URI referenced resources can
>>> be completed by external information of course but the
>>> description ensures that no further links need to be taken
>>> into account when understanding the bnode's meaning. So it
>>> looks like it's ideal for things that need to be entirely
>>> defined by description.
> 
> OR that cannot be *defined* at all, which is closer to the
> original idea. Henry, why would you assume that everything
> that can be mentioned, can also be /defined/?
>>
>> Interesting point!   That means that blank nodes enjoy a
>> form of closed world assumption (CWA),
>
> No. That is exactly the kind of mistake that one gets into
> by being too loose with words like 'define'.
>
>> in that there *cannot* be any other triples asserted
>> (directly) about a blank node, other than the ones already
>> in the document/graph/dataset at hand.  (Inference could
>> add some though.)
>
> Yes, it certainly could, if one has access to something
> like OWL.
>>
>> Of course, if we are dealing with implicit blank nodes --
>> the ones generated by [] or () notation in Turtle -- then
>> it's even more obvious that the only property connections
>> to/from that blank node are the ones provided right there
>
> Inference can add extra triples to those also.

Yes, of course.

> Suppose for example you know that the property rdf:rest is
> functional and you know that x:A rdf:rest _:x ., and someone
> tells you that
> 
> x:A rdf:rest _:y .
> _:y x:Q x:C .
> 
> then you know that _:x owl:sameAs _:y ., and hence that _:x x:Q x:C .
> 
> Now, someone might argue that such cases are vanishingly rare, or even that they shouldn’t be allowed or encouraged, but that would be a different argument.
> 
>>
>> This brings me to an interesting question.  To rephrase, the "identity" of a blank node object is determined entirely by the identities of its connected nodes, because it is guaranteed to not have any other connections.
> 
> It isn't, if we allow inferences.

Certainly we must allow inferences.  However, the results of inference 
constitute a different graph: the original graph + the entailments.
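
To make the "different graph" point concrete, here is a minimal Python
sketch of the rdf:rest example quoted above. It assumes a graph is just
a set of (subject, predicate, object) string triples and that rdf:rest
has been declared functional. The function name functional_entailments
and the tuple representation are illustrative assumptions, not part of
any RDF library, and only one substitution step is performed.

```python
# Hypothetical sketch: graphs are plain sets of (s, p, o) string triples.
def functional_entailments(graph, functional_props):
    """Return triples entailed by treating the given properties as functional."""
    inferred = set()
    # Two distinct objects of a functional property on the same subject
    # must denote the same thing.
    for (s1, p1, o1) in graph:
        for (s2, p2, o2) in graph:
            if p1 in functional_props and (s1, p1) == (s2, p2) and o1 != o2:
                inferred.add((o1, "owl:sameAs", o2))
    # Copy statements across owl:sameAs (single step, subject position only).
    for (a, _, b) in set(inferred):
        for (s, p, o) in graph:
            if s == b:
                inferred.add((a, p, o))
    return inferred - graph


original = {
    ("x:A", "rdf:rest", "_:x"),
    ("x:A", "rdf:rest", "_:y"),
    ("_:y", "x:Q", "x:C"),
}
entailed = functional_entailments(original, {"rdf:rest"})
closure = original | entailed  # a different, larger graph than `original`
```

In this sketch the triple _:x x:Q x:C appears only in closure, never in
original, so a labeling algorithm run on original sees just the triples
asserted there.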

I put "identity" in quotes above because what I mean is the identity of 
that node *within* the graph, i.e., a name that allows us to distinguish 
that node from other nodes in the graph.  I am *not* referring to "all 
information known/knowable about that node", or "the properties of the 
node", or any other grand notion of identity like that.  I am talking 
about identity in the context of blank node labeling, in which the goal 
is to have a standard algorithm for labeling each blank node.

> 
>> Therefore, a blank node labeling algorithm (or standard
>> Skolemization algorithm) only needs to take into account the
>> subgraph of that blank node's tightly connected neighbors.
>> By "tightly connected" I mean the subgraph that is connected
>> only through consecutive blank nodes.  (I think this may
>> be slightly different from the Concise Bounded Description
>> (CBD), because the CBD starts only with the *subject*
>> of a triple.)  https://www.w3.org/Submission/CBD/
>> 
>> Aiden (or someone else), is this correct?  If so, this would
>> be very beneficial, because the labeling algorithm could
>> then be guaranteed to generate the *same* label (or Skolem
>> URI) for the blank nodes in that subgraph, regardless of any
>> larger graph in which that subgraph appears.  This is very
>> pertinent to n-ary relations, because it means that blank
>> nodes for the same n-ary relation, appearing in different
>> RDF graphs, could be automatically given the *same* label (or
>> Skolem URI) -- even without knowing a key for that object.
> 
> That would be a wildly invalid conclusion. The coding of 
> an n-ary atomic sentence into binary RDF basically says
> that an 'event' (or a 'fact', or 'situation', or)  exists
> which represents the fact of the relation holding between
> the participants. So my hitting a wall with a hammer (a
> three-place relation) might be encoded as a bnode of type
> hitting with an agent being me and an object being the wall
> and the means being the hammer. But there might be a whole
> lot of hits of that wall with that hammer by me. You can't
> infer that the many bnodes which encode various assertions
> of this kind are all the same single entity with a single
> global identifier: for one thing, that would imply that I
> only hit the wall once.

No, it would imply that you hit the wall at *least* once.
Asserting the same thing multiple times does *not* imply
that it happened more than once.  It is logically equivalent
to asserting it once, right?  So if these two statement groups
appear in a graph:

   [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] .
   [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] .

then they are logically equivalent to a single (lean) statement group:

   [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] .

and hence they can share the same blank node.  Correct?  And if that 
blank node is Skolemized, then they can share the same Skolem URI.  Correct?
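
The mechanics can be sketched in Python under a strong simplifying
assumption: each blank node appears only as the subject of triples
whose predicates and objects are not blank nodes. Under that
assumption a label can be derived from a hash of the node's sorted
(predicate, object) pairs, so the two identical groups above collapse
to one node with one Skolem URI. The function names and the
example.org base are hypothetical, and this is not a general
canonicalization algorithm.

```python
import hashlib

SKOLEM_BASE = "https://example.org/.well-known/genid/"  # hypothetical base


def skolem_uri(description):
    """Derive a URI from a blank node's (predicate, object) pairs alone."""
    canonical = "\n".join(f"{p} {o}" for p, o in sorted(description))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return SKOLEM_BASE + digest


def skolemize(graph):
    """Replace each blank-node subject with a URI computed from its description."""
    descriptions = {}
    for s, p, o in graph:
        if s.startswith("_:"):
            descriptions.setdefault(s, set()).add((p, o))
    labels = {b: skolem_uri(d) for b, d in descriptions.items()}
    return {(labels.get(s, s), p, o) for s, p, o in graph}


hit = [("rdf:type", ":Hit"), (":by", ":hammer"),
       (":agent", ":pat"), (":target", ":wall")]
# Two blank nodes carrying the same description: 8 triples in total.
graph = {(b, p, o) for b in ("_:b1", "_:b2") for p, o in hit}
lean = skolemize(graph)  # both nodes map to one URI, leaving 4 triples
```

Because the label depends only on the node's own description, running
skolemize on a larger graph that contains the same group would yield
the same URI for it.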

David Booth
Received on Wednesday, 5 December 2018 03:55:36 UTC