Re: Pragmatics of Blank Nodes Re: Toward easier RDF: a proposal

> On Dec 4, 2018, at 9:55 PM, David Booth <david@dbooth.org> wrote:
> 
> Hi Pat,
> 
> On 12/4/18 7:31 PM, Patrick J Hayes wrote:
>>> On Dec 4, 2018, at 2:30 PM, David Booth <david@dbooth.org> wrote:
>>> 
>>> On 12/3/18 8:29 AM, Henry Story wrote:
>>>> . . .  So what are the advantages of blank nodes
>>>> pragmatically? They make a description local to the graph
>>>> in which they appear and this locality is maintained
>>>> across merges. The meaning of URI referenced resources can
>>>> be completed by external information of course but the
>>>> description ensures that no further links need to be taken
>>>> into account when understanding the bnode's meaning. So it
>>>> looks like it's ideal for things that need to be entirely
>>>> defined by description.
>> OR that cannot be *defined* at all, which is closer to the
>> original idea. Henry, why would you assume that everything
>> that can be mentioned, can also be /defined/?
>>> 
>>> Interesting point!   That means that blank nodes enjoy a
>>> form of closed world assumption (CWA),
>> 
>> No. That is exactly the kind of mistake that one gets into
>> by being too loose with words like 'define'.
>> 
>>> in that there *cannot* be any other triples asserted
>>> (directly) about a blank node, other than the ones already
>>> in the document/graph/dataset at hand.  (Inference could
>>> add some though.)
>> 
>> Yes, it certainly could, if one has access to something
>> like OWL.
>>> 
>>> Of course, if we are dealing with implicit blank nodes --
>>> the ones generated by [] or () notation in Turtle -- then
>>> it's even more obvious that the only property connections
>>> to/from that blank node are the ones provided right there.
>> 
>> Inference can add extra triples to those also.
> 
> Yes, of course.
> 
>> Suppose for example you know that the property rdf:rest is functional and you know that x:A rdf:rest _:x ., and someone
>> tells you that
>> x:A rdf:rest _:y .
>> _:y x:Q x:C .
>> then you know that _:x owl:sameAs _:y ., and hence that _:x x:Q x:C .
>> Now, someone might argue that such cases are vanishingly rare, or even that they shouldn’t be allowed or encouraged, but that would be a different argument.
>>> 
>>> This brings me to an interesting question.  To rephrase, the "identity" of a blank node object is determined entirely by the identities of its connected nodes, because it is guaranteed to not have any other connections.
>> It isn't, if we allow inferences.
> 
> Certainly we must allow inferences.  However, the results of inference constitute a different graph: the original graph + the entailments.
> 
> I put "identity" in quotes above because what I mean is the identity of that node *within* the graph, i.e., a name that allows us to distinguish that node from other nodes in the graph.  I am *not* referring to "all information known/knowable about that node", or "the properties of the node", or any other grand notion of identity like that.  I am talking about identity in the context of blank node labeling, in which the goal is to have a standard algorithm for labeling each blank node.
> 
>>> Therefore, a blank node labeling algorithm (or standard
>>> Skolemization algorithm) only needs to take into account the
>>> subgraph of that blank node's tightly connected neighbors.
>>> By "tightly connected" I mean the subgraph that is connected
>>> only through consecutive blank nodes.  (I think this may
>>> be slightly different from the Concise Bounded Description
>>> (CBD), because the CBD starts only with the *subject*
>>> of a triple.)  https://www.w3.org/Submission/CBD/
>>> Aiden (or someone else), is this correct?  If so, this would
>>> be very beneficial, because the labeling algorithm could
>>> then be guaranteed to generate the *same* label (or Skolem
>>> URI) for the blank nodes in that subgraph, regardless of any
>>> larger graph in which that subgraph appears.  This is very
>>> pertinent to n-ary relations, because it means that blank
>>> nodes for the same n-ary relation, appearing in different
>>> RDF graphs, could be automatically given the *same* label (or
>>> Skolem URI) -- even without knowing a key for that object.
>> That would be a wildly invalid conclusion. The coding of an n-ary atomic sentence into binary RDF basically says
>> that an 'event' (or a 'fact', or 'situation', or)  exists
>> which represents the fact of the relation holding between
>> the participants. So my hitting a wall with a hammer (a
>> three-place relation) might be encoded as a bnode of type
>> hitting with an agent being me and an object being the wall
>> and the means being the hammer. But there might be a whole
>> lot of hits of that wall with that hammer by me. You can't
>> infer that the many bnodes which encode various assertions
>> of this kind are all the same single entity with a single
>> global identifier: for one thing, that would imply that I
>> only hit the wall once.
> 
> No, it would imply that you hit the wall at *least* once.
> Asserting the same thing multiple times does *not* imply
> that it happened more than once.  It is logically equivalent
> to asserting it once, right?  So if these two statement groups
> appear in a graph:
> 
>  [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] .
>  [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] .
> 
> then they are logically equivalent to a single (lean) statement group:
> 
>  [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] .
> 
> and hence they can share the same blank node.  Correct?

Yes, you are absolutely right. And I was wrong, above. (I bow graciously and remove my hat.)  Though if you have both copies, which is what ‘share’ suggests, then your graph is still non-lean. It would be better to just keep one copy, and have a lean graph. 
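
As a rough illustration of the lean versus non-lean distinction, the sketch below collapses blank nodes whose descriptions are identical, which is the situation in the two :Hit statement groups quoted above. It is a simplified special case and not a general leaning algorithm: it assumes blank nodes occur only as subjects, that all predicates and objects are ground terms, and that terms are plain strings with blank nodes written as "_:label". The function name merge_duplicate_bnodes and the labels _:b1 and _:b2 are hypothetical.

```python
from collections import defaultdict


def merge_duplicate_bnodes(triples):
    """Keep one blank node per distinct set of (predicate, object) pairs.

    Assumption: blank nodes appear only in subject position; anything
    else is outside the scope of this sketch and raises ValueError.
    """
    descriptions = defaultdict(set)
    for s, p, o in triples:
        if o.startswith("_:"):
            raise ValueError("blank-node objects are outside this sketch")
        if s.startswith("_:"):
            descriptions[s].add((p, o))

    # Map each distinct description to one representative label
    # (the alphabetically first label, so the result is deterministic).
    representative = {}
    for label in sorted(descriptions):
        representative.setdefault(frozenset(descriptions[label]), label)
    survivors = set(representative.values())

    # Drop every triple whose blank-node subject duplicates a survivor.
    return {
        (s, p, o)
        for s, p, o in triples
        if not s.startswith("_:") or s in survivors
    }


# The two statement groups from the example, with explicit labels.
graph = {
    ("_:b1", "rdf:type", ":Hit"), ("_:b1", ":by", ":hammer"),
    ("_:b1", ":agent", ":pat"), ("_:b1", ":target", ":wall"),
    ("_:b2", "rdf:type", ":Hit"), ("_:b2", ":by", ":hammer"),
    ("_:b2", ":agent", ":pat"), ("_:b2", ":target", ":wall"),
}
lean = merge_duplicate_bnodes(graph)
print(len(graph), len(lean))  # 8 4
```

The eight-triple graph and the four-triple result say the same thing; only the second is lean.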

>  And if that blank node is Skolemized, then they can share the same Skolem URI.  Correct?

Yes, with the same comment about ‘share’.
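
One way to picture a shared Skolem URI is to derive it from the blank node's description, so that the same description yields the same URI in whichever graph it appears. The sketch below does this with a hash. It is an illustration under assumptions, not the labeling algorithm discussed in this thread and not a standard canonicalization: it only handles a blank node described entirely by ground (predicate, object) pairs, and the function name skolem_iri and the example.org base are hypothetical (the /.well-known/genid/ path follows the suggestion in RDF 1.1 Concepts).

```python
import hashlib


def skolem_iri(description, base="https://example.org/.well-known/genid/"):
    """Derive a Skolem IRI from a set of ground (predicate, object) pairs.

    Assumption: the description contains no other blank nodes, so sorting
    the pairs is enough to obtain an order-independent serialization.
    """
    canonical = "\n".join(f"{p} {o}" for p, o in sorted(set(description)))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return base + digest


# The same description, listed in a different order in two graphs.
hit_in_graph_a = [("rdf:type", ":Hit"), (":by", ":hammer"),
                  (":agent", ":pat"), (":target", ":wall")]
hit_in_graph_b = [(":target", ":wall"), (":agent", ":pat"),
                  (":by", ":hammer"), ("rdf:type", ":Hit")]

print(skolem_iri(hit_in_graph_a) == skolem_iri(hit_in_graph_b))  # True
```

Blank nodes connected to other blank nodes need a real canonical labeling step, which this sketch deliberately leaves out.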

Pat

> 
> David Booth
> 

Received on Wednesday, 5 December 2018 04:14:22 UTC