- From: Henry Story <henry.story@bblfish.net>
- Date: Thu, 6 Dec 2018 11:53:32 +0100
- To: Andy Seaborne <andy@seaborne.org>
- Cc: Semantic Web <semantic-web@w3.org>
> On 5 Dec 2018, at 19:28, Andy Seaborne <andy@seaborne.org> wrote:
>
>
> On 05/12/2018 04:13, Patrick J Hayes wrote:
>>> On Dec 4, 2018, at 9:55 PM, David Booth <david@dbooth.org> wrote:
>>>
>>> Hi Pat,
>>>
>>> On 12/4/18 7:31 PM, Patrick J Hayes wrote:
>>>>> On Dec 4, 2018, at 2:30 PM, David Booth <david@dbooth.org> wrote:
>>>>>
>>>>> On 12/3/18 8:29 AM, Henry Story wrote:
>>>>>> . . . So what are the advantages of blank nodes
>>>>>> pragmatically? They make a description local to the graph
>>>>>> in which they appear and this locality is maintained
>>>>>> across merges. The meaning of URI referenced resources can
>>>>>> be completed by external information of course but the
>>>>>> description ensures that no further links need to be taken
>>>>>> into account when understanding the bnode's meaning. So it
>>>>>> looks like it's ideal for things that need to be entirely
>>>>>> defined by description.
>>>> OR that cannot be *defined* at all, which is closer to the
>>>> original idea. Henry, why would you assume that everything
>>>> that can be mentioned, can also be /defined/?
>>>>>
>>>>> Interesting point! That means that blank nodes enjoy a
>>>>> form of closed world assumption (CWA),
>>>>
>>>> No. That is exactly the kind of mistake that one gets into
>>>> by being too loose with words like 'define'.
>>>>
>>>>> in that there *cannot* be any other triples asserted
>>>>> (directly) about a blank node, other than the ones already
>>>>> in the document/graph/dataset at hand. (Inference could
>>>>> add some though.)
>>>>
>>>> Yes, it certainly could, if one has access to something
>>>> like OWL.
>>>>>
>>>>> Of course, if we are dealing with implicit blank nodes --
>>>>> the ones generated by [] or () notation in Turtle -- then
>>>>> it's even more obvious that the only property connections
>>>>> to/from that blank node are the ones provided right there
>>>>
>>>> Inference can add extra triples to those also.
>>>
>>> Yes, of course.
>>>
>>>> Suppose for example you know that the property rdf:rest is functional and you know that x:A rdf:rest _:x ., and someone
>>>> tells you that
>>>> x:A rdf:rest _:y .
>>>> _:y x:Q x:C .
>>>> then you know that _:x owl:sameAs _:y ., and hence that _:x x:Q x:C .
>>>> Now, someone might argue that such cases are vanishingly rare, or even that they shouldn’t be allowed or encouraged, but that would be a different argument.
>>>>>
>>>>> This brings me to an interesting question. To rephrase, the "identity" of a blank node object is determined entirely by the identities of its connected nodes, because it is guaranteed to not have any other connections.
>>>> It isn't, if we allow inferences.
>>>
>>> Certainly we must allow inferences. However, the results of inference constitute a different graph: the original graph + the entailments.
>>>
>>> I put "identity" in quotes above because what I mean is the identity of that node *within* the graph, i.e., a name that allows us to distinguish that node from other nodes in the graph. I am *not* referring to "all information known/knowable about that node", or "the properties of the node", or any other grand notion of identity like that. I am talking about identity in the context of blank node labeling, in which the goal is to have a standard algorithm for labeling each blank node.
>>>
>>>>> Therefore, a blank node labeling algorithm (or standard
>>>>> Skolemization algorithm) only needs to take into account the
>>>>> subgraph of that blank node's tightly connected neighbors.
>>>>> By "tightly connected" I mean the subgraph that is connected
>>>>> only through consecutive blank nodes. (I think this may
>>>>> be slightly different from the Concise Bounded Description
>>>>> (CBD), because the CBD starts only with the *subject*
>>>>> of a triple.) https://www.w3.org/Submission/CBD/
>>>>> Aiden (or someone else), is this correct?
>>>>> If so, this would
>>>>> be very beneficial, because the labeling algorithm could
>>>>> then be guaranteed to generate the *same* label (or Skolem
>>>>> URI) for the blank nodes in that subgraph, regardless of any
>>>>> larger graph in which that subgraph appears. This is very
>>>>> pertinent to n-ary relations, because it means that blank
>>>>> nodes for the same n-ary relation, appearing in different
>>>>> RDF graphs, could be automatically given the *same* label (or
>>>>> Skolem URI) -- even without knowing a key for that object.
>>>> That would be a wildly invalid conclusion. The coding of an n-ary atomic sentence into binary RDF basically says
>>>> that an 'event' (or a 'fact', or 'situation', or) exists
>>>> which represents the fact of the relation holding between
>>>> the participants. So my hitting a wall with a hammer (a
>>>> three-place relation) might be encoded as a bnode of type
>>>> hitting with an agent being me and an object being the wall
>>>> and the means being the hammer. But there might be a whole
>>>> lot of hits of that wall with that hammer by me. You can't
>>>> infer that the many bnodes which encode various assertions
>>>> of this kind are all the same single entity with a single
>>>> global identifier: for one thing, that would imply that I
>>>> only hit the wall once.
>>>
>>> No, it would imply that you hit the wall at *least* once.
>>> Asserting the same thing multiple times does *not* imply
>>> that it happened more than once. It is logically equivalent
>>> to asserting it once, right? So if these two statement groups
>>> appear in a graph:
>>>
>>> [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] .
>>> [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] .
>>>
>>> then they are logically equivalent to a single (lean) statement group:
>>>
>>> [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] .
>>>
>>> and hence they can share the same blank node. Correct?
>
> lean graphs are all very well until update happens.
> New information arrives that breaks the equivalence.
>
> For an "easier RDF", talking about how the graph is built seems quite natural.
>
> Leaning has a place at the point of publishing (maybe).

Why could the RDF library not implement bnodes as a triple

   type BNode = GraphId × LocalNodeId × Lean

which could of course be done efficiently with

   type GraphId = Long
   type LocalNodeId = Int or Long
   type Lean = Boolean

where Lean would be a flag recording that the node was calculated as lean, as described (as I understand it) by the algorithms detailed in "Everything you always wanted to know about blank nodes"
https://www.sciencedirect.com/science/article/pii/S1570826814000481 ?

>
>> Yes, you are absolutely right. And I was wrong, above. (I bow graciously and remove my hat.) Though if you have both copies, which is what ‘share’ suggests, then your graph is still non-lean. It would be better to just keep one copy, and have a lean graph.
>>> And if that blank node is Skolemized, then they can share the same Skolem URI. Correct?
>> Yes, with same comment about ‘share’.
>> Pat
>>>
>>> David Booth
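
To make the BNode = GraphId × LocalNodeId × Lean suggestion above a little more concrete, here is a rough Python sketch. It is only an illustration under assumptions of my own, not how any existing RDF library works: the names BNode, merge, lean_simple and hit are hypothetical, triples are plain tuples of strings and BNode values, and lean_simple is a deliberately naive stand-in for real leaning. It only collapses blank nodes that occur solely as subjects and have identical descriptions made of non-blank terms, which is enough for the two :Hit statement groups discussed in the thread but is not the algorithm from the paper cited above.

```python
from dataclasses import dataclass, field, replace
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class BNode:
    graph_id: int   # GraphId in the message (assumed to be an integer)
    local_id: int   # LocalNodeId in the message
    # The Lean flag is metadata only: it takes no part in equality or hashing.
    lean: bool = field(default=False, compare=False)


Term = Union[str, BNode]
Triple = Tuple[Term, str, Term]
Graph = FrozenSet[Triple]


def merge(*graphs: Graph) -> Graph:
    """Union of graphs; bnodes stay apart because graph_id is part of their identity."""
    merged = set()
    for g in graphs:
        merged |= g
    return frozenset(merged)


def lean_simple(graph: Graph) -> Graph:
    """Naive leaning (an assumption of this sketch, not the published algorithm):
    collapse bnodes that are never objects and whose descriptions are identical
    sets of (predicate, non-bnode object) pairs."""
    objects = {o for _, _, o in graph if isinstance(o, BNode)}
    by_subject = {}
    for s, p, o in graph:
        by_subject.setdefault(s, set()).add((p, o))

    result = set()
    representative = {}  # description -> chosen BNode
    for s, desc in by_subject.items():
        simple = (
            isinstance(s, BNode)
            and s not in objects
            and not any(isinstance(o, BNode) for _, o in desc)
        )
        if not simple:
            # Leave anything outside the simple case untouched.
            result.update((s, p, o) for p, o in desc)
            continue
        key = frozenset(desc)
        best = representative.get(key)
        if best is None or (s.graph_id, s.local_id) < (best.graph_id, best.local_id):
            representative[key] = s

    for key, node in representative.items():
        flagged = replace(node, lean=True)  # record that this node survived leaning
        result.update((flagged, p, o) for p, o in key)
    return frozenset(result)


def hit(graph_id: int) -> Graph:
    """The [ a :Hit ; :by :hammer ; :agent :pat ; :target :wall ] group from the thread."""
    b = BNode(graph_id, 0)
    return frozenset({
        (b, "rdf:type", ":Hit"),
        (b, ":by", ":hammer"),
        (b, ":agent", ":pat"),
        (b, ":target", ":wall"),
    })


merged = merge(hit(1), hit(2))   # two distinct bnodes, eight triples
leaned = lean_simple(merged)     # one bnode, four triples, flagged lean
```

On Andy's point about updates: if one of the two groups later gains an extra triple, the descriptions are no longer identical and this sketch keeps both nodes. A full leaning algorithm would also have to consider whether one description is subsumed by the other, which this sketch does not attempt.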
Received on Thursday, 6 December 2018 10:53:59 UTC