Re: Blank nodes must DIE! [ was Re: Blank nodes semantics - existential variables?]

Hi Anthony,

On 2020-07-02 17:56, Anthony Moretti wrote:
> Hi Aidan
> 
> I think you'll end up with false negatives that way though. I think 
> comparison operations for value types need to be type dependent.
> 
> Example 1:
> 
>      id: _:fraction1
>      type: Fraction
>      numerator: 1
>      denominator: 2
> 
>      id: _:fraction2
>      type: Fraction
>      numerator: 2
>      denominator: 4
> 
>      The algorithm will return different IDs, but they're the same /value/.
> 
> Example 2:
> 
>      id: _:fraction1
>      type: Fraction
>      numerator: 1
>      denominator: 2
>      batteryPercentageOf: laptop1
> 
>      id: _:fraction2
>      type: Fraction
>      numerator: 1
>      denominator: 2
>      batteryPercentageOf: laptop2
> 
>      Again, the algorithm will return different IDs, but they're the 
> same value.
> 
> Something that I think might assist in this area would be if mainstream 
> value types had accompanying comparison operations.

I agree regarding Example 1. In Example 2, I think that _:fraction1 and 
_:fraction2 are different things (they are readings for different 
laptops; I would not say, for example, that two people are the same 
because they share the same date of birth).

I think the general problem you refer to resides at a different level 
and is not really related to blank nodes. Note that if you use IRIs:

 id: :fraction1
 type: Fraction
 numerator: 1
 denominator: 2

 id: :fraction2
 type: Fraction
 numerator: 2
 denominator: 4

You end up with the same issue: :fraction1 and :fraction2 are in some 
sense related, arguably owl:sameAs, but are not recognised as such 
"automatically". There is no way to resolve this at the RDF level, nor, 
I believe, should there be, as it would over-encumber RDF. Someone, 
somewhere, has to either (1) define what makes two things "the same 
value", or (2) provide lots of examples of things that are "the same 
value" over which supervised learning can be applied (or (3) perhaps 
both). There is lots of machinery for (1) in OWL, for example, though 
your precise example would not be covered, as OWL's support for 
arithmetic is limited.
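
To make option (1) concrete, here is a minimal sketch in Python of what 
an externally defined, type-specific "same value" rule might look like 
for the Fraction example. The dictionary layout and the function names 
(fraction_value, same_value) are purely illustrative assumptions of 
mine, not part of RDF, OWL or any existing library; only 
fractions.Fraction comes from the Python standard library.

```python
from fractions import Fraction

# Hypothetical in-memory descriptions mirroring the IRI example above.
nodes = {
    ":fraction1": {"type": "Fraction", "numerator": 1, "denominator": 2},
    ":fraction2": {"type": "Fraction", "numerator": 2, "denominator": 4},
}


def fraction_value(description):
    """Type-specific rule: reduce a Fraction description to its numeric value."""
    return Fraction(description["numerator"], description["denominator"])


def same_value(a, b):
    """Two Fraction descriptions are 'the same value' if they reduce to the same ratio."""
    return fraction_value(a) == fraction_value(b)


# The identifiers differ, and nothing at the RDF level relates them ...
ids_equal = ":fraction1" == ":fraction2"
# ... but the externally defined rule recognises the shared value.
values_equal = same_value(nodes[":fraction1"], nodes[":fraction2"])

print(ids_equal, values_equal)
```

The point of the sketch is only that the comparison lives outside the 
graph model: somebody has to supply a rule like same_value per type, 
and whether its result should then be asserted as owl:sameAs is a 
separate modelling decision.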

Best,
Aidan


> 
> On Thu, Jul 2, 2020 at 10:37 AM Aidan Hogan <aidhog@gmail.com> wrote:
> 
>     On 2020-07-01 9:09, Dieter Fensel wrote:
>      > Actually skolemization is quite an old concept of computational logic
>      > (I guess older than most of us) to deal with existential variables.
>      > Unfortunately it comes along with the typical closed world assumptions
>      > of logic assuming that you know all terms and can safely generate a
>      > unique new term. In the open and dynamic environment of the web this
>      > may cause problems. What happens if two people use the same
>      > skolemization generator and their stuff gets merged?
> 
>     A few people (including myself) have tried to address that question from
>     different perspectives, and a promising proposal is to compute a
>     deterministic skolem (IRI) for each blank node based on the information
>     connected to it (possibly recursively through other blank nodes).
> 
>     The idea is that the algorithm will assign the same skolem to two blank
>     nodes if and only if they have precisely the same surrounding
>     information in the local RDF graph (recalling that blank nodes are local
>     to an RDF graph, we can know that we have the full information available
>     within a single graph for each particular blank node).
> 
>     Formally the skolemisation algorithm preserves RDF isomorphism. Two RDF
>     graphs are isomorphic if and only if their skolemised versions are
>     precisely equal (which implies isomorphism).
> 
>     So this gets rid of the problems with clashes, creating skolem IDs
>     deterministically. It also means that if two parsers parse the same RDF
>     document into two graphs, but each parser assigns different labels to
>     the blank nodes, a canonical algorithm will subsequently assign the
>     corresponding blank nodes in both graphs the same skolem.
> 
>     There are some minor caveats to the process (e.g., there might be a hash
>     collision, the worst cases are exponential), but in practice it works
>     well (e.g., the probability of there being a hash collision for 256 or 512
>     bit hashes is astronomically lower than the probability of a global
>     catastrophe that would render the hash collision moot, especially now
>     that we are in 2020, and the worst cases are exceedingly exotic).
> 
>     Some further reading for RDF graphs (including algorithms to further
>     preserve RDF equivalence):
> 
>     http://aidanhogan.com/docs/rdf-canonicalisation.pdf
> 
>     An implementation of the algorithm in Java:
> 
>     https://blabel.github.io/
> 
>     Work towards a spec along similar lines (including RDF datasets):
> 
>     https://json-ld.github.io/normalization/spec/
>     https://github.com/iherman/canonical_rdf
> 
> 

Received on Friday, 3 July 2020 20:10:38 UTC