Re: Blank nodes must DIE! [ was Re: Blank nodes semantics - existential variables?]

Hi Aidan

I think you'll end up with false negatives that way, though: blank nodes
that denote the same value can still be assigned different skolem IDs. I
think comparison operations for value types need to be type-dependent.

Example 1:

    id: _:fraction1
    type: Fraction
    numerator: 1
    denominator: 2

    id: _:fraction2
    type: Fraction
    numerator: 2
    denominator: 4

    The algorithm will return different IDs, but they're the same *value*.
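
A minimal Python sketch of Example 1. The skolem_id function here is a
hypothetical, naive stand-in (a SHA-256 hash over a node's sorted
property/value pairs); it is an assumption for illustration only and not
the canonicalisation algorithm discussed in this thread. The value
comparison uses Python's standard fractions.Fraction.

```python
import hashlib
from fractions import Fraction


def skolem_id(properties):
    """Naive stand-in: hash the sorted (property, value) pairs of a blank node."""
    canonical = "|".join(f"{key}={value}" for key, value in sorted(properties.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# The two blank nodes from Example 1, as plain dictionaries.
fraction1 = {"type": "Fraction", "numerator": 1, "denominator": 2}
fraction2 = {"type": "Fraction", "numerator": 2, "denominator": 4}

# Structure-based IDs differ, because the surrounding data differs...
ids_differ = skolem_id(fraction1) != skolem_id(fraction2)

# ...but a type-aware comparison sees the same value (1/2 == 2/4).
values_equal = (
    Fraction(fraction1["numerator"], fraction1["denominator"])
    == Fraction(fraction2["numerator"], fraction2["denominator"])
)
```

Any ID derived purely from the node's structure would behave the same way
here, since the numerator and denominator literals differ.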

Example 2:

    id: _:fraction1
    type: Fraction
    numerator: 1
    denominator: 2
    batteryPercentageOf: laptop1

    id: _:fraction2
    type: Fraction
    numerator: 1
    denominator: 2
    batteryPercentageOf: laptop2

    Again, the algorithm will return different IDs, but they're the same
    value.

Something that I think might assist in this area would be if mainstream
value types had accompanying comparison operations.
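
One possible shape for such comparison operations, sketched in Python. The
VALUE_OF registry and the same_value function are hypothetical names
introduced for illustration, not an existing API. The assumption is that
each value type declares how to reduce a node to a comparable value, and
that properties outside that value (such as batteryPercentageOf) are
ignored.

```python
from fractions import Fraction

# Hypothetical registry: type name -> function that extracts a comparable
# value from a node. Only Fraction is registered in this sketch.
VALUE_OF = {
    "Fraction": lambda node: Fraction(node["numerator"], node["denominator"]),
}


def same_value(a, b):
    """Return True if both nodes have the same registered type and equal values."""
    if a["type"] != b["type"] or a["type"] not in VALUE_OF:
        return False
    value_of = VALUE_OF[a["type"]]
    return value_of(a) == value_of(b)


# Example 1: different numerator/denominator, same value.
half = {"type": "Fraction", "numerator": 1, "denominator": 2}
two_quarters = {"type": "Fraction", "numerator": 2, "denominator": 4}

# Example 2: same fraction attached to different laptops.
battery1 = {"type": "Fraction", "numerator": 1, "denominator": 2,
            "batteryPercentageOf": "laptop1"}
battery2 = {"type": "Fraction", "numerator": 1, "denominator": 2,
            "batteryPercentageOf": "laptop2"}

example1_same = same_value(half, two_quarters)
example2_same = same_value(battery1, battery2)
```

With a registry like this, both examples above compare as equal values even
though a structure-based ID would tell them apart.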

Regards
Anthony


On Thu, Jul 2, 2020 at 10:37 AM Aidan Hogan <aidhog@gmail.com> wrote:

> On 2020-07-01 9:09, Dieter Fensel wrote:
> > Actually skolemization is quite an old concept of computational logic (I
> > guess older than most of us) to deal with existential variables.
> > Unfortunately it comes along with the typical closed world assumptions
> > of logic assuming that you know all terms and can safely generate a
> > unique new term. In the open and dynamic environment of the web this may
> > cause problems. What happens if two people use the same skolemization
> > generator and their stuff gets merged?
>
> A few people (including myself) have tried to address that question from
> different perspectives, and a promising proposal is to compute a
> deterministic skolem (IRI) for each blank node based on the information
> connected to it (possibly recursively through other blank nodes).
>
> The idea is that the algorithm will assign the same skolem to two blank
> nodes if and only if they have precisely the same surrounding
> information in the local RDF graph (recalling that blank nodes are local
> to an RDF graph, we can know that we have the full information available
> within a single graph for each particular blank node).
>
> Formally the skolemisation algorithm preserves RDF isomorphism. Two RDF
> graphs are isomorphic if and only if their skolemised versions are
> precisely equal (which implies isomorphism).
>
> So this gets rid of the problems with clashes, creating skolem IDs
> deterministically. It also means that if two parsers parse the same RDF
> document into two graphs, but each parser assigns different labels to
> the blank nodes, a canonical algorithm will subsequently assign the
> corresponding blank nodes in both graphs the same skolem.
>
> There are some minor caveats to the process (e.g., there might be a hash
> collision, the worst cases are exponential), but in practice it works
> well (e.g., the probability of there being a hash collision for 256 or 512
> bit hashes is astronomically lower than the probability of a global
> catastrophe that would render the hash collision moot especially now
> that we are in 2020, and the worst-cases are exceedingly exotic).
>
> Some further reading for RDF graphs (including algorithms to further
> preserve RDF equivalence):
>
>         http://aidanhogan.com/docs/rdf-canonicalisation.pdf
>
> An implementation of the algorithm in Java:
>
>         https://blabel.github.io/
>
> Work towards a spec along similar lines (including RDF datasets):
>
>         https://json-ld.github.io/normalization/spec/
>         https://github.com/iherman/canonical_rdf
>

Received on Thursday, 2 July 2020 21:56:36 UTC