- From: Anthony Moretti <anthony.moretti@gmail.com>
- Date: Thu, 2 Jul 2020 14:56:10 -0700
- To: Aidan Hogan <aidhog@gmail.com>
- Cc: Semantic Web <semantic-web@w3.org>
- Message-ID: <CACusdfSP1hR-vbbZveo60RRK1TvjiZXkT2uQd4-rnq4r=e6CvQ@mail.gmail.com>
Hi Aidan
I think you'll end up with false negatives that way though. I think
comparison operations for value types need to be type dependent.
Example 1:
id: _:fraction1
type: Fraction
numerator: 1
denominator: 2

id: _:fraction2
type: Fraction
numerator: 2
denominator: 4
The algorithm will return different IDs, but they're the same *value*.
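As a minimal sketch of what a type-dependent comparison could look like for Example 1: Python's standard `fractions.Fraction` reduces to lowest terms, so 1/2 and 2/4 compare equal. The dictionary layout and the `fraction_value` helper are illustrative assumptions only, not part of any skolemisation algorithm.

```python
from fractions import Fraction

# Hypothetical in-memory form of the two blank nodes from Example 1.
fraction1 = {"id": "_:fraction1", "type": "Fraction", "numerator": 1, "denominator": 2}
fraction2 = {"id": "_:fraction2", "type": "Fraction", "numerator": 2, "denominator": 4}


def without_id(node):
    """Return the node's properties, ignoring its local blank node label."""
    return {k: v for k, v in node.items() if k != "id"}


def fraction_value(node):
    """Return the value a Fraction node denotes, reduced to lowest terms."""
    return Fraction(node["numerator"], node["denominator"])


# A purely structural comparison sees different numerators and denominators.
structurally_equal = without_id(fraction1) == without_id(fraction2)

# A type-dependent comparison sees the same value.
value_equal = fraction_value(fraction1) == fraction_value(fraction2)
```

Here `structurally_equal` is false while `value_equal` is true, which is the false negative I mean.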
Example 2:
id: _:fraction1
type: Fraction
numerator: 1
denominator: 2
batteryPercentageOf: laptop1

id: _:fraction2
type: Fraction
numerator: 1
denominator: 2
batteryPercentageOf: laptop2
Again, the algorithm will return different IDs, but they're the same
value.
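To make Example 2 concrete, here is a rough sketch in which `naive_skolem` is a deliberately simplified stand-in for "hash the information connected to the node". It is not the algorithm from the paper or from blabel, and the dictionary layout is an assumption. The point is only that any ID derived from all of the surrounding information will differ here, while the value does not.

```python
import hashlib
from fractions import Fraction


def naive_skolem(node):
    """Simplified stand-in: hash every property of the node except its local id."""
    pairs = sorted((key, str(value)) for key, value in node.items() if key != "id")
    return hashlib.sha256(repr(pairs).encode("utf-8")).hexdigest()


def fraction_value(node):
    """Value of a Fraction node, using only the properties that define the value."""
    return Fraction(node["numerator"], node["denominator"])


# Hypothetical in-memory form of the two blank nodes from Example 2.
fraction1 = {"id": "_:fraction1", "type": "Fraction",
             "numerator": 1, "denominator": 2, "batteryPercentageOf": "laptop1"}
fraction2 = {"id": "_:fraction2", "type": "Fraction",
             "numerator": 1, "denominator": 2, "batteryPercentageOf": "laptop2"}

# The extra batteryPercentageOf property changes the hash input.
ids_differ = naive_skolem(fraction1) != naive_skolem(fraction2)

# The value itself is identical.
values_equal = fraction_value(fraction1) == fraction_value(fraction2)
```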
Something that I think might assist in this area would be if mainstream
value types had accompanying comparison operations.
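One possible shape for this, as a hedged sketch: a registry that maps a type name to a function producing a comparable key for that type. The names `VALUE_KEYS` and `same_value` are hypothetical, and the fallback to structural comparison for unregistered types is just one design choice among several.

```python
from fractions import Fraction

# Hypothetical registry: type name -> function mapping a node to a comparable key.
VALUE_KEYS = {
    "Fraction": lambda node: Fraction(node["numerator"], node["denominator"]),
}


def without_id(node):
    """Return the node's properties, ignoring its local blank node label."""
    return {k: v for k, v in node.items() if k != "id"}


def same_value(a, b):
    """Compare two nodes using the comparison registered for their type."""
    if a["type"] != b["type"]:
        return False
    key = VALUE_KEYS.get(a["type"])
    if key is None:
        # No type-specific comparison is known, so fall back to comparing
        # all properties except the id.
        return without_id(a) == without_id(b)
    return key(a) == key(b)


half = {"id": "_:a", "type": "Fraction", "numerator": 1, "denominator": 2}
two_quarters = {"id": "_:b", "type": "Fraction", "numerator": 2, "denominator": 4}
fractions_match = same_value(half, two_quarters)
```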
Regards
Anthony
On Thu, Jul 2, 2020 at 10:37 AM Aidan Hogan <aidhog@gmail.com> wrote:
> On 2020-07-01 9:09, Dieter Fensel wrote:
> > Actually skolemization is quite an old concept of computational logic (I
> > guess older than most of us) to deal with existential variables.
> > Unfortunately it comes along with the typical closed world assumptions
> > of logic assuming that you know all terms and can safely generate a
> > unique new term. In the open and dynamic environment of the web this may
> > cause problems. What happens if two people use the same skolemization
> > generator and their stuff gets merged?
>
> A few people (including myself) have tried to address that question from
> different perspectives, and a promising proposal is to compute a
> deterministic skolem (IRI) for each blank node based on the information
> connected to it (possibly recursively through other blank nodes).
>
> The idea is that the algorithm will assign the same skolem to two blank
> nodes if and only if they have precisely the same surrounding
> information in the local RDF graph (recalling that blank nodes are local
> to an RDF graph, we can know that we have the full information available
> within a single graph for each particular blank node).
>
> Formally the skolemisation algorithm preserves RDF isomorphism. Two RDF
> graphs are isomorphic if and only if their skolemised versions are
> precisely equal (which implies isomorphism).
>
> So this gets rid of the problems with clashes, creating skolem IDs
> deterministically. It also means that if two parsers parse the same RDF
> document into two graphs, but each parser assigns different labels to
> the blank nodes, a canonical algorithm will subsequently assign the
> corresponding blank nodes in both graphs the same skolem.
>
> There are some minor caveats to the process (e.g., there might be a hash
> collision, the worst cases are exponential), but in practice it works
> well (e.g., the probability of there being a hash collision for 256 or 512
> bit hashes is astronomically lower than the probability of a global
> catastrophe that would render the hash collision moot especially now
> that we are in 2020, and the worst-cases are exceedingly exotic).
>
> Some further reading for RDF graphs (including algorithms to further
> preserve RDF equivalence):
>
> http://aidanhogan.com/docs/rdf-canonicalisation.pdf
>
> An implementation of the algorithm in Java:
>
> https://blabel.github.io/
>
> Work towards a spec along similar lines (including RDF datasets):
>
> https://json-ld.github.io/normalization/spec/
> https://github.com/iherman/canonical_rdf
>
>
>
Received on Thursday, 2 July 2020 21:56:36 UTC