Re: RDF Dataset Canonicalization - Formal Proof

Dear all,

In an effort to verify the correctness of the RDF canonicalization algorithm (URDNA2015), Digital Bazaar hired my partner and myself to review it. After examining the proof for several weeks, we wrote a short report on our findings.  We attach it under the hopes that our notes may be useful while the proof is being scrutinized.

Very briefly, our conclusions were:
  (*) We found no substantive errors in the proof.
  (*) Hash collisions can lead to failures of canonicalization, but this is more of a nuisance than a real problem (assuming a strong hash function).
  (*) It is possible to construct small graphs for which URDNA2015 takes a huge amount of time to canonicalize, although these graphs appear unlikely to occur in real-world RDF problems.

As we are unaccustomed to dealing with the W3C, we add a brief note on ourselves: We are both mathematicians by training and have some familiarity with the graph isomorphism problem (although we were previously unfamiliar with the RDF format).  Anyone curious about our background can find out more here:
     Bill Bradley (me): https://www.linkedin.com/in/bill-bradley-a1614314/
     Karl Knaub: https://www.linkedin.com/in/karl-knaub-ab10701/


Thank you,
Bill

Received on Friday, 2 April 2021 12:01:20 UTC