Re: RDF Dataset Canonicalization - Formal Proof from Bill Bradley on 2021-04-01 (public-credentials@w3.org from April 2021)

From: Bill Bradley <billbradley@mirabolic.net>
Date: Thu, 1 Apr 2021 17:12:00 -0400
To: public-credentials@w3.org
Message-Id: <C842D9E0-BF17-4F9E-B289-2E1F787730D9@mirabolic.net>

Dear all,

In an effort to verify the correctness of the RDF canonicalization algorithm (URDNA2015), Digital Bazaar hired my partner and myself to review it. After examining the proof for several weeks, we wrote a short report on our findings.  We attach it under the hopes that our notes may be useful while the proof is being scrutinized.

Very briefly, our conclusions were:
  (*) We found no substantive errors in the proof.
  (*) Hash collisions can lead to failures of canonicalization, but this is more of a nuisance than a real problem (assuming a strong hash function).
  (*) It is possible to construct small graphs for which URDNA2015 takes a huge amount of time to canonicalize, although these graphs appear unlikely to occur in real-world RDF problems.

As we are unaccustomed to dealing with the W3C, we add a brief note on ourselves: We are both mathematicians by training and have some familiarity with the graph isomorphism problem (although we were previously unfamiliar with the RDF format).  Anyone curious about our background can find out more here:
     Bill Bradley (me): https://www.linkedin.com/in/bill-bradley-a1614314/
     Karl Knaub: https://www.linkedin.com/in/karl-knaub-ab10701/


Thank you,
Bill

Attachments

text/html attachment: stored
application/pdf attachment: Mirabolic_Graph_Iso_Report_2020_10_19.pdf
text/html attachment: stored

Received on Friday, 2 April 2021 12:01:20 UTC