Re: RDF Dataset Canonicalization - Formal Proof

It is possible to produce a digest without canonicalization by using a set
hash, which I see you mention in your paper.  I haven't read the whole
thing yet, so I don't know why you need canonicalization.  See
https://www.hpl.hp.com/techreports/2003/HPL-2003-235R1.pdf.
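
The rough idea is to hash each statement separately and combine the
per-statement digests with a commutative operation, so the resulting digest
does not depend on the order of the statements.  A minimal sketch in Python
(illustrative only -- this is not the construction from the report, and a
plain XOR combiner like this is not collision resistant):

    import hashlib

    def set_hash(statements):
        """Order-independent digest: hash each statement, XOR the digests."""
        combined = 0
        for s in statements:
            digest = hashlib.sha256(s.encode("utf-8")).digest()
            combined ^= int.from_bytes(digest, "big")
        return combined.to_bytes(32, "big").hex()

    # The same statements in any order give the same digest.
    quads = ['<urn:a> <urn:b> <urn:c> .', '<urn:a> <urn:d> "hello" .']
    assert set_hash(quads) == set_hash(list(reversed(quads)))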

--------------
Alan Karp


On Sat, Mar 27, 2021 at 7:59 AM Manu Sporny <msporny@digitalbazaar.com>
wrote:

> Hi all,
>
> As many of you know, the Verifiable Credentials that we use today are
> digitally signed. In order to digitally sign and verify those VCs, there is
> a process called canonicalization[1] that we perform on the data. The
> specification for this algorithm has been available since ~2012[2].
>
> For those of you who are not familiar with the concept, here's an imperfect
> analogy that might help:
>
> Imagine someone gives you a box of marbles in a marble factory and tells you
> to sort them in a line. There are many people doing this in the factory, and
> everyone has to sort the combinations of marbles in the same order.
> Canonicalization is the set of instructions that everyone uses to sort the
> marbles -- green ones first, then red ones, then purple ones (but never put a
> purple one after a red one), and so on.
>
> For the work we do today, Verifiable Credentials are the box and all the
> data in the Verifiable Credential are the marbles... and in order to create
> a digital signature and verify it, we all have to sort those marbles in the
> same way.
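>
> To make the analogy a bit more concrete, here is a toy sketch in Python
> (illustrative only -- this is *not* the actual canonicalization algorithm,
> and it glosses over the hard parts discussed below): if everyone applies the
> same agreed-upon ordering to the same statements before hashing, everyone
> ends up with the same digest.
>
>     import hashlib
>
>     def toy_canonical_digest(quads):
>         # One agreed-upon ordering, then one digest over the result.
>         canonical = "\n".join(sorted(quads))
>         return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
>
>     box_a = ['<urn:s> <urn:p> "green" .', '<urn:s> <urn:p> "red" .']
>     box_b = list(reversed(box_a))  # same marbles, different order
>     assert toy_canonical_digest(box_a) == toy_canonical_digest(box_b)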
>
> Even though it might sound simple, it's a several-hundred-year-old,
> fantastically hard problem -- closely related to the graph isomorphism
> problem and known as graph canonization[3] -- and a generalized solution to
> this problem, with an associated formal mathematical proof that has been
> peer reviewed, didn't exist until recently.
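>
> To see why a simple sort-then-hash is not enough, consider blank nodes. In
> the simplified illustration below, two datasets express exactly the same
> graph and differ only in their arbitrary, local blank node labels, yet the
> toy approach above produces different digests for them; assigning canonical
> labels to those blank nodes is the hard part a real algorithm has to solve.
>
>     import hashlib
>
>     def toy_canonical_digest(quads):
>         canonical = "\n".join(sorted(quads))
>         return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
>
>     # The same graph, written with different blank node labels.
>     dataset_1 = ["_:b0 <urn:knows> _:b1 .", "_:b1 <urn:knows> _:b0 ."]
>     dataset_2 = ["_:x <urn:knows> _:y .", "_:y <urn:knows> _:x ."]
>     assert toy_canonical_digest(dataset_1) != toy_canonical_digest(dataset_2)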
>
> The next set of items for the official W3C standards track has to do with
> canonicalization and digital signatures, and in order to propose a charter,
> we have to demonstrate that 1) there is a concrete proposed solution to the
> canonicalization problem, 2) it has had adequate peer review, and 3) there is
> enough deployment experience to feel confident that the suggested solution is
> workable.
>
> This email addresses #1 above, with a formal proof for the RDF Dataset
> Canonicalization problem. People who have reviewed the paper will follow it
> up shortly with their findings to address #2 above. As for #3, whether folks
> are aware of this or not, this algorithm has been used for 6 years in
> proof-of-concept, pilot, and production deployments (so, we're very much
> behind the curve on getting this standardized).
>
> Please find the formal proof for the URDNA2015 (RDF Dataset Canonicalization)
> algorithm attached. I expect the people who peer reviewed the formal proof to
> chime in over the next couple of weeks.
>
> -- manu
>
> [1]https://en.wikipedia.org/wiki/Canonicalization
> [2]https://github.com/json-ld/rdf-dataset-canonicalization
> [3]https://en.wikipedia.org/wiki/Graph_canonization
>
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
>
>

Received on Saturday, 27 March 2021 17:36:05 UTC