- From: Alan Karp <alanhkarp@gmail.com>
- Date: Sat, 27 Mar 2021 12:01:39 -0700
- To: Manu Sporny <msporny@digitalbazaar.com>
- Cc: W3C Credentials CG <public-credentials@w3.org>
- Message-ID: <CANpA1Z3qHoH_sPodk5n95c-9Hs6x-veC7xN1sJY7KxtpRAnu=w@mail.gmail.com>
Manu Sporny <msporny@digitalbazaar.com> wrote:

> Hey Alan, I didn't know that you had worked in the graph digest space
> before... neat! :)

Yeah. I'm still trying to figure out what I'm going to be when I grow up.

At this point, you understand my paper better than I do :) I expected you
had a good reason for canonicalization, but I just wanted to make sure you
were aware of the possibility of doing without it.

One issue you didn't mention about our paper is that a set hash is weaker
against collision attacks. I thought that might be the reason you couldn't
use that approach.

In case you're interested, we wrote a follow-up,
https://www.hpl.hp.com/techreports/2004/HPL-2004-95.pdf.

--------------
Alan Karp

On Sat, Mar 27, 2021 at 11:51 AM Manu Sporny <msporny@digitalbazaar.com> wrote:

> On 3/27/21 1:35 PM, Alan Karp wrote:
> > It is possible to produce a digest without canonicalization by using a
> > set hash, which I see you mention in your paper. I haven't read the
> > whole thing yet, so I don't know why you need canonicalization. See
> > https://www.hpl.hp.com/techreports/2003/HPL-2003-235R1.pdf
>
> Hey Alan, I didn't know that you had worked in the graph digest space
> before... neat! :)
>
> I just read your paper front to back several times. I'm fairly certain I
> understand it:
>
> Fundamentally, the paper uses a cryptographic combination function over
> every statement in a graph that is both associative and commutative to
> generate a final hash value in O(n) (with some additional protections
> against adversarial brute forcing).
>
> There are two things the paper also depends on that make the solution
> non-generalized: 1) you presume you know how the sender is going to label
> the blank nodes, or 2) you annotate the graph with extra information
> about the blank nodes. The first is an assumption you cannot depend on
> in a global system; the second modifies the graph in ways that require
> an annotation language and a way of encapsulating the annotations in the
> message to the receiver (unacceptable complexity for most of the use
> cases we're dealing with).
>
> In addition, the solution only contemplates RDF Graphs and not RDF
> Datasets (which didn't exist at the time the paper was written).
>
> So, while it is true that you can produce *a* digest w/o doing
> canonicalization using set hashes, the solution presumes things that we
> can't presume in real world scenarios. For example, a non-trivial subset
> of developers are not going to understand why they have to "always
> generate a unique cryptographic ID for every one of their JSON objects".
> Or why the data coming into their system has a whole bunch of annotations
> in their JSON objects that look like garbage to them, that they then
> have to turn around and store in their databases.
>
> We had considered all of these things before writing the RDF Dataset
> Canonicalization paper.
>
> -- manu
>
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
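[Editor's note: the set-hash idea discussed above can be sketched as follows. This is an illustrative toy in Python, not the exact construction from the HPL papers: each statement is hashed independently, and the per-statement digests are combined with an operation that is both associative and commutative (here, addition modulo 2**256), so the final digest is independent of statement order and no canonical ordering of the graph is needed.]

```python
import hashlib

# Combine per-statement digests with addition mod 2**256 -- an
# associative, commutative operation, so statement order is irrelevant.
# (Illustrative choice; the papers add protections against brute forcing.)
MODULUS = 2**256


def statement_hash(statement: str) -> int:
    """Hash one statement to a 256-bit integer via SHA-256."""
    return int.from_bytes(
        hashlib.sha256(statement.encode("utf-8")).digest(), "big"
    )


def set_hash(statements) -> int:
    """Digest a set of statements in O(n), with no canonicalization."""
    digest = 0
    for s in statements:
        digest = (digest + statement_hash(s)) % MODULUS
    return digest


# Same graph serialized in two different statement orders:
g1 = ["<a> <knows> <b> .", "<b> <knows> <c> ."]
g2 = ["<b> <knows> <c> .", "<a> <knows> <b> ."]

assert set_hash(g1) == set_hash(g2)  # order does not affect the digest
```

Note that this sketch sidesteps the blank-node problem raised in the thread: it only works if both parties serialize each statement identically, which is exactly the assumption Manu points out cannot be relied upon in a global system.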
Received on Saturday, 27 March 2021 19:03:04 UTC