W3C home > Mailing lists > Public > public-credentials@w3.org > March 2021

Re: RDF Dataset Canonicalization - Formal Proof

From: Drummond Reed <drummond.reed@evernym.com>
Date: Sat, 27 Mar 2021 17:13:53 -0700
Message-ID: <CAAjunnav8OzMB9eviJaX76apHLYSEFH2s5bpVt=MEtH=MRyCWA@mail.gmail.com>
To: Alan Karp <alanhkarp@gmail.com>
Cc: Adrian Gropper <agropper@healthurl.com>, Manu Sporny <msporny@digitalbazaar.com>, "W3C Credentials CG (Public List)" <public-credentials@w3.org>
Alan, I just want to commend you for an exceptionally good plain English
explanation of the set hash approach. I too was not familiar with that.

=Drummond

On Sat, Mar 27, 2021 at 4:57 PM Alan Karp <alanhkarp@gmail.com> wrote:

> Adrian Gropper <agropper@healthurl.com> wrote:
>
>> I think I understand canonicalization but I would appreciate a plain
>> language
>> <https://www.explainxkcd.com/wiki/index.php?title=1133:_Up_Goer_Five>
>> explanation of what Manu and Alan are talking about. Ideally, there would
>> be a use-case to illustrate the utility.
>>
>
> Let's say you have the following two sentences.  "Alice and Bob went to
> the store." and "Bob and Alice went to the store."  The hashes of those two
> sentences are different even though they mean the same thing.
> Canonicalization might say that two names separated by an "and" must be
> reordered so they are alphabetical.  In that case, you change the second
> sentence to match the first before computing the hash.  That works, but as
> Manu pointed out, getting the canonicalization rules right is hard.
>
> What we showed in our paper was a different approach.  You can combine the
> hashes of the individual words of the original sentences in such a way that
> the hashes are the same.  It's called a "set hash" because the result of
> hashing a set doesn't depend on the order in which you pick the items.  I
> first learned of the concept from the Zobrist hash, which is used in
> computer chess to detect if you've seen a particular position before during
> your search.
>
> --------------
> Alan Karp
>
>
> On Sat, Mar 27, 2021 at 4:28 PM Adrian Gropper <agropper@healthurl.com>
> wrote:
>
>> I think I understand canonicalization but I would appreciate a plain
>> language
>> <https://www.explainxkcd.com/wiki/index.php?title=1133:_Up_Goer_Five>
>> explanation of what Manu and Alan are talking about. Ideally, there would
>> be a use-case to illustrate the utility.
>>
>> Adrian
>>
>> On Sat, Mar 27, 2021 at 6:01 PM Alan Karp <alanhkarp@gmail.com> wrote:
>>
>>> Feel free to lift any sections you like.  As far as digital signatures
>>> goes, I don't recall.  It might simply be that we assumed people knew you
>>> could sign once you had the digest.
>>>
>>> --------------
>>> Alan Karp
>>>
>>>
>>> On Sat, Mar 27, 2021 at 12:43 PM Manu Sporny <msporny@digitalbazaar.com>
>>> wrote:
>>>
>>>> On 3/27/21 3:01 PM, Alan Karp wrote:
>>>> > Yeah.  I'm still trying to figure out what I'm going to be when I
>>>> grow up.
>>>>
>>>> *lol*, aren't we all! :P
>>>>
>>>> > One issue you didn't mention about our paper is that a set hash is
>>>> weaker
>>>> > against collision attacks. I thought that might be the reason you
>>>> couldn't
>>>> >  use that approach.
>>>>
>>>> Well, yes... I wanted to say something about that, but could also see
>>>> how you
>>>> could *maybe* mitigate that using large enough hashes and/or Section
>>>> 6.2.2 --
>>>> making the combining function be multiplication mod some suitably-large
>>>> prime
>>>> number. This was the part of the paper that interested me the most,
>>>> Alan... I
>>>> could see how that would work IF we didn't have to depend on a
>>>> pre-determined
>>>> set of node labels.
>>>>
>>>> There are performance improvements that we know are probably still
>>>> locked up
>>>> in the algorithm, but we needed to ship something (nine years ago) and
>>>> we
>>>> really haven't seen a case where performance was an issue.
>>>>
>>>> > In case you're interested, we wrote a follow-up,
>>>> > https://www.hpl.hp.com/techreports/2004/HPL-2004-95.pdf
>>>>
>>>> Would you mind it if we unceremoniously lift applicable parts of
>>>> "Section 5:
>>>> Application for Graph Digests" from that document for the use cases
>>>> document?
>>>> Any reason you didn't include digital signatures in the applications
>>>> section?
>>>>
>>>> -- manu
>>>>
>>>> --
>>>> Manu Sporny - https://www.linkedin.com/in/manusporny/
>>>> Founder/CEO - Digital Bazaar, Inc.
>>>> blog: Veres One Decentralized Identifier Blockchain Launches
>>>> https://tinyurl.com/veres-one-launches
>>>>
>>>>
Received on Sunday, 28 March 2021 00:14:18 UTC

This archive was generated by hypermail 2.4.0 : Sunday, 28 March 2021 00:14:20 UTC