Re: Some (negative) thoughts about did:key and multicodec. from Nikos Fotiou on 2021-03-11 (public-credentials@w3.org from March 2021)

From: Nikos Fotiou <fotiou@aueb.gr>
Date: Fri, 12 Mar 2021 00:09:24 +0200
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: Credentials Community Group <public-credentials@w3.org>
Message-Id: <8F6948A8-B402-4469-95C4-9480596FD802@aueb.gr>
Hi Manu, all,

> So, all that to say, I'm not sure we'd end up in a different place if we went
> the CoAP/CBOR unsigned integer encoding route. It's just a different way of doing
> byte encoding that ends up not really resulting in anything different.

That's an interesting thought exercise. What you really want here is just to represent the entry of a "codec" in a "codec table" that is used "only for common things". IMHO if you plan to have such a codec table that includes more than 2^16 entries (2 bytes) then your real problem is the table itself :) The use of varint (or of any other encoding) for this case, at least to me, is the perfect example of "if all you have is a hammer, everything looks like a nail".
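
For concreteness, the three encodings under discussion can be sketched in Python as below. This is only an illustrative sketch: the function names are hypothetical and not taken from any multicodec, CBOR, or did:key library, and the big-endian byte order for the fixed two-byte form is an assumption, since the thread does not specify one. It reproduces the 0xed example from the quoted message.

```python
def encode_varint(n: int) -> bytes:
    """Unsigned varint: 7 value bits per byte, least-significant group
    first, high bit set on every byte except the last."""
    if n < 0:
        raise ValueError("unsigned values only")
    out = bytearray()
    while True:
        group = n & 0x7F
        n >>= 7
        if n:
            out.append(group | 0x80)  # more bytes follow
        else:
            out.append(group)         # final byte
            return bytes(out)


def encode_cbor_uint(n: int) -> bytes:
    """CBOR unsigned integer (major type 0) as described in RFC 8949:
    values below 24 fit in the initial byte, larger values use an
    initial byte of 24/25/26/27 followed by 1/2/4/8 big-endian bytes."""
    if n < 0:
        raise ValueError("unsigned values only")
    if n < 24:
        return bytes([n])
    for additional_info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if n < 1 << (8 * size):
            return bytes([additional_info]) + n.to_bytes(size, "big")
    raise ValueError("value does not fit in 64 bits")


def encode_fixed16(n: int) -> bytes:
    """Fixed two-byte table index (big-endian order is an assumption)."""
    return n.to_bytes(2, "big")


if __name__ == "__main__":
    ed25519_pub = 0xED  # multicodec table entry discussed in the thread
    print(encode_varint(ed25519_pub).hex())     # ed01
    print(encode_cbor_uint(ed25519_pub).hex())  # 18ed
    print(encode_fixed16(ed25519_pub).hex())    # 00ed
```

All three forms take two bytes for this entry; the difference is only in how a reader has to interpret them.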

Best,
Nikos

--
Nikos Fotiou - http://pages.cs.aueb.gr/~fotiou
Researcher - Mobile Multimedia Laboratory
Athens University of Economics and Business
https://mm.aueb.gr

> On 11 Mar 2021, at 5:40 PM, Manu Sporny <msporny@digitalbazaar.com> wrote:
> 
> On 3/10/21 8:44 PM, Nikos Fotiou wrote:
>>> 1. Fork a single community working in peace on codec tables into two 
>>> communities that are effectively doing the same thing, except for the
>>> way that they encode bytes.
>> 
>> Don't we do this right now?
> 
> No, I don't think we've split the two communities. We are re-using the
> multicodec table for did:key. The thing that did:key does is choose to only
> use base58btc and only a handful of the multicodec entries... which allows us
> to hardcode values.
> 
> In the future, let's imagine that there is a post-quantum key format that
> did:key wants to support. We put that public key format into the multicodec
> table and then use the hard coded value in the code.
> 
> Think of it as doing things at two layers. Specifications should generalize,
> so that they future-proof themselves and ideally, build on top of other
> specifications that allow people to reason about them.
> 
> The generalization that did:key builds on top of is multibase and multicodec
> to identify the public key type.
> 
> However, we subset both of those specifications... and only use base58 and
> only a handful of public key formats encoded in multicodec to make
> implementations easier.
> 
>> Transmute does that. Digital Bazaar does that. They don't parse a multicodec
>> header, they don't decode varints, they just read the value of two bytes.
>> Even the did:key specification says more or less "use 6Mk" to denote an Ed25519
>> public key :)
> 
> Yes, correct... and I'd argue that this is the best of both worlds.
> 
> You had the curiosity to go in and figure out how all of this works, which is
> great. A non-trivial number of developers don't have the time or energy to do
> that, and instead take short cuts and cargo-cult their way through
> implementations. While that's less ideal, when people do that, you still want
> the system to work... and in this case it does. People don't have to
> understand varints to create a working implementation of did:key.
> 
> The ones that try to understand it at depth, like you and me, dip into
> insanity for a brief spell before coming out of it and going "meh, I guess the
> trade-off is acceptable".
> 
> So that's why I think we've made the right call here... we're trying to solve
> a multivariate equation that results in a proper implementation (and achieves
> the other goals listed in the previous email).
> 
>> And of course I don't blame them, it is much faster to transform
>> "Multicodec table" entries into an encoding that humans understand and all
>> programming languages can trivially implement, than doing varint maths :)
> 
> I skipped going into this detail before, as I didn't think it was worth
> highlighting, but it might be worth highlighting now. You had mentioned
> something to the effect of "this is what CoAP does" in a previous email.
> 
> I wanted to point out that CBOR (and CoAP) do their own variation of varints.
> A single byte can hold the major type bits followed by the value itself:
> 
> https://www.rfc-editor.org/rfc/rfc8949.html#section-4.2.1
> 
> In a perfect world, we would've just used this (and not varints)... but that's
> not the path that IPFS took (for the reasons previously mentioned).
> 
> All that to say, if you rewind time and go with the way CBOR encodes variable
> length integers... you will still confuse people... because the second your
> integer value gets to be larger than 23, you need two bytes. So, taking the
> exact example that tripped you up -- 0xed becomes 0x18ed instead of 0xed01
> (which is arguably better/worse).
> 
> So, all that to say, I'm not sure we'd end up in a different place if we went
> the CoAP/CBOR unsigned integer encoding route. It's just a different way of doing
> byte encoding that ends up not really resulting in anything different. Am I
> missing something?
> 
> -- manu
> 
> -- 
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
> 
>
Received on Thursday, 11 March 2021 22:10:18 UTC