Re: Some (negative) thoughts about did:key and multicodec.

Hi Manu,

Thanks for your reply.

I am not proposing to change anything; I am just trying to draw lessons for the future :)

I just want to comment on the last part of your message.

> 1. Fork a single community working in peace on codec
>   tables into two communities that are effectively doing
>   the same thing, except for the way that they encode
>   bytes.

Don't we do this right now? Transmute does that. Digital Bazaar does that (https://github.com/digitalbazaar/ed25519-verification-key-2018/blob/019e19478d60932f87bf0cb75d6f643873a22882/src/Ed25519VerificationKey2018.js#L255). They don't parse a multicodec header, they don't decode varints, they just read the value of two bytes. Even the did:key specification says more or less "use 6Mk" to denote an Ed25519 public key :) And of course I don't blame them: it is much faster to transform "Multicodec table" entries into an encoding that humans understand and all programming languages can trivially implement than to do varint maths :)
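
To make the comparison concrete, here is a minimal Python sketch of both approaches. It is my own illustration, not code taken from Transmute, Digital Bazaar, or the did:key specification, and the function names (encode_varint, decode_varint, has_ed25519_header) are hypothetical. It assumes the unsigned-varint layout used by multiformats: 7 payload bits per byte, least-significant group first, with the high bit set on every byte except the last.

```python
from typing import Tuple

ED25519_PUB_CODE = 0xED  # "Ed25519 public key" entry in the Multicodec table


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    if value < 0:
        raise ValueError("unsigned varint cannot encode negative numbers")
    out = bytearray()
    while True:
        group = value & 0x7F       # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(group | 0x80)  # more bytes follow: set the high bit
        else:
            out.append(group)         # last byte: high bit stays clear
            return bytes(out)


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Decode a varint from the start of data; return (value, bytes consumed)."""
    value = 0
    shift = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, index + 1
        shift += 7
    raise ValueError("truncated varint")


# The shortcut described above: compare against the two bytes that the
# varint encoding of 0xed happens to produce, without decoding anything.
ED25519_HEADER = b"\xed\x01"


def has_ed25519_header(data: bytes) -> bool:
    """Hard-coded check: do the first two bytes equal 0xed 0x01?"""
    return data[:2] == ED25519_HEADER
```

Since 0xed is larger than 127 it does not fit in 7 bits, so encode_varint(0xED) yields the two bytes 0xed 0x01, which is exactly the constant that the hard-coded check compares against.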

Best,
Nikos

--
Nikos Fotiou - http://pages.cs.aueb.gr/~fotiou
Researcher - Mobile Multimedia Laboratory
Athens University of Economics and Business
https://mm.aueb.gr

> On 11 Mar 2021, at 1:00 AM, Manu Sporny <msporny@digitalbazaar.com> wrote:
> 
> On 3/10/21 4:08 PM, Nikos Fotiou wrote:
>> To begin with, I want to clarify that I do not want to discredit anybody: I
>> am a big fan of the work of both Digital Bazaar and Protocol Labs.
> 
> Hi Nikos, I wouldn't take your comments as negative... quite the contrary,
> it's nice to see someone so passionate about byte encoding. I share your
> passion and had the same sort of gut reaction that you did when I first came
> across Multicodec and it blew up in my face (multiple times).
> 
>> I would like to share with you my frustration about did:key and in 
>> particular its use of MULTICODEC. It all started when I tried to understand 
>> why Ed25519-based DIDs start with z6Mk. z is obvious, it means base58 in 
>> the MULTIBASE world, but 6Mk was still a mystery. The entry for "Ed25519 
>> public key" in the "Multicodec table" 
>> (https://github.com/multiformats/multicodec/blob/master/table.csv) is 
>> "0xed". So how come "0xed" is translated into "6Mk"?
> 
> Yep, your journey sounds very similar to mine so far. :)
> 
>> It turns out that MULTICODEC uses an uncommon way of storing integers 
>> called "varint" (https://github.com/multiformats/unsigned-varint)! Using 
>> this encoding, "0xed" is translated into two bytes. What is worse, this type
>> of encoding is not natively supported by mainstream languages
> 
> Understanding why Multicodec uses varints is interesting... Some of the first
> implementations of IPFS libraries were written in Go (a programming language).
> Go natively supports varints:
> 
> https://golang.org/src/encoding/binary/varint.go
> 
> Note that Google's Protobufs also support varints:
> 
> https://developers.google.com/protocol-buffers/docs/encoding#varints
> 
> IPFS (really, it was Juan Benet, IIRC) just used the first hammer that they
> had access to here.
> 
> So, if you're going to blame someone for the choice -- blame Google (Sanjay)
> first, then Protocol Labs (Juan), /then/ Digital Bazaar (me) -- that's the
> proper inheritance order of blame (shame?). :P
> 
>> and you have either to rely on an external library or start playing with 
>> bits in order to use it! Of course, when it comes to real systems, you 
>> realize that all these are useless and people just use hardcoded values of 
>> the ordinary bytes eventually formed (see for example
> 
> Yeah... and I'm not sure that's a bad thing... generalized algorithm + compact
> representation + active community building codec tables + ability to hard
> code values... sounds like a winner. :)
> 
>> So what would be a better way of doing the same thing? Just use a byte to
>> express "length in bytes" and then use up to 255 bytes to encode whatever
>> you want. This is what CoAP and other binary protocols do. It would take
>> only 3 bytes and fewer than 5 lines of code in any language to encode any 
>> entry currently included in the "Multicodec table".
> 
> Just to be clear, are you suggesting we:
> 
> 1. Fork a single community working in peace on codec
>   tables into two communities that are effectively doing
>   the same thing, except for the way that they encode
>   bytes.
> 
> 2. Expand the number of lines of code necessary to check
>   the multicodec header by 500%.
> 
> 3. Expand the storage requirements by 33%?
> 
> 4. Create new implementations for all the languages
>   that currently support multicodec.
> 
> 5. Create test suites for all the implementations.
> 
> 6. Start the clock over on years of interoperability
>   testing done on multicodec.
> 
> ... I think you get the point, right?
> 
> The choice to use varints wasn't because they were the easiest-to-understand
> encoding format. It was because there was already a large community behind
> varints, they are efficient for small values that could have a long tail,
> there was a community building useful multicodec tables (and multihash,
> multiformats, etc.) that was using them, and there were tons of implementations.
> 
> When you're trying to standardize something, it's hard to look at that and go
> "No thanks, I think I'll start over from scratch." :P
> 
> Does that resonate with you, Nikos?
> 
> -- manu
> 
> PS: I did thoroughly enjoy your rant... because it's how most of us felt when
> varint blew up in our faces for the first time. :)
> 
> -- 
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
> 
> 

Received on Thursday, 11 March 2021 01:44:51 UTC