- From: Nikos Fotiou <fotiou@aueb.gr>
- Date: Thu, 11 Mar 2021 03:44:31 +0200
- To: Manu Sporny <msporny@digitalbazaar.com>
- Cc: public-credentials@w3.org
- Message-Id: <091EC035-821C-408A-8A41-8871DF3AE092@aueb.gr>
Hi Manu,

Thanks for your reply. I am not proposing to change anything; I am just trying to draw lessons for the future :) I just want to comment on the last part of your message.

> 1. Fork a single community working in peace on codec
> tables into two communities that are effectively doing
> the same thing, except for the way that they encode
> bytes.

Don't we do this right now? Transmute does that. Digital Bazaar does that (https://github.com/digitalbazaar/ed25519-verification-key-2018/blob/019e19478d60932f87bf0cb75d6f643873a22882/src/Ed25519VerificationKey2018.js#L255). They don't parse a multicodec header, they don't decode varints, they just read the value of two bytes. Even the did:key specification says more or less "use 6Mk" to denote an Ed25519 public key :)

And of course I don't blame them: it is much faster to transform "Multicodec table" entries into an encoding that humans understand and all programming languages can trivially implement than to do varint maths :)

Best,
Nikos

--
Nikos Fotiou - http://pages.cs.aueb.gr/~fotiou
Researcher - Mobile Multimedia Laboratory
Athens University of Economics and Business
https://mm.aueb.gr

> On 11 Mar 2021, at 1:00 AM, Manu Sporny <msporny@digitalbazaar.com> wrote:
>
> On 3/10/21 4:08 PM, Nikos Fotiou wrote:
>> To begin with I want to clarify that I do not want to discredit anybody: I
>> am a big fan of the work of both Digital Bazaar and Protocol Labs.
>
> Hi Nikos, I wouldn't take your comments as negative... quite the contrary,
> it's nice to see someone so passionate about byte encoding. I share your
> passion and had the same sort of gut reaction that you did when I first came
> across Multicodec and it blew up in my face (multiple times).
>
>> I would like to share with you my frustration about did:key and in
>> particular its use of MULTICODEC. It all started when I tried to understand
>> why Ed25519-based DIDs start with z6Mk. z is obvious, it means base58 in
>> the MULTIBASE world, but 6Mk was still a mystery.
>> The entry for "Ed25519 public key" in the "Multicodec table"
>> (https://github.com/multiformats/multicodec/blob/master/table.csv) is
>> "0xed". So how come "0xed" is translated into "6Mk"?
>
> Yep, your journey sounds very similar to mine so far. :)
>
>> It turns out that MULTICODEC uses an uncommon way of storing integers
>> called "varint" (https://github.com/multiformats/unsigned-varint)! Using
>> this encoding, "0xed" is translated into two bytes. What is worse, this type
>> of encoding is not natively supported by mainstream languages
>
> Understanding why Multicodec uses varints is interesting... Some of the first
> implementations of IPFS libraries were written in Go (a programming language).
> Go natively supports varints:
>
> https://golang.org/src/encoding/binary/varint.go
>
> Note that Google's Protobufs also support varints:
>
> https://developers.google.com/protocol-buffers/docs/encoding#varints
>
> IPFS (really, it was Juan Benet, IIRC) just used the first hammer that they
> had access to here.
>
> So, if you're going to blame someone for the choice -- blame Google (Sanjay)
> first, then Protocol Labs (Juan), /then/ Digital Bazaar (me) -- that's the
> proper inheritance order of blame (shame?). :P
>
>> and you have either to rely on an external library or start playing with
>> bits in order to use it! Of course, when it comes to real systems, you
>> realize that all these are useless and people just use hardcoded values of
>> the ordinary bytes eventually formed (see for example
>
> Yeah... and I'm not sure that's a bad thing... generalized algorithm + compact
> representation + active community building codec tables + ability to hard
> code values... sounds like a winner. :)
>
>> So what would be a better way of doing the same thing? Just use a byte to
>> express "length in bytes" and then use up to 255 bytes to encode whatever
>> you want. This is what CoAP and other binary protocols do.
>> It would take only 3 bytes and fewer than 5 lines of code in any language
>> to encode any entry currently included in the "Multicodec table".
>
> Just to be clear, are you suggesting we:
>
> 1. Fork a single community working in peace on codec
> tables into two communities that are effectively doing
> the same thing, except for the way that they encode
> bytes.
>
> 2. Expand the number of lines of code necessary to check
> the multicodec header by 500%.
>
> 3. Expand the storage requirements by 33%?
>
> 4. Create new implementations for all the languages
> that currently support multicodec.
>
> 5. Create test suites for all the implementations.
>
> 6. Start the clock over on years of interoperability
> testing done on multicodec.
>
> ... I think you get the point, right?
>
> The choice to use varints wasn't because they were the easiest to understand
> encoding format. It was because there was already a large community behind
> varints, they are efficient for small values that could have a long tail,
> there was a community building useful multicodec tables (and multihash,
> multiformats, etc.) that was using them, and there were tons of implementations.
>
> When you're trying to standardize something, it's hard to look at that and go
> "No thanks, I think I'll start over from scratch." :P
>
> Does that resonate with you, Nikos?
>
> -- manu
>
> PS: I did thoroughly enjoy your rant... because it's how most of us felt when
> varint blew up in our faces for the first time. :)
>
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
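
For readers following the byte-level argument in this thread, here is a minimal Python sketch of how "0xed" turns into two bytes and then into the "6Mk" prefix. It is an illustration only, not code from either correspondent. The function names (`encode_uvarint`, `decode_uvarint`, `base58btc_encode`, `encode_length_prefixed`) are hypothetical, the all-zero 32-byte key is a placeholder rather than a real key, and `encode_length_prefixed` is only one possible reading of the "one length byte, then up to 255 bytes" idea.

```python
# Base58btc alphabet (the Bitcoin alphabet, which multibase labels with "z").
B58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def encode_uvarint(n: int) -> bytes:
    """Unsigned varint: 7 payload bits per byte, low bits first,
    high bit set on every byte except the last."""
    if n < 0:
        raise ValueError("unsigned varint cannot encode negative numbers")
    out = bytearray()
    while True:
        low7 = n & 0x7F
        n >>= 7
        if n:
            out.append(low7 | 0x80)  # more bytes follow
        else:
            out.append(low7)         # final byte
            return bytes(out)


def decode_uvarint(data: bytes) -> tuple[int, int]:
    """Return (value, number of bytes consumed)."""
    value = 0
    for i, b in enumerate(data):
        value |= (b & 0x7F) << (7 * i)
        if not b & 0x80:
            return value, i + 1
    raise ValueError("truncated varint")


def base58btc_encode(data: bytes) -> str:
    """Plain base58btc encoding, without the multibase "z" prefix."""
    n = int.from_bytes(data, "big")
    digits = ""
    while n:
        n, r = divmod(n, 58)
        digits = B58_ALPHABET[r] + digits
    leading_zeros = len(data) - len(data.lstrip(b"\x00"))
    return "1" * leading_zeros + digits


def encode_length_prefixed(code: int) -> bytes:
    """Assumed reading of the alternative in the thread:
    one length byte followed by the big-endian value bytes."""
    body = code.to_bytes(max(1, (code.bit_length() + 7) // 8), "big")
    if len(body) > 255:
        raise ValueError("value does not fit in 255 bytes")
    return bytes([len(body)]) + body


if __name__ == "__main__":
    header = encode_uvarint(0xED)
    print(header.hex())                      # ed01
    print(decode_uvarint(header))            # (237, 2)

    placeholder_key = bytes(32)              # not a real Ed25519 key
    encoded = "z" + base58btc_encode(header + placeholder_key)
    print(encoded[:4])                       # expected to be the z6Mk prefix

    print(encode_length_prefixed(0xED).hex())  # 01ed
```

Because the two header bytes are always the same for a given codec, hard-coding them (as the implementations mentioned in the thread do) gives the same result as running the varint encoder.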
Attachments
- application/pkcs7-signature attachment: smime.p7s
Received on Thursday, 11 March 2021 01:44:51 UTC