- From: Brent Shambaugh <brent.shambaugh@gmail.com>
- Date: Wed, 10 Mar 2021 17:14:24 -0600
- To: Manu Sporny <msporny@digitalbazaar.com>
- Cc: Credentials Community Group <public-credentials@w3.org>
- Message-ID: <CACvcBVrkY1QT3RM3SzN5QcGgZHqjGtC7cxN6Si4=PKHY=TKLiA@mail.gmail.com>
echo. I was confused a bit about varints when diving into it. Here are just some of my notes:

http://raptorlicious.blogspot.com/2021/02/docid-cid-and-varint-explorations.html
https://gist.github.com/bshambaugh/3c3e3d2591a5b0f14726ba13df0c384c

I was saved from some confusion when the did:key draft was updated to reflect varints during my struggle (output from js-multicodec):

https://w3c-ccg.github.io/did-method-key/#p-256

But trying to work out varints on paper using the Google dev link blew my mind a bit. You can see some of them in my raptorlicious blog.

https://developers.google.com/protocol-buffers/docs/encoding#varints

I wondered at the time what place varints had, given that they were developed for network reasons. I guess it is just legacy, not reinventing the wheel.

-Brent Shambaugh

GitHub: https://github.com/bshambaugh
Website: http://bshambaugh.org/
LinkedIN: https://www.linkedin.com/in/brent-shambaugh-9b91259
Skype: brent.shambaugh
Twitter: https://twitter.com/Brent_Shambaugh
WebID: http://bshambaugh.org/foaf.rdf#me

On Wed, Mar 10, 2021 at 5:03 PM Manu Sporny <msporny@digitalbazaar.com> wrote:

> On 3/10/21 4:08 PM, Nikos Fotiou wrote:
> > To begin with I want to clarify that I do not want to discredit anybody: I
> > am a big fan of the work of both Digital Bazaar and Protocol Labs.
>
> Hi Nikos, I wouldn't take your comments as negative... quite the contrary,
> it's nice to see someone so passionate about byte encoding. I share your
> passion and had the same sort of gut reaction that you did when I first came
> across Multicodec and it blew up in my face (multiple times).
>
> > I would like to share with you my frustration about did:key and in
> > particular its use of MULTICODEC. All started when I tried to understand
> > why Ed25519-based DIDs start with z6Mk. z is obvious, it means base58 in
> > the MULTIBASE world, but 6Mk was still a mystery.
> > The entry for "Ed25519 public key" in the "Multicodec table"
> > (https://github.com/multiformats/multicodec/blob/master/table.csv) is
> > "0xed". So how come "0xed" is translated into "6Mk"?
>
> Yep, your journey sounds very similar to mine so far. :)
>
> > It turns out that MULTICODEC uses an uncommon way for storing integers
> > called "varint" (https://github.com/multiformats/unsigned-varint)! Using
> > this encoding "0xed" is translated into two bytes. What is worse this type
> > of encoding is not natively supported by mainstream languages
>
> Understanding why Multicodec uses varints is interesting... Some of the first
> implementations of IPFS libraries were written in Go (a programming language).
> Go natively supports varints:
>
> https://golang.org/src/encoding/binary/varint.go
>
> Note that Google's Protobufs also support varints:
>
> https://developers.google.com/protocol-buffers/docs/encoding#varints
>
> IPFS (really, it was Juan Benet, IIRC) just used the first hammer that they
> had access to here.
>
> So, if you're going to blame someone for the choice -- blame Google (Sanjay)
> first, then Protocol Labs (Juan), /then/ Digital Bazaar (me) -- that's the
> proper inheritance order of blame (shame?). :P
>
> > and you have either to rely on an external library or start playing with
> > bits in order to use it! Of course, when it comes to real systems, you
> > realize that all these are useless and people just use hardcoded values of
> > the ordinary bytes eventually formed (see for example
>
> Yeah... and I'm not sure that's a bad thing... generalized algorithm + compact
> representation + active community building codec tables + ability to hard
> code values... sounds like a winner. :)
>
> > So what would be a better way for doing the same thing? Just use a byte to
> > express "length in bytes" and then use up to 255 bytes to encode whatever
> > you want. This is what CoAP and other binary protocols do.
> > It would take only 3 bytes and less than 5 lines of code in any language to
> > encode any entry currently included in the "Multicodec table".
>
> Just to be clear, are you suggesting we:
>
> 1. Fork a single community working in peace on codec
>    tables into two communities that are effectively doing
>    the same thing, except for the way that they encode
>    bytes.
>
> 2. Expand the number of lines of code necessary to check
>    the multicodec header by 500%.
>
> 3. Expand the storage requirements by 33%?
>
> 4. Create new implementations for all the languages
>    that currently support multicodec.
>
> 5. Create test suites for all the implementations.
>
> 6. Start the clock over on years of interoperability
>    testing done on multicodec.
>
> ... I think you get the point, right?
>
> The choice to use varints wasn't because they were the easiest to understand
> encoding format. It was because there was already a large community behind
> varints, they are efficient for small values that could have a long tail,
> there was a community building useful multicodec tables (and multihash,
> multiformats, etc.) that was using them, and there were tons of
> implementations.
>
> When you're trying to standardize something, it's hard to look at that and go
> "No thanks, I think I'll start over from scratch." :P
>
> Does that resonate with you, Nikos?
>
> -- manu
>
> PS: I did thoroughly enjoy your rant... because it's how most of us felt when
> varint blew up in our faces for the first time. :)
>
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
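
For anyone following the thread who wants to see why "0xed" becomes two bytes, here is a minimal sketch of unsigned varint encoding in Python. It is an illustration only, not the js-multicodec or Go implementation; the names `encode_uvarint`, `decode_uvarint`, and `ED25519_PUB` are hypothetical and chosen for this example. Each output byte carries 7 bits of the value, lowest bits first, and the high bit of a byte is set when more bytes follow.

```python
from typing import Tuple


def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    if value < 0:
        raise ValueError("unsigned varints cannot encode negative numbers")
    out = bytearray()
    while True:
        low7 = value & 0x7F          # take the lowest 7 bits
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # more bytes follow: set the continuation bit
        else:
            out.append(low7)         # last byte: continuation bit stays clear
            return bytes(out)


def decode_uvarint(data: bytes) -> Tuple[int, int]:
    """Decode a varint from the start of data; return (value, bytes_consumed)."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:          # continuation bit clear: this was the last byte
            return value, i + 1
    raise ValueError("truncated varint")


# "Ed25519 public key" code from the multicodec table (0xed = 237).
ED25519_PUB = 0xED

prefix = encode_uvarint(ED25519_PUB)
print(prefix.hex())            # ed01
print(decode_uvarint(prefix))  # (237, 2)
```

Because 0xed is larger than 127, it does not fit in 7 bits, so the first byte is 0x6d with the continuation bit set (0xed) and the second byte is 0x01. In did:key these two bytes are prepended to the raw public key before base58 encoding, which is where the characters after the leading "z" come from.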
Received on Wednesday, 10 March 2021 23:14:50 UTC