Re: Question on use of base64 vs base64url in modern specifications

From: Orie Steele <orie@transmute.industries>
Date: Sun, 26 Apr 2020 11:33:38 -0500
Message-ID: <CAN8C-_+UtOseMAU4aMohm4BJfNU7N0=Wqg9pa4JdfV0yZZB9kw@mail.gmail.com>
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: "W3C Credentials CG (Public List)" <public-credentials@w3.org>
I do like multibase... look how clear it is:


pairs well with multicodec:


IMO, if you are going to make a new string encoding standard that has
built-in error correction, is friendly for people typing in by hand, and is
optimized for size... you should plan for multibase / multicodec support.

... but from a design perspective, I question if it's a good idea to try
and do all these things at once... for example, if you take a string which
has error correction on it already, and you convert it to a QR code, now it
has double error correction... error correction and size are at odds with
each other... optimization of strings for typing by hand is also at odds
with size...
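
To put rough numbers on the size trade-off, here is a minimal sketch using only the Python standard library. The 32-byte input and the 4-byte CRC32 are assumptions chosen for illustration (a CRC only detects errors, it does not correct them), and `encoded_lengths` is a hypothetical helper name:

```python
import base64
import zlib


def encoded_lengths(data: bytes) -> dict:
    """Return the length of the padded text encoding for each base."""
    return {
        "base16": len(base64.b16encode(data)),
        "base32": len(base64.b32encode(data)),
        "base64": len(base64.b64encode(data)),
    }


# A 32-byte value, for example a key or a hash (illustrative data only).
key = bytes(range(32))

# The same value with a 4-byte CRC32 appended as a simple integrity check.
key_with_crc = key + zlib.crc32(key).to_bytes(4, "big")

if __name__ == "__main__":
    print(encoded_lengths(key))           # base16: 64, base32: 56, base64: 44
    print(encoded_lengths(key_with_crc))  # base16: 72, base32: 64, base64: 48
```

Every byte of checksum makes the string longer, and the more restricted the alphabet, the longer the string gets.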

I would prefer a layered approach:

layer 0 - optimized for size / compressed binary
layer 1 - optimized for "copy/paste" / URL Safe... but not meant to be
typed in by hand.
layer 2 - optimized for error correction / checksum... assumes that humans
will be typing things / making mistakes or that the data will be
transferred over laser / camera / radio... which also make mistakes... size
doesn't matter.
layer 3 - optimized for character disambiguation... assumes that only
humans will be typing things / making mistakes...
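
The first three layers can be sketched as separate, composable steps. This is only an illustration under stated assumptions: zlib stands in for "compressed binary", unpadded base64url stands in for the copy/paste layer, and a CRC32 suffix stands in for the checksum layer (it detects errors but does not correct them). All function names are hypothetical, and layer 3 is not shown.

```python
import base64
import zlib


def layer0_pack(payload: bytes) -> bytes:
    """Layer 0: compact binary (zlib used as a stand-in)."""
    return zlib.compress(payload)


def layer1_encode(binary: bytes) -> str:
    """Layer 1: URL-safe text for copy/paste (unpadded base64url)."""
    return base64.urlsafe_b64encode(binary).rstrip(b"=").decode("ascii")


def layer1_decode(text: str) -> bytes:
    # Restore the padding that layer1_encode stripped before decoding.
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def layer2_add_checksum(text: str) -> str:
    """Layer 2: append a CRC32 so corruption can be detected."""
    crc = zlib.crc32(text.encode("ascii"))
    # "." is not in the base64url alphabet, so it is a safe separator.
    return f"{text}.{crc:08x}"


def layer2_verify(framed: str) -> str:
    text, _, crc_hex = framed.rpartition(".")
    if zlib.crc32(text.encode("ascii")) != int(crc_hex, 16):
        raise ValueError("checksum mismatch")
    return text


if __name__ == "__main__":
    payload = b'{"type": "VerifiableCredential"}' * 4
    framed = layer2_add_checksum(layer1_encode(layer0_pack(payload)))
    restored = zlib.decompress(layer1_decode(layer2_verify(framed)))
    print(restored == payload)
```

Each layer can be dropped or swapped without touching the others, for example skipping layer 2 when the string goes into a QR code that already carries its own error correction.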

Layer 3 represents why I dislike base58... who cares if "I" and "l" look
alike?

If 99% of the format is processed by machines / not typed in by humans...
Why care about humans and their poor eyesight? base58 is what you get when
you can't decide if you want to be a rogue or a warrior in an rpg... you
end up with a mediocre melee character that can't tank or deal damage... an
encoding that sacrifices size for human readability, in a world where most
of the time a machine processes the data.

layer 0 and layer 1 are very different from layer 2 and layer 3 which are
much more focused on specific scenarios... you want to have the right tool
for the right job, and each tool optimized for doing 1 thing very well, not
5 things ok...and naming is part of being a good tool... when your dad asks
for a screwdriver, you don't hand him one, you ask what kind... when
someone asks for base58 or base64... you should do the same.

IMO, saying it's "multicodec / multibase" is about a billion times better
than saying "it's base64 / base58".
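
As a rough illustration of why the prefix helps, here is a hand-rolled sketch of multibase-style tagging. It is not the API of any multibase library. The prefix characters follow the multibase table ("f" base16, "m" base64, "M" base64pad, "u" base64url, "U" base64urlpad); the function names are hypothetical.

```python
import base64


def _b64(data: bytes, urlsafe: bool, pad: bool) -> str:
    enc = base64.urlsafe_b64encode if urlsafe else base64.b64encode
    text = enc(data).decode("ascii")
    return text if pad else text.rstrip("=")


def _unb64(text: str, urlsafe: bool) -> bytes:
    dec = base64.urlsafe_b64decode if urlsafe else base64.b64decode
    # Re-add any stripped padding; already padded input is left unchanged.
    return dec(text + "=" * (-len(text) % 4))


ENCODERS = {
    "f": lambda d: d.hex(),                 # base16, lowercase
    "m": lambda d: _b64(d, False, False),   # base64, no padding
    "M": lambda d: _b64(d, False, True),    # base64, padded
    "u": lambda d: _b64(d, True, False),    # base64url, no padding
    "U": lambda d: _b64(d, True, True),     # base64url, padded
}

DECODERS = {
    "f": bytes.fromhex,
    "m": lambda t: _unb64(t, False),
    "M": lambda t: _unb64(t, False),
    "u": lambda t: _unb64(t, True),
    "U": lambda t: _unb64(t, True),
}


def multibase_encode(prefix: str, data: bytes) -> str:
    """Encode data and prepend the one-character base prefix."""
    return prefix + ENCODERS[prefix](data)


def multibase_decode(text: str) -> bytes:
    """Read the prefix to pick the decoder; no guessing about the variant."""
    prefix, body = text[0], text[1:]
    if prefix not in DECODERS:
        raise ValueError(f"unsupported multibase prefix: {prefix!r}")
    return DECODERS[prefix](body)


if __name__ == "__main__":
    data = b"\xfb\xff"
    for prefix in "fmMuU":
        print(multibase_encode(prefix, data))
```

For the two bytes `fb ff` this prints `ffbff`, `m+/8`, `M+/8=`, `u-_8` and `U-_8=`. The same bytes produce four different "base64" strings, and the leading character is the only thing that tells a decoder which one it is looking at.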


On Sun, Apr 26, 2020 at 10:23 AM Manu Sporny <msporny@digitalbazaar.com>
wrote:

> On 4/24/20 3:24 PM, Christopher Allen wrote:
> > But the question I have then is, why use the older base64 at all? Why
> > not completely deprecate base64 entirely for brand new standards? Or is
> > it solely that base64URL also "forbids line separators"? Is this the
> > only reason why the older base64 is still used in new standards? Or am I
> > missing something?
> It's worse than that... there are 11 variants for base64 (and some even
> crazier variants that Digital Bazaar has seen in the wild):
> https://en.wikipedia.org/wiki/Base64#Variants_summary_table
> Base64 is a mess specifically because of all of the optionality... we
> shouldn't pull it forward into new standards unless only *one* encoding
> of base64 is settled upon for very specific use cases.
> Even base64url has two variants (a padded version and an unpadded
> version). The Multibase spec recognizes the 4 most widely used base64
> formats:
> https://tools.ietf.org/id/draft-multiformats-multibase-00.html#rfc.appendix.D.1
> In any case, boo base64 (for 99% of use cases where you're encoding
> something less than a few hundred bytes in size). For stuff that's
> larger, like PNG images... base64pad all the way.
> *grabs popcorn, awaits encoding-format related Chair throwing*
> -- manu
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches

Chief Technical Officer

Received on Sunday, 26 April 2020 16:34:03 UTC