Re: Question on use of base64 vs base64url in modern specifications from Orie Steele on 2020-04-26 (public-credentials@w3.org from April 2020)

From: Orie Steele <orie@transmute.industries>
Date: Sun, 26 Apr 2020 11:33:38 -0500
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: "W3C Credentials CG (Public List)" <public-credentials@w3.org>
Message-ID: <CAN8C-_+UtOseMAU4aMohm4BJfNU7N0=Wqg9pa4JdfV0yZZB9kw@mail.gmail.com>

I do like multibase... look how clear it is:

https://github.com/multiformats/multibase/blob/master/multibase.csv

pairs well with multicodec:

https://github.com/multiformats/multicodec/blob/master/table.csv

IMO, if you are going to make a new string encoding standard that has built
in error correction, is friendly for people typing in by hand, and is
optimized for size... you should plan for multibase / multicodec support.

... but from a design perspective, I question if it's a good idea to try
and do all these things at once... for example, if you take a string which
has error correction on it already, and you convert it to a QR code, now it
has double error correction... error correction and size are at odds with
each other... optimization of strings for typing by hand is also at odds
with size...

I would prefer a layered approach:

layer 0 - optimized for size / compressed binary
layer 1 - optimized for "copy/paste" / URL Safe... but not meant to be
typed in by hand.
layer 2 - optimized for error correction / checksum... assumes that humans
will be typing things / making mistakes or that the data will be
transferred over laser / camera / radio... which also make mistakes... size
doesn't matter.
layer 3 - optimized for character disambiguation... assumes that only
humans will be typing things / making mistakes...
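
To make the layering concrete, here is a minimal Python sketch of layers 0 through 2 stacked on top of each other. Everything in it is an assumption chosen for illustration: zlib stands in for "compressed binary", unpadded base64url stands in for the copy/paste layer, and a CRC32 suffix after a "." stands in for the checksum layer (it only detects errors, it does not correct them). The function names are hypothetical, and layer 3 (character disambiguation) is left out.

```python
import base64
import zlib


def layer0_compress(data: bytes) -> bytes:
    """Layer 0: compact binary. zlib is an arbitrary stand-in here."""
    return zlib.compress(data, 9)


def layer0_decompress(blob: bytes) -> bytes:
    return zlib.decompress(blob)


def layer1_url_safe(blob: bytes) -> str:
    """Layer 1: copy/paste and URL safe text (unpadded base64url)."""
    return base64.urlsafe_b64encode(blob).rstrip(b"=").decode("ascii")


def layer1_decode(text: str) -> bytes:
    # Restore the padding that layer1_url_safe stripped.
    padding = "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(text + padding)


def layer2_add_checksum(text: str) -> str:
    """Layer 2: append a CRC32 so mistakes are detected (not corrected)."""
    crc = zlib.crc32(text.encode("ascii"))
    # "." is not in the base64url alphabet, so it is a safe separator.
    return f"{text}.{crc:08x}"


def layer2_verify(text_with_crc: str) -> str:
    text, _, crc_hex = text_with_crc.rpartition(".")
    if f"{zlib.crc32(text.encode('ascii')):08x}" != crc_hex:
        raise ValueError("checksum mismatch")
    return text


if __name__ == "__main__":
    payload = b"example payload " * 8
    wire = layer2_add_checksum(layer1_url_safe(layer0_compress(payload)))
    print(len(payload), len(wire))
    assert layer0_decompress(layer1_decode(layer2_verify(wire))) == payload
```

Each layer can be swapped or dropped without touching the others, e.g. skip layer 2 when the string is going into a QR code that already has its own error correction.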

Layer 3 represents why I dislike base58... who cares if "I" and "l" look
similar...

If 99% of the format is processed by machines / not typed in by humans...
Why care about humans and their poor eyesight? base58 is what you get when
you can't decide if you want to be a rogue or a warrior in an RPG... you
end up with a mediocre melee character that can't tank or deal damage... an
encoding that sacrifices size for human readability, in a world where most
of the time a machine processes the data.

layer 0 and layer 1 are very different from layer 2 and layer 3 which are
much more focused on specific scenarios... you want to have the right tool
for the right job, and each tool optimized for doing 1 thing very well, not
5 things ok...and naming is part of being a good tool... when your dad asks
for a screwdriver, you don't hand him one, you ask what kind... when
someone asks for base58 or base64... you should do the same.

IMO, saying it's "multicodec / multibase" is about a billion times better
than saying "it's base64 / base58".
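
A rough sketch of why the multibase prefix helps, in Python. The one-character prefixes ("f", "m", "M", "u", "U") are the ones listed for these encodings in multibase.csv; the helper names and the ENCODERS table are hypothetical, and this only covers encodings the standard library handles directly (so no base58btc, which would need a third-party package).

```python
import base64

# name -> (multibase prefix, encoder). Only a handful of entries from
# multibase.csv are covered; this is not a full implementation.
ENCODERS = {
    "base16": ("f", lambda b: b.hex()),
    "base64": ("m", lambda b: base64.b64encode(b).decode("ascii").rstrip("=")),
    "base64pad": ("M", lambda b: base64.b64encode(b).decode("ascii")),
    "base64url": ("u", lambda b: base64.urlsafe_b64encode(b).decode("ascii").rstrip("=")),
    "base64urlpad": ("U", lambda b: base64.urlsafe_b64encode(b).decode("ascii")),
}


def multibase_encode(name: str, data: bytes) -> str:
    """Encode data and prepend the prefix that names the encoding."""
    prefix, encode = ENCODERS[name]
    return prefix + encode(data)


def multibase_name(text: str) -> str:
    """Recover which encoding was used from the first character alone."""
    for name, (prefix, _) in ENCODERS.items():
        if text[:1] == prefix:
            return name
    raise ValueError(f"unknown multibase prefix: {text[:1]!r}")


if __name__ == "__main__":
    data = b"\xfb\xff"
    for name in ENCODERS:
        print(name, multibase_encode(name, data))
```

For the two bytes 0xfb 0xff this prints "ffbff", "m+/8", "M+/8=", "u-_8" and "U-_8=": four different strings that would all just be called "base64" without the prefix.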

:)

On Sun, Apr 26, 2020 at 10:23 AM Manu Sporny <msporny@digitalbazaar.com>
wrote:

> On 4/24/20 3:24 PM, Christopher Allen wrote:
> > But the question I have then is, why use the older base64 at all? Why
> > not completely deprecate base64 entirely for brand new standards? Or is
> > it solely that base64URL also "forbids line separators"? Is this the
> > only reason why the older base64 is still used in new standards? Or am I
> > missing something?
>
> It's worse than that... there are 11 variants for base64 (and some even
> crazier variants that Digital Bazaar has seen in the wild):
>
> https://en.wikipedia.org/wiki/Base64#Variants_summary_table
>
> Base64 is a mess specifically because of all of the optionality... we
> shouldn't pull it forward into new standards unless only *one* encoding
> of base64 is settled upon for very specific use cases.
>
> Even base64url has two variants (a padded version and an unpadded
> version). The Multibase spec recognizes the 4 most widely used base64
> formats:
>
>
> https://tools.ietf.org/id/draft-multiformats-multibase-00.html#rfc.appendix.D.1
>
> In any case, boo base64 (for 99% of use cases where you're encoding
> something less than a few hundred bytes in size). For stuff that's
> larger, like PNG images... base64pad all the way.
>
> *grabs popcorn, awaits encoding-format related Chair throwing*
>
> -- manu
>
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
>
>

-- 
*ORIE STEELE*
Chief Technical Officer
www.transmute.industries


Received on Sunday, 26 April 2020 16:34:03 UTC