- From: Daniel Hardman <daniel.hardman@evernym.com>
- Date: Fri, 24 Apr 2020 13:55:36 -0600
- To: Christopher Allen <ChristopherA@lifewithalacrity.com>
- Cc: Credentials Community Group <public-credentials@w3.org>
- Message-ID: <CAFBYrUrQNDzTymRfGZSdsQsP=rYYuXgUaAiXTZ=Nm04DPQVOww@mail.gmail.com>
One of the problems with base64url is that there are variants, so saying "base64url" doesn't answer all questions. RFC 4648 is slightly unclear about padding. Section 5 <https://tools.ietf.org/html/rfc4648#section-5> says padding may be omitted "if the data length is known implicitly," but then links to section 3.2, which says:

"In some circumstances, the use of padding ("=") in base-encoded data is not required or used. In the general case, when assumptions about the size of transported data cannot be made, padding is required to yield correct decoded data. Implementations MUST include appropriate pad characters at the end of encoded data unless the specification referring to this document explicitly states otherwise. The base64 and base32 alphabets use padding."

Based on this ambivalence, libraries in various programming languages have divergent behaviors with respect to base64url padding. See this discussion on the Python dev lists <https://bugs.python.org/issue29427>, which in turn references one on Ruby. It's not hard to write a decoder that accepts all base64url variants, but not all libraries do. And some emit padded output by preference; others emit unpadded. JWS requires unpadded <https://tools.ietf.org/html/rfc7515#appendix-C>, I believe.

I suspect that this muddiness is one reason why the URL-safe variants haven't been adopted more crisply. But I do think we'd be better off using base64url wherever we can.

On Fri, Apr 24, 2020 at 1:27 PM Christopher Allen <ChristopherA@lifewithalacrity.com> wrote:

> I'm sure that the standards-based data encoding formats geeks among us
> already knew this, but here is a TILT (Thing I Learned Today) that somehow
> I never quite internalized, and it raises a question.
>
> The character set used in the base64 specification [RFC4648] collides with
> the URI reserved characters [RFC3986], thus there is a variant called
> Base64URL, also defined in [RFC4648], that doesn't collide with URI reserved
> characters.
>
> Replaces “+” by “-” (minus)
> Replaces “/” by “_” (underline)
> Does not require a padding character
>
> But the question I have then is, why use the older base64 at all? Why not
> deprecate base64 entirely for brand new standards? Or is it
> solely that base64URL also "forbids line separators"? Is this the only
> reason why the older base64 is still used in new standards? Or am I missing
> something?
>
> -- Christopher Allen
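
To illustrate the point in the reply that a decoder accepting every base64url variant is not hard to write, here is a minimal sketch in Python. The function names `b64url_decode` and `b64url_encode` are hypothetical, chosen only for this example; the sketch assumes the standard-library `base64` module and is deliberately lenient about how many "=" characters trail the input. It is one possible approach, not what any particular library does.

```python
import base64
import re

# Characters permitted by the URL-safe alphabet in RFC 4648 section 5.
_B64URL_BODY = re.compile(r"[A-Za-z0-9_-]*")


def b64url_decode(text: str) -> bytes:
    """Decode base64url text whether or not it carries '=' padding."""
    # Drop any trailing padding so padded and unpadded inputs look the same.
    body = text.rstrip("=")
    if not _B64URL_BODY.fullmatch(body):
        raise ValueError("input contains characters outside the base64url alphabet")
    # An unpadded base64 string can never have a length of 1 modulo 4.
    if len(body) % 4 == 1:
        raise ValueError("invalid base64url length")
    # Restore the padding that the standard-library decoder expects.
    padded = body + "=" * (-len(body) % 4)
    return base64.urlsafe_b64decode(padded)


def b64url_encode(data: bytes) -> str:
    """Encode to base64url without padding (the form JWS uses)."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")
```

With this sketch, `b64url_decode("-_8")` and `b64url_decode("-_8=")` yield the same two bytes, so a receiver does not need to know which variant the sender emitted.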
Received on Friday, 24 April 2020 19:56:03 UTC