Re: New Work Item Proposal: Data Integrity BIP340 Cryptosuite

On Sun, Aug 3, 2025 at 11:19 AM Will Abramson <wip.abramson@gmail.com>
wrote:

> Jaromil, I would also like to know more about your perspective. To me,
> Multikey is just a way of encapsulating the bytes representing a key so
> they can be resolved and consistently understood within DID documents
> across multiple implementations. It is about creating an interface between
> your internal system and the external world. As soon as you ingest that
> key, you can transform it to whatever format or representation that suits
> your internal system best.
>

I agree with much of your framing — that Multikey encapsulates key bytes to
enable consistent resolution across DID methods and implementations. It
creates a standard boundary between external representation and internal
processing. Once a key is ingested, implementers can transform it into
whatever internal format suits their system.

However, I believe that both Multikey and Multihash suffer from a
significant architectural flaw: they bind type to encoding, violating
proper layering principles.

As part of our work at Blockchain Commons, particularly on the Gordian
architecture, we explored a wide range of encoding strategies. One relevant
reference is our paper on binary URI compatibility (
https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2020-003-uri-binary-compatibility.md),
which highlights these concerns — especially in the context of constrained
encoding environments like QR codes.

A core lesson from that research is that entangling semantic type with
encoding format introduces long-term complexity. It expands attack
surfaces, increases implementation errors, and makes future-proofing much
harder.

Bitcoin's experience offers a clear example:

In the original Base58Check format, a version byte (e.g., 0x00 for P2PKH or
0x05 for P2SH) is prepended to the payload. This version byte influences
the first character of the resulting address (1 or 3, respectively),
leading many to call it a “prefix.” But in truth, it's an encoding artifact
— not an intentional or self-contained prefix.
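To make the "encoding artifact" point concrete, here is a minimal Python
sketch of Base58Check. The function name and the placeholder hash are my own
illustrative choices, not taken from any wallet library. Nothing in the code
ever writes a "1" or a "3" as a prefix; the first character simply falls out
of the base conversion of the version byte plus payload.

```python
import hashlib

BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58check_encode(version: int, payload: bytes) -> str:
    """Encode version byte + payload + 4-byte double-SHA256 checksum in Base58."""
    data = bytes([version]) + payload
    checksum = hashlib.sha256(hashlib.sha256(data).digest()).digest()[:4]
    raw = data + checksum
    # Treat the whole byte string as one big integer and convert it to base 58.
    number = int.from_bytes(raw, "big")
    digits = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        digits = BASE58_ALPHABET[remainder] + digits
    # Each leading zero byte is written as a literal "1".
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + digits


# Placeholder 20-byte hash (assumption); a real address uses a key or script hash.
placeholder_hash = bytes.fromhex("11" * 20)

p2pkh = base58check_encode(0x00, placeholder_hash)
p2sh = base58check_encode(0x05, placeholder_hash)
print(p2pkh[0], p2sh[0])  # prints: 1 3
```

The type is only recoverable by decoding the whole string and checking the
checksum, so the visible first character is a side effect rather than a field.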

Later, Bech32 introduced a different strategy: the “human-readable part”
(“bc” on mainnet, followed by the separator “1”). This further embedded
semantics into the encoding format itself. While useful in some ways, this
approach led to a proliferation of address formats and eventually the need
for descriptors to properly communicate address meaning. Today, an address's
type is completely decoupled from the key it derives from.
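As a rough illustration of how much meaning sits inside a Bech32 string, the
sketch below splits a SegWit address into its human-readable part and witness
version. The helper name is hypothetical, and it deliberately skips checksum
verification, so it is not a validator. The sample address is the P2WPKH
example from BIP 173.

```python
BECH32_CHARSET = "qpzry9x8gf2tvdw0s3jn54khce6mua7l"


def split_bech32(address: str) -> tuple[str, int]:
    """Return (human-readable part, witness version); the checksum is NOT verified."""
    # The data charset has no "1", so the last "1" is always the separator.
    separator = address.rfind("1")
    hrp = address[:separator]
    data_part = address[separator + 1:]
    # The first data character carries the witness version.
    witness_version = BECH32_CHARSET.index(data_part[0])
    return hrp, witness_version


print(split_bech32("bc1qw508d6qejxtdg4y5r3zarvary0c5xw7kv8f3t4"))  # prints: ('bc', 0)
```

Network and script version are both carried by characters of the string
itself, which is exactly the coupling described above.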

These challenges all stem from an initial decision to embed type
information into the encoding, a move that seemed efficient early on but
created serious long-term problems with maintenance, compatibility, and
exposure to attacks.

Multikey replicates this same pattern. Like Bitcoin addresses, it embeds
type information (such as key algorithm) directly into the encoded
representation. This tight coupling between type and encoding may be
convenient for some use cases, but it risks repeating the same layering and
extensibility problems we've already encountered.
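For readers less familiar with the mechanics, here is a small Python sketch
of how a Multikey value is assembled, assuming the usual base58btc Multibase
form. The key bytes are placeholders rather than real keys, and the helper
names are mine. The multicodec values used are 0xed for ed25519-pub and 0xe7
for secp256k1-pub, each written as a two-byte unsigned varint.

```python
BASE58_ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def base58btc(raw: bytes) -> str:
    """Plain Base58 with the Bitcoin alphabet, no checksum."""
    number = int.from_bytes(raw, "big")
    digits = ""
    while number > 0:
        number, remainder = divmod(number, 58)
        digits = BASE58_ALPHABET[remainder] + digits
    leading_zeros = len(raw) - len(raw.lstrip(b"\x00"))
    return "1" * leading_zeros + digits


# Multicodec codes as unsigned varints.
ED25519_PUB = bytes([0xED, 0x01])
SECP256K1_PUB = bytes([0xE7, 0x01])


def multikey(codec_prefix: bytes, public_key: bytes) -> str:
    # "z" is the Multibase code for base58btc; the type prefix is encoded
    # together with the key bytes in one base conversion.
    return "z" + base58btc(codec_prefix + public_key)


# Placeholder bytes with the right lengths (assumption, not valid keys).
placeholder_ed25519 = bytes(range(32))
placeholder_secp256k1 = b"\x02" + bytes(range(32))

print(multikey(ED25519_PUB, placeholder_ed25519)[:4])      # prints: z6Mk
print(multikey(SECP256K1_PUB, placeholder_secp256k1)[:4])  # prints: zQ3s
```

As with the Bitcoin version byte, the familiar "z6Mk" is not a field that can
be read or replaced on its own. It is what the type bytes happen to look like
after being base-converted together with the key.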

At Blockchain Commons, we’ve taken a different approach in our UR standard (
https://github.com/BlockchainCommons/Research/blob/master/papers/bcr-2020-005-ur.md),
now adopted by 18+ wallets for animated QRs. We fully separate type from
encoding. This gives us the flexibility to choose optimal encodings for
different contexts — whether for constrained QR character sets (which
perform best with about 36 characters), human readability, or compact
binary transport — without hardcoding semantics into the serialization
layer.
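The layering idea can be sketched in a few lines of Python. To be clear, this
is NOT the UR wire format (UR uses CBOR payloads and Bytewords); the class,
the function names, and the "type/encoding/body" text layout are all
hypothetical, chosen only to show a type tag that stays fixed while the byte
encoding is swapped per context.

```python
import base64
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedPayload:
    type_name: str  # semantic layer: what the bytes mean
    data: bytes     # raw bytes, with no type marker mixed in


def _b32_decode(text: str) -> bytes:
    return base64.b32decode(text + "=" * (-len(text) % 8))


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


# Encoding layer: interchangeable, and unaware of the type.
ENCODINGS = {
    "hex": (lambda b: b.hex(), bytes.fromhex),
    "base32": (lambda b: base64.b32encode(b).decode().rstrip("="), _b32_decode),
    "base64url": (
        lambda b: base64.urlsafe_b64encode(b).decode().rstrip("="),
        _b64url_decode,
    ),
}


def render(item: TypedPayload, encoding: str) -> str:
    encode, _ = ENCODINGS[encoding]
    return f"{item.type_name}/{encoding}/{encode(item.data)}"


def parse(text: str) -> TypedPayload:
    type_name, encoding, body = text.split("/", 2)
    _, decode = ENCODINGS[encoding]
    return TypedPayload(type_name, decode(body))


item = TypedPayload("example-key", bytes(range(32)))
for name in ENCODINGS:
    # The same typed value survives a round trip through every encoding.
    assert parse(render(item, name)) == item
```

Adding a new encoding here touches only the ENCODINGS table, and adding a new
type touches only the type name; neither change affects the other layer.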

I’d love to hear others’ thoughts on how we might better preserve layering
and future-proof our encoding strategies in these specs.

-- Christopher Allen

Received on Sunday, 3 August 2025 18:28:45 UTC