Re: Longfellow ZK (google-zk)

On Sat, Jun 28, 2025 at 6:46 AM Matteo Frigo <matteof@google.com> wrote:
> As you have surmised, the circuit verifies two ECDSA signatures, one from the issuer on the hash of the MDOC, and the other from the secure element on the transcript of the session between the relying party and the holder (basically a nonce provided by the RP and some other stuff).

Great, thank you for confirming that, Matteo. So we have the ability
to verify cryptographic counter-signatures in zero knowledge, which is
a base-level requirement for many use cases.
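
To check my understanding, here is a plaintext (non-ZK) analogue of
the two checks you describe, written in Python with the cryptography
library; the circuit proves that these same verifications succeed
without revealing the MDOC. The key handling and byte layouts here
are illustrative only, not Longfellow's actual interfaces:

    # Plaintext analogue of the two ECDSA checks the circuit performs.
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import ec

    def checks_pass(issuer_pub: ec.EllipticCurvePublicKey,
                    device_pub: ec.EllipticCurvePublicKey,
                    mdoc_bytes: bytes, issuer_sig: bytes,
                    session_transcript: bytes, device_sig: bytes) -> bool:
        try:
            # 1. Issuer's ECDSA signature over (the hash of) the MDOC.
            issuer_pub.verify(issuer_sig, mdoc_bytes,
                              ec.ECDSA(hashes.SHA256()))
            # 2. Secure element's signature over the RP<->holder transcript.
            device_pub.verify(device_sig, session_transcript,
                              ec.ECDSA(hashes.SHA256()))
            return True
        except InvalidSignature:
            return False

    # Tiny usage with freshly generated keys, just to exercise the checks:
    issuer = ec.generate_private_key(ec.SECP256R1())
    device = ec.generate_private_key(ec.SECP256R1())
    mdoc, transcript = b"<mdoc bytes>", b"<RP nonce + session data>"
    print(checks_pass(issuer.public_key(), device.public_key(),
                      mdoc, issuer.sign(mdoc, ec.ECDSA(hashes.SHA256())),
                      transcript,
                      device.sign(transcript, ec.ECDSA(hashes.SHA256()))))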

> The way things are currently set up is this.  Pick SZ and N in advance, say SZ=2048 and N=3.  We generate a circuit that takes a MDOC up to SZ bytes long and discloses up to N attributes.

Ah, ok, so that means these circuits have to be pre-compiled and
can't be generated at runtime. One of the benefits of using W3C VCs
with BBS is that we don't have to pre-compile anything; we can
dynamically generate all of the material necessary to verify the BBS
derived proof. The downside is that BBS, like ECDSA, is not
post-quantum resistant. The benefit of LongfellowZK is that it can be
made post-quantum resistant, though the very large signature sizes
pose a significant challenge, IIUC?
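
In other words, if I understand correctly, the prover has to pick
from a fixed menu of circuits compiled ahead of time for specific
(SZ, N) pairs. A minimal sketch of that selection model, with made-up
circuit names and parameters:

    # Hypothetical registry of precompiled circuits, keyed by (SZ, N).
    PRECOMPILED_CIRCUITS = {
        # (max MDOC bytes, max disclosed attributes) -> circuit file
        (2048, 3): "circuit-sha256-sz2048-n3.bin",
        (4096, 8): "circuit-sha256-sz4096-n8.bin",
    }

    def select_circuit(mdoc_size: int, num_attributes: int) -> str:
        """Pick the smallest precompiled circuit that fits the request."""
        candidates = [(sz, n) for (sz, n) in PRECOMPILED_CIRCUITS
                      if mdoc_size <= sz and num_attributes <= n]
        if not candidates:
            raise ValueError("no precompiled circuit fits; a new one "
                             "would have to be compiled offline")
        return PRECOMPILED_CIRCUITS[min(candidates)]

    print(select_circuit(mdoc_size=1500, num_attributes=2))

With BBS, by contrast, there is no equivalent of this lookup step; the
derived proof is generated on the fly for whatever the holder chooses
to disclose.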

I will note that the SZ and N values chosen to date are quite small.
They're workable, certainly, but it sounds like this technology
doesn't scale well to VCs or MDOCs that are larger than a few KB...
and almost every VC and MDOC is much larger than that. We can jump
through hoops to make the sizes small, and there is active work
(vc-barcodes, CBOR-LD) that can compress VCs containing 10-15
attributes from a few KB down to a few hundred bytes.

All this to say, LongfellowZK sounds like it requires significant
pre-processing in order to apply the technology more generally, and
even if we do that, we're back to where we were with the previous
generation of unlinkable solutions (CL-Signatures), where this sort of
cryptographic material needed to be pre-compiled.

This is a step backwards, IMHO, but not an unworkable one. One way we
might mitigate this challenge is to have the issuer sign the
cryptographic circuit, so that we don't have to agree on industry-wide
cryptographic circuits (which would have severe scaling difficulties).
A rough sketch of what I mean follows.
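
To be concrete, here is a hedged sketch of that mitigation (entirely
my own construction, not anything in Longfellow today): the issuer
publishes a digest of the compiled circuit and a signature over it,
and the verifier checks both before using the circuit. The key and
function names are illustrative:

    # Hedged sketch: issuer-signed circuit digests (hypothetical scheme).
    import hashlib
    from cryptography.exceptions import InvalidSignature
    from cryptography.hazmat.primitives.asymmetric.ed25519 import (
        Ed25519PrivateKey,
    )

    issuer_key = Ed25519PrivateKey.generate()

    def sign_circuit(circuit_bytes: bytes) -> tuple[bytes, bytes]:
        """Issuer side: hash the compiled circuit and sign the digest."""
        digest = hashlib.sha256(circuit_bytes).digest()
        return digest, issuer_key.sign(digest)

    def verify_circuit(circuit_bytes: bytes, digest: bytes,
                       signature: bytes) -> bool:
        """Verifier side: recompute the digest, check the issuer's signature."""
        if hashlib.sha256(circuit_bytes).digest() != digest:
            return False
        try:
            issuer_key.public_key().verify(signature, digest)
            return True
        except InvalidSignature:
            return False

    circuit = b"<compiled circuit bytes>"
    d, sig = sign_circuit(circuit)
    print(verify_circuit(circuit, d, sig))   # True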

We have VCs/mdocs that embed images, like profile photos, and end up
being ~100KB-200KB in size, with about 35 attributes per credential.
Is there an easy way for us to estimate how long evaluating that sort
of circuit might take? I did see you note that we might have to create
different circuits: a fast one with an SZ of a few KB and an N of 2-3
attributes, and a slower one with up to 35 attributes.

> Most of the proving cost is in the SZ term, because we have to hash SZ bytes.  The N term is smaller, say something like 30ms for each extra attribute.

Ok, so we're talking roughly 1 extra second of compute for 35
attributes. How much extra compute are we talking about for a 100KB or
200KB input? I'm not going to hold you to these numbers, of course;
we're just ballparking it right now (rough back-of-envelope below).
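
Here's the back-of-envelope I'm working from. The ~30ms-per-attribute
figure is yours; the assumption that the hashing term scales linearly
with the number of SHA-256 blocks is mine, and I'm only computing a
relative factor rather than an absolute prover time:

    # Back-of-envelope for the cost model described above.
    MS_PER_EXTRA_ATTRIBUTE = 30      # from Matteo's message
    SHA256_BLOCK_BYTES = 64

    def attribute_term_ms(num_attributes: int) -> float:
        return num_attributes * MS_PER_EXTRA_ATTRIBUTE

    def relative_hashing_cost(sz_bytes: int, baseline_bytes: int = 2048) -> float:
        """How much more hashing vs. a 2KB MDOC, assuming linear scaling."""
        blocks = -(-sz_bytes // SHA256_BLOCK_BYTES)            # ceil division
        baseline_blocks = -(-baseline_bytes // SHA256_BLOCK_BYTES)
        return blocks / baseline_blocks

    print(attribute_term_ms(35))               # ~1050 ms for 35 attributes
    print(relative_hashing_cost(100 * 1024))   # ~50x the hashing of a 2KB MDOC
    print(relative_hashing_cost(200 * 1024))   # ~100x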

> We do equality-check only because we lack imagination.  If use cases arise that require, e.g., less-than comparisons, we can easily add those at negligible cost.  I think the fullness-of-time circuit will look like N "arithmetic-logic units" that can be configured to compute a few common operations, and N giant multiplexers that route the MDOC bytes to the input of the ALU.  Today the ALU is just an equality check.

Understood... and the analogy of these functioning as ALUs has
greatly helped me (very generally) understand how these cryptographic
circuits operate.
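
For my own benefit, here is how I'm picturing it, expressed as
ordinary Python rather than as an arithmetic circuit; the names and
structure are mine and purely conceptual:

    # Conceptual "ALU + multiplexer" picture: N slots, each routing some
    # MDOC bytes into a configurable operation (today: equality only).
    from typing import Callable

    ALU_OPS: dict[str, Callable[[bytes, bytes], bool]] = {
        "eq": lambda a, b: a == b,   # today's circuit: equality check
        "lt": lambda a, b: a < b,    # hypothetical future op (less-than)
    }

    def alu_slot(mdoc: bytes, offset: int, length: int,
                 op: str, operand: bytes) -> bool:
        """One slot: a multiplexer selects mdoc bytes, an ALU compares them."""
        selected = mdoc[offset:offset + length]   # the giant multiplexer
        return ALU_OPS[op](selected, operand)     # the configurable ALU

    # N disclosed attributes -> N (mux, ALU) slots evaluated in the circuit.
    mdoc = b"...age_over_21=true;country=US;..."
    print(alu_slot(mdoc, mdoc.find(b"true"), 4, "eq", b"true"))   # True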

> Agree.  That's the next thing to work on---how to verify a post-quantum signature in a circuit.  This is a hard problem.

Can you elaborate on why it's a hard problem? Does it have to do with
the complexity of the algorithms, the size of the inputs, both, or
something else entirely?

> I note that, from the ZKP perspective, SD is redundant because it accomplishes the same goal as ZKP in a less perfect way.  So it's better for us to start from plain JWT rather than having to compute a redundant hash in SD-JWT.  MDOC is SD-only, so it is suboptimal from our perspective.

Yes, my read is the same. SD-JWT and mdoc are suboptimal because they
force the data encoding layer into the cryptographic layer and don't
allow for a proper separation of concerns or architectural layering.
They are specialized data formats that didn't take zero-knowledge
cryptography into consideration, so you have to contort yourself and
add hacks to get them to work with cryptographic circuits. I'll argue
that even the plain JWT approach you're looking at is probably
inefficient as well (all of that base encoding/decoding, nested
objects, etc. adds up).

This is why we avoided using them for the Verifiable Credential +
Data Integrity work... we wanted more control over the inputs to the
cryptographic layer, so we created an abstraction that allows the data
to be transformed on its way into the cryptographic layer. That lets
us get to more properly architected, and therefore higher-performance,
solutions in both data size and compute. The VC Data Integrity work
allows us to transform (compile) the input credential into the ideal
binary format for the cryptographic circuit, so we don't have
unnecessary and expensive logic embedded in the circuit (base
decoding, JSON parsing, multiple JSON levels, string processing,
etc.). A rough sketch of what I mean by that transform is below.
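
To illustrate the idea (this is my own sketch, not an existing Data
Integrity cryptosuite): flatten the credential's claims into a flat
byte blob plus an offset table, so the circuit receives plain bytes
and never has to do base decoding or JSON parsing:

    # Hedged sketch: compile a credential into a flat binary layout.
    import json
    import struct

    def compile_credential(credential: dict) -> bytes:
        """Flatten top-level claims into [count][offset table][name=value]."""
        entries = [f"{k}={json.dumps(v)}".encode()
                   for k, v in sorted(credential.items())]
        header = struct.pack("<I", len(entries))
        offsets, body, cursor = b"", b"", 0
        for e in entries:
            offsets += struct.pack("<II", cursor, len(e))   # (offset, length)
            body += e
            cursor += len(e)
        return header + offsets + body

    blob = compile_credential({"birthDate": "1990-01-01",
                               "issuer": "did:example:123"})
    print(len(blob))   # small, already in a circuit-friendly byte layout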

> > Circuit compression.
>
> As you have surmised, there is a lot of redundancy in the circuit.
> I am sure one can do better but 300KB is in the noise.

Well, it's in the noise for use cases that have high-bandwidth
communication channels. We have other use cases where our byte budget
for transmission is below 2KB, which I know doesn't work at all for
the LongfellowZK stuff -- even if you can move the cryptographic
circuit transmission to a high-bandwidth channel, the proofs are still
quite large and not compressible.

> If the whole circuit were just repeating SHA256 36 times we would be done, but things get much more complicated when we have to select the N attributes, where now the attributes are misaligned with respect to the SHA256 vector dimension.  This is kind-of analogue to AVX programming, where you have a vector unit that performs 8 additions but you still need to perform scalar operations.  Any attempts to exploit vector operations leads to complexity that we are unwilling to impose to the world at this stage.  We would also need some circuit-description language and compiler support which we don't currently have.

Interesting... I wonder if this "alignment" step could be done as a
pre-processing step in the VC Data Integrity approach. Unlike SD-JWT
and MDOC, with VCs and Data Integrity we have the flexibility of
transforming the input to the cryptographic circuit, so if we could
rearrange the input such that it's naturally aligned with the
circuit's SHA256 "vector" dimension, maybe that would provide some
benefit? I'm really far beyond my area of expertise here and am just
using your ALU and AVX/SIMD analogies to guess at what we might do to
make the input to the cryptographic circuit more efficient; my guess
is sketched below.
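
Concretely, the kind of pre-processing I have in mind (again, purely
a guess based on your analogy) would pad each attribute so it starts
and ends on a 64-byte SHA-256 block boundary, so that selecting an
attribute never straddles two blocks:

    # Guess at an "alignment" pre-processing step: block-align attributes.
    SHA256_BLOCK = 64

    def align_attributes(attrs: list[bytes]) -> tuple[bytes, list[tuple[int, int]]]:
        """Return padded input plus (offset, length) per attribute."""
        out, positions = bytearray(), []
        for a in attrs:
            start = len(out)
            padded = -(-len(a) // SHA256_BLOCK) * SHA256_BLOCK  # round up
            out += a.ljust(padded, b"\x00")
            positions.append((start, len(a)))
        return bytes(out), positions

    data, where = align_attributes([b"givenName=Jane", b"country=US"])
    print(len(data), where)   # 128 bytes; each attribute starts on a block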

Thank you for your response, Matteo -- it cleared up even more for
me. As you know, we're actively incubating approaches that use
cryptographic circuits with VCs and Data Integrity, and we're trying
to determine what the optimal input to a cryptographic circuit would
be and then build the solution around that, rather than the current
approach of starting with inefficient input formats (SD-JWT and MDOC)
and then twisting ourselves into pretzels to get something that only
works for a small subset of the use cases.

-- manu

-- 
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
https://www.digitalbazaar.com/

Received on Saturday, 5 July 2025 15:27:37 UTC