- From: Orie Steele <orie@transmute.industries>
- Date: Sun, 17 Sep 2023 11:32:56 -0500
- To: Manu Sporny <msporny@digitalbazaar.com>
- Cc: Markus Sabadello <markus@danubetech.com>, W3C VC Working Group <public-vc-wg@w3.org>, Wayne Cutler <wcutler@gsma.com>, Greg Bernstein <gregb@grotto-networking.com>
- Message-ID: <CAN8C-_JESNKwOYzfSez6feapKuzn8cuFg0VsnXUKoNDqxoe-Dg@mail.gmail.com>
I've shared criticism of these advances previously, and I evaluated them when they were first published, generalizing di-ecdsa-sd to di-jws*-sd, which made it possible to use any IANA-registered algorithm with the same approach (including RSA): https://github.com/transmute-industries/vc-di-sd#examples (3 months ago, so it's reasonable that folks forgot I have knowledge and awareness of the topic). The approach was interesting to explore; my conclusion was that it's not ready, and not a good idea to standardize, and I said so on this list. I've yet to see any of the new "di-sd primitive" work applied directly to BBS, but I know Greg is working on it... AFAIK, he's the current expert on this approach.

See below regarding why I believe it's not sufficient.

Unlinkability requires that verifiers have no way to correlate claims to the same holder. Data Integrity with blank node blinding by itself does not achieve this:

https://github.com/digitalbazaar/di-sd-primitives/blob/main/test/labelReplacementCanonize.spec.js
https://github.com/digitalbazaar/ecdsa-sd-2023-cryptosuite/commit/73c87b7470bec6c569adc2cacf2c66bcf5353055
https://dlongley.github.io/decanonicalization/

(Giving blinded names to blank nodes does not change this picture... it makes it worse, by adding the cost of HMACing each blank node, and the picture gets worse still if you upgrade the hash function from sha-256 to sha-384, or better, to comply with https://media.defense.gov/2022/Sep/07/2003071834/-1/-1/0/CSA_CNSA_2.0_ALGORITHMS_.PDF.)

Consider a verifier who learns:

_:uXqefD0KC4zrzEbFJhvdhYTGzRYW3RhjcQvfkpkWqDpc <urn:example:dateOfBirth> "01-01-1990" .

and another verifier who learns:

_:uXqefD0KC4zrzEbFJhvdhYTGzRYW3RhjcQvfkpkWqDpc <urn:example:documentIdentifier> "T21387yc328c7y32h23f23" .

When they collude (assuming they both understand RDF), each learns that the other saw "_:uXqefD0KC4zrzEbFJhvdhYTGzRYW3RhjcQvfkpkWqDpc" (see the sketch below). Later they might join that value to a DID, GTIN, SGTIN, GLN, vLEI or SSN... or a threat actor who dumps both their databases might do the same thing.

If the assertion is that VC-DI-BBS uses batches, like SD-JWT, to achieve unlinkability, that probably works, but that's not the same thing as having a single credential representation that can be presented over and over again without linkability. The structure of the claims and the envelope interfere with the unlinkability objective. If you only see claims as "things produced from RDF canonicalization", you are trying to build a race car out of wood... there is only so much that wood can do, and there are much better materials available in the year 2023... Then again, some designs are beautiful even when they don't make tremendous sense: https://www.designboom.com/technology/worlds-lowest-car-fiat-panda-carmagheddon-cut-half-road-legal-vehicle-07-03-2023/

I don't have evidence that anyone is actually using the RDF triples, and even if the verifier does process them, this transformation wastes CO2 and time on every sign and verify, which is massively amplified when the standard is adopted in high-volume use cases (imagine every single package, shipment, or checkpoint entering or exiting a country). These RDF transformations also increase the attack surface and reduce the pool of off-the-shelf implementations and vendors that can support scaling out the technology and ensuring it delivers its value to users.
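To make the collusion example above concrete, here is a minimal sketch (plain TypeScript, no library APIs assumed; the quads are the two from the example, and the join is nothing more than string matching on the disclosed N-Quads):

```ts
// Minimal sketch: two verifiers' disclosed N-Quads, joined on the shared
// blank node label. Any repeated label works the same way.
const verifierA: string[] = [
  '_:uXqefD0KC4zrzEbFJhvdhYTGzRYW3RhjcQvfkpkWqDpc <urn:example:dateOfBirth> "01-01-1990" .',
];
const verifierB: string[] = [
  '_:uXqefD0KC4zrzEbFJhvdhYTGzRYW3RhjcQvfkpkWqDpc <urn:example:documentIdentifier> "T21387yc328c7y32h23f23" .',
];

// Pull every blank node label ("_:u...") out of a set of N-Quads.
const labelsOf = (quads: string[]): Set<string> =>
  new Set(quads.flatMap((q) => q.match(/_:[A-Za-z0-9]+/g) ?? []));

const labelsB = labelsOf(verifierB);

// Every label both verifiers have seen is a correlation handle.
for (const label of labelsOf(verifierA)) {
  if (!labelsB.has(label)) continue;
  const joined = [...verifierA, ...verifierB].filter((q) => q.includes(label));
  console.log(`correlated subject ${label}`);
  console.log(joined.join('\n'));
}
```

No cryptography has to be broken and no RDF library is needed; the blinded label is itself the linkable value.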
When considering critical supply chains, it's not just the wasted CO2, energy, and time that come along with these design choices; it's also the attack surface for threat actors, the maintenance and complexity cost for implementers, and the market for experts and support for the bundled technology. It's the equivalent of saying you like keeping your house at 90 degrees in the winter and 70 degrees in the summer, instead of just putting on or taking off a sweater... You know the option to use less is there, but other values (the JSON-LD information model, or personal comfort) come before your commitment to reducing energy use and complexity... With scale, small inefficiencies become large cost centers. Advocacy is the key to countering misinformation and influence operations... but it's a thankless and tiring job.

The essence of W3C debate is the priority of constituencies, and I feel strongly that the VCWG is failing to tackle this issue with the severity it deserves:

https://www.w3.org/TR/design-principles/#priority-of-constituencies
https://w3c.github.io/sustyweb/#benefits-27

Some W3C members might always recommend mandating unsustainable approaches in the identity and credentials spaces (see also Bitcoin-based identity methods). The best way to tackle this is with sustained technical debate, evidence-based reasoning, and rigorous tests that show energy and time used, power consumption, storage costs, privacy / censorship capabilities, etc., where the inputs to the test reflect production use cases: documents with hundreds or tens of thousands of line items in the supply chain context, and maybe trivially small credentials in the context of personal identity documents like driver's licenses or diplomas.

I've not seen or written tests proving unlinkability in BBS, so I don't (yet) believe it can be achieved with the currently proposed tooling... and adding RDF to a multi-message zero-knowledge selective disclosure scheme, when it does nothing other than provide application-layer transforms that are brittle to the input and implementation... is not a good idea in my opinion. If we see such a test with DI-BBS, anyone will instantly be able to improve performance by dropping the application-layer processing at sign and verify time (even for the small credentials)... which is why I keep saying RDF transforms are not necessary or helpful to achieve unlinkability and selective disclosure. The benefit would be similar to what you see here: https://dlongley.github.io/decanonicalization when switching from signing the canonical (expensive to compute) form of the input to just signing the bytes (as the input). SD-JWT is also bad about this, because it MUST traverse the claims maps to encode disclosable claims... it's just less expensive, simpler-to-implement processing, and it does not cross media type boundaries.

Obligatory shout out to my friend Dan at Chainguard: https://twitter.com/lorenc_dan/status/1489426543516471301 They are doing excellent work securing software supply chains:

- https://www.dhs.gov/science-and-technology/news/2023/04/27/st-forms-new-startup-cohort-strengthen-software-supply-chain-visibility-tools

Also maybe review the notes from XML Signatures in 1999! Most of the commentary still applies to the approach being taken today with Data Integrity Proofs; it's just JSON-LD instead of XML.

- https://www.w3.org/Signature/Minutes/DC-Minutes/Overview.html

Regards,

OS
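P.S. To make the "just sign the bytes" comparison concrete, here is a rough sketch of the two code paths, using only node:crypto. The canonicalization step is assumed to have already produced the N-Quads (that, plus the per-blank-node HMAC and per-quad hashing, is exactly the cost in question), and the helper names and the ECDSA P-256 choice are mine, not taken from any of the cited cryptosuites:

```ts
import { createHash, createHmac, generateKeyPairSync, sign } from 'node:crypto';

const { privateKey } = generateKeyPairSync('ec', { namedCurve: 'P-256' });

// Path 1 (Data Integrity style, simplified): HMAC-relabel each blank node in
// the already-canonicalized N-Quads, hash each quad, combine, then sign.
function signCanonicalized(quads: string[], hmacKey: Buffer): Buffer {
  const relabeled = quads.map((q) =>
    q.replace(/_:([A-Za-z0-9]+)/g, (_m: string, label: string) =>
      '_:u' + createHmac('sha256', hmacKey).update(label).digest('base64url')
    )
  );
  const quadDigests = relabeled
    .map((q) => createHash('sha256').update(q).digest('hex'))
    .join('\n');
  return sign('sha256', Buffer.from(quadDigests), privateKey);
}

// Path 2 (envelope style): sign the credential bytes as delivered, with no
// canonicalization, no blank node processing, and no per-quad hashing.
function signBytes(credentialBytes: Uint8Array): Buffer {
  return sign('sha256', credentialBytes, privateKey);
}
```

Benchmarking those two paths against production-sized inputs (documents with thousands of line items) is the kind of test I'd like to see the group run; the decanonicalization page linked above gives a sense of the gap.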
Received on Sunday, 17 September 2023 16:33:14 UTC