Re: JSONWebSignature2020 vs JcsEd25519Signature2022

On Fri, Jan 27, 2023 at 9:50 PM Orie Steele <orie@transmute.industries> wrote:
> Indeed, it could be a parameter of the crypto suite... We built an experimental merkle proof scheme with parameterized canonicalization a while back:
> Costly in implementation complexity, interoperability and compute time...

Yes, in other words, don't do it (for the reasons stated in the
previous email). :)

> It's true JCS only protects JSON... But when that's all you need, why ship an RDF library in the binary?

JCS canonicalization only covers the JSON use cases; there are more
use cases than just JSON.

RDF canonicalization covers the JSON use cases, and all of the other
ones as well (see below).
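
To make the distinction concrete, here's a minimal sketch contrasting
the two, using the "canonicalize" (RFC 8785 / JCS) and "jsonld"
(jsonld.js) npm packages -- the credential document is illustrative:

    // Contrast JCS (syntax canonicalization) with RDF Dataset
    // Canonicalization (information model canonicalization).
    import canonicalize from 'canonicalize';
    import * as jsonld from 'jsonld';

    const credential = {
      '@context': ['https://www.w3.org/2018/credentials/v1'],
      type: ['VerifiableCredential'],
      issuer: 'did:example:issuer',
      credentialSubject: {id: 'did:example:subject'}
    };

    // JCS sorts keys and normalizes number/string forms; the output
    // is only meaningful to consumers of this exact JSON
    // serialization.
    const jcsString = canonicalize(credential);

    // RDF Dataset Canonicalization canonicalizes the underlying RDF
    // dataset, so any serialization of the same information --
    // JSON-LD, CBOR-LD, N-Quads -- produces the same canonical
    // N-Quads.
    const nquads = await jsonld.canonize(credential, {
      algorithm: 'URDNA2015',
      format: 'application/n-quads'
    });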

> Though I agree that JCS is not needed when application/credential+ld+json is protected with JOSE or COSE.

... and if you don't do any JSON-LD processing or RDF canonicalization
there, you have no idea whether you're signing garbage or not. People
will sign garbage if we allow them to do so; that's the danger with
that path.
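
For example, here's a sketch of the check that catches the garbage,
using jsonld.js -- recent releases have a "safe" mode that errors
instead of silently dropping undefined terms (treat the exact option
as an assumption about the version you're running):

    import * as jsonld from 'jsonld';

    const doc = {
      '@context': ['https://www.w3.org/2018/credentials/v1'],
      type: ['VerifiableCredential'],
      issuer: 'did:example:issuer',
      // Typo: not defined by the context. JCS + JOSE would happily
      // sign it; data-model consumers will never see it.
      credentialSubjct: {id: 'did:example:subject'}
    };

    try {
      // Safe mode raises instead of silently dropping the bad term.
      await jsonld.expand(doc, {safe: true});
    } catch (e) {
      console.error('refusing to sign garbage:', e);
    }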

> Content addressing is most compelling use case for canonicalization...

There are other compelling use cases for information model canonicalization...

> Canonicalization is not a requirement for verifiable credentials, in my opinion...

Perhaps not for your use cases, but RDF canonicalization has been
critical to ours, including the digital age verification system that's
launching nationwide in the US (TruAge). TruAge uses RDF
canonicalization for its digital age tokens, which allowed us to
losslessly convert from a verbose JSON representation (a VC) to a
compact CBOR representation and back again without having to
digitally sign the data twice.

That turned out to be a critical feature, and it was not solved by
JCS. It was the only way we could get a digitally signed VC compressed
into a QR code that could be read by 99%+ of the optical scanners at
point-of-sale systems in the 149,000+ retail stores across the US.
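
The mechanics are simple enough to sketch; a minimal version using
jsonld.js and Node's built-in crypto (Ed25519 keys, key management
elided -- this is illustrative, not the TruAge implementation):

    import * as crypto from 'node:crypto';
    import * as jsonld from 'jsonld';

    // Sign a hash of the canonical RDF dataset, not any one syntax.
    async function signDataset(doc: object,
        privateKey: crypto.KeyObject): Promise<Buffer> {
      const nquads = await jsonld.canonize(doc, {
        algorithm: 'URDNA2015', format: 'application/n-quads'});
      const digest =
        crypto.createHash('sha256').update(nquads).digest();
      return crypto.sign(null, digest, privateKey); // Ed25519
    }

    // Verification re-canonizes whatever serialization arrived --
    // the verbose JSON-LD, or the same data round-tripped through a
    // compact CBOR form -- and checks the same signature.
    async function verifyDataset(doc: object, signature: Buffer,
        publicKey: crypto.KeyObject): Promise<boolean> {
      const nquads = await jsonld.canonize(doc, {
        algorithm: 'URDNA2015', format: 'application/n-quads'});
      const digest =
        crypto.createHash('sha256').update(nquads).digest();
      return crypto.verify(null, digest, publicKey, signature);
    }

    // const {publicKey, privateKey} =
    //   crypto.generateKeyPairSync('ed25519');

Because the signature binds to the canonical N-Quads rather than to
any one serialization, converting between the JSON and CBOR
representations is lossless with respect to the proof.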

There are other use cases, such as:

* Signing the information model instead of the syntax
* Selective hashing and disclosure
* Cryptographic hyperlinking between information models (a sketch
follows this list)
* Syntax-agnostic digital proofs
* Nested digital proofs
* Chained digital proofs
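
As a sketch of the hyperlinking case (the digest encoding and the
embedding shown in the comment are illustrative, not any particular
hashlink specification):

    import * as crypto from 'node:crypto';
    import * as jsonld from 'jsonld';

    // A syntax-agnostic digest of a linked document: because the
    // hash is over the canonical dataset, it matches no matter which
    // serialization of the target a consumer later fetches.
    async function datasetDigest(doc: object): Promise<string> {
      const nquads = await jsonld.canonize(doc, {
        algorithm: 'URDNA2015', format: 'application/n-quads'});
      return crypto.createHash('sha256').update(nquads).digest('hex');
    }

    // The referring document then embeds the digest next to the link:
    // {"relatedCredential": {"id": "https://example.org/vc/123",
    //   "digest": "<hex digest>"}}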

I'll stop there, but for those of you who are new to this
conversation... this is not an apples-to-apples comparison.

> Canonicalization might be a requirement for data integrity proofs, if it is, it will remain a reason they are slower and more vulnerable to supply chain attacks.

Sounds like FUD to me. Citation required.

How much slower? Under what circumstances? Vulnerability to supply
chain attacks -- please back up your assertions.

I'll provide some background on ours:

* One of the systems we are deploying into production easily does 50M
transactions per day, every single day, uses RDF Dataset
Canonicalization, and shows no sign of the performance limitations
you state above. JSON Schema processing and base64 encoding/decoding
take as much of the total CPU budget, if not more, and in both cases
the cost is negligible compared to network transmission speeds.

* We've found that most people who claim that "JSON-LD is slower" or
"RDF canonicalization is hard / expensive" have not implemented or
used these libraries at scale, have misimplemented something obvious,
or are cherry-picking degenerate cases to make their point. So, if
you're going to claim something, please back it up with some data,
like so (as you can see, this "JSON-LD is too slow" nonsense has been
around for a while; a context-caching sketch follows at the end of
this list):

http://manu.sporny.org/2016/json-ld-context-caching/

* The supply chain attacks that we're concerned about have more to do
with the enormous number of libraries used above the cryptography
layer -- the ones that concern us the most are the higher-level
libraries that can exfiltrate data, steal PII, and are not under good
release management upstream. People tend to spend A LOT more time
making sure canonicalization and cryptography libraries are secure
and audited, and far less time on the higher layers of code that
actually handle PII and other in-memory secrets. Focusing on a
well-audited canonicalization or cryptography library while ignoring
the thousands of other packages pulled into a modern software system
is a case of misplaced priorities. Yes, there are trade-offs, but the
decision has more factors than "JCS good, URDNA bad".

All that said, if you don't need it for your use case, don't use it...
but also don't suggest that there aren't other valid use cases that
require the additional feature set, or that there are no trade-offs
when you "do the simple thing".

Engineering is about trade-offs, and we should be conveying what those
trade-offs are and when it's appropriate to make them.

-- manu

-- 
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
News: Digital Bazaar Announces New Case Studies (2021)
https://www.digitalbazaar.com/
