
Re: CBOR-LD for VC

From: Orie Steele <orie@transmute.industries>
Date: Tue, 15 Feb 2022 09:08:22 -0600
Message-ID: <CAN8C-_+HvPmGL5F6p6HUW0o6WpkMaRg3N+i0qFc8nxRm3t8_MA@mail.gmail.com>
To: Manu Sporny <msporny@digitalbazaar.com>
Cc: "W3C Credentials CG (Public List)" <public-credentials@w3.org>
You might also want to read:

Most of the folks who are against "canonicalization" are probably still in
favor of not creating more Shell Injection, SQL Injection or XSS vectors.

See also

> Note that the resulting encoded token is different from the first example
> using io.jwt.encode_sign_raw. The reason is that the io.jwt.encode_sign
> function is using canonicalized formatting for the header and payload
> whereas io.jwt.encode_sign_raw does not change the whitespace of the
> strings passed in. The decoded and parsed JSON values are still the same.
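
To make the quoted point concrete, here is a minimal Python sketch of why whitespace alone changes an encoded token segment. It is not the io.jwt implementation; the payload values and the b64url helper are made up for illustration, and only the unsigned payload segment is shown.

```python
import base64
import json


def b64url(data: bytes) -> str:
    """Base64url-encode without padding, the way JWT segments are encoded."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


# Hypothetical payload, written by hand with extra whitespace.
raw_payload = '{"iss": "alice",  "sub": "bob"}'

# The same payload re-serialized with no insignificant whitespace.
compact_payload = json.dumps(json.loads(raw_payload), separators=(",", ":"))

raw_segment = b64url(raw_payload.encode("utf-8"))
compact_segment = b64url(compact_payload.encode("utf-8"))

# The encoded segments differ, so any signature over them differs too...
segments_differ = raw_segment != compact_segment
# ...even though both strings parse to the same JSON value.
values_match = json.loads(raw_payload) == json.loads(compact_payload)

print(segments_differ, values_match)
```

Anything that signs the segment as-is will therefore produce a different token for each formatting of the same data.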

Canonicalization is a source of complexity, and there are plenty of cases
where that complexity is either "worth it" or "not".

Saying all canonicalization is bad is like saying that washing your hands
after using the bathroom is bad because most of the time you don't get sick.

Building secure systems requires us to look at low probability events and
imagine an attacker making them higher probability events.


On Tue, Feb 15, 2022 at 8:38 AM Manu Sporny <msporny@digitalbazaar.com>
wrote:

> On 2/14/22 12:48 AM, Anders Rundgren wrote:
> > Continuing the CBOR thread but now with dedicated subject line. I'm not
> > much into "LD" but obviously you should be able to create a CBOR-LD.
> For those of you that are not aware, Anders fought bravely at IETF for close
> to a decade to get the JSON Canonicalization Scheme published as an RFC:
> https://datatracker.ietf.org/doc/html/rfc8785
>
> As he mentions elsewhere in the thread, there are people at IETF that are
> strongly against doing any form of data canonicalization (taking input data
> and formatting it into a standard format). Those same people tend to block
> any progress on work of that nature in any IETF WG.
>
> I was shocked (in a good way) when Anders succeeded in publishing RFC8785,
> becoming the first person to get a generalized canonicalization scheme
> through IETF in 20+ years. I've been wanting to sit down with him for years
> to hear the story about how he accomplished that:
> https://datatracker.ietf.org/doc/html/rfc8785
>
> ... and that's for the simplest type of canonicalization scheme.
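
For readers who have not looked at JCS, a rough Python approximation of the idea follows. It is a sketch only, not a conforming RFC 8785 implementation: the canonicalize function and the sample documents are made up for illustration, and the real scheme also pins down number serialization and key ordering in ways that json.dumps only matches for simple inputs.

```python
import hashlib
import json


def canonicalize(value) -> bytes:
    """Approximate JCS: sorted keys, no insignificant whitespace, UTF-8.

    Assumption: NOT a full RFC 8785 implementation. The RFC serializes
    numbers using ECMAScript rules and sorts keys by UTF-16 code units;
    json.dumps agrees with that only for simple inputs such as ASCII keys
    and small integers.
    """
    text = json.dumps(value, sort_keys=True, separators=(",", ":"),
                      ensure_ascii=False)
    return text.encode("utf-8")


# Two spellings of the same data: different key order and whitespace.
doc_a = json.loads('{"b": 2, "a": {"y": true, "x": "hi"}}')
doc_b = json.loads('{ "a": {"x": "hi", "y": true},\n  "b": 2 }')

canonical_a = canonicalize(doc_a)
canonical_b = canonicalize(doc_b)

# Identical canonical bytes mean identical hashes, which is what lets a
# signature be checked no matter how the data was formatted in transit.
digest_a = hashlib.sha256(canonical_a).hexdigest()
digest_b = hashlib.sha256(canonical_b).hexdigest()

print(canonical_a.decode("utf-8"))  # {"a":{"x":"hi","y":true},"b":2}
print(digest_a == digest_b)         # True
```
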
> As many of you know, the Data Integrity work (was: Linked Data Signatures)
> has two canonicalization schemes that are used... JCS (RFC8785) and
> URDNA2015 (RDF Dataset Canonicalization -- which uses JCS for canonicalizing
> JSON data). The W3C is going to pick up standardizing that work in the next
> couple of months (once the charter votes happen).
> > The only real stumbling block I have found is that the "Guardians of
> > consider URLs as type identifiers a bad thing because:
> > - The intention was (and is) that you register application-specific
> >   nnn() tags with IANA
> > - URLs open the possibility of reading CBOR schemas in run-time, which is
> >   a known XML foot-gun
> Yep, that's the same centralized thinking that has been common at IETF for
> well over two decades now. "Innovate through us." being the subtext. While
> it has gotten the Internet to where it is today, it's not a good recipe for
> decentralized innovation (which admittedly has its own pitfalls).
> Everything runs through centralized registries that have gatekeepers that
> stop canonicalization work due to legitimate scars caused by XML
> Canonicalization (which is a very different problem, but it's nearly
> impossible to have a rational conversation about it at IETF).
> > Decentralized URLs as type identifiers are (IMO) a necessity for a lot of
> > systems. Regarding reading schemas in run-time: there will always be
> > people who do not understand how to write secure software but will do it
> > anyway.
> Yep. The solution there has always been simple: "DO NOT load schemas at
> runtime in production systems. Ship software with the schemas locked down."
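
One way to read "ship software with the schemas locked down" is a document loader that resolves URLs only from a bundled table and never touches the network. The sketch below is a guess at that pattern in Python; the URL, the context contents, and the names BUNDLED_DOCUMENTS, UnknownDocumentError and load_document are all hypothetical.

```python
from types import MappingProxyType

# Hypothetical schemas/contexts shipped inside the release.
# MappingProxyType gives a read-only view, so entries cannot be added
# or replaced at runtime through this name.
BUNDLED_DOCUMENTS = MappingProxyType({
    "https://example.org/contexts/v1": {
        "@context": {"name": "https://example.org/vocab#name"},
    },
})


class UnknownDocumentError(Exception):
    """Raised when a URL is not part of the bundled, reviewed set."""


def load_document(url: str) -> dict:
    """Resolve a URL from the bundled table only; never fetch it."""
    try:
        return BUNDLED_DOCUMENTS[url]
    except KeyError:
        raise UnknownDocumentError(
            f"refusing to load unbundled document: {url}"
        ) from None


# A known URL resolves locally; anything else fails closed.
context = load_document("https://example.org/contexts/v1")
try:
    load_document("https://attacker.example/context")
except UnknownDocumentError as err:
    print(err)
```

Failing closed on an unknown URL is the important part: an attacker who controls a type identifier cannot make the verifier fetch anything.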
> > As I wrote in another thread, using COSE signatures (or encryption) is
> > something I wouldn't do. Using COSE public key and algorithm identifiers
> > is, though, perfectly workable.
> I'd be interested in hearing more about why you think this, Anders.
> > Regarding possible COSE-LD signatures I would consider a solution where
> > signatures only protect the actual bytes transferred, and feature the LD
> > part as a hash.  That is, validation of LD canonicalization would be an
> > optional step.
> Now there is a really compelling idea (for those working in CBOR). I wish I
> had more time to chase that idea down. My instinct says that the problem
> with that approach is that you have to fully commit to working in CBOR. The
> canonicalization approach, by contrast, doesn't require you to commit to
> working in JSON-LD or CBOR-LD... you can switch between the two w/o having
> to re-do all of your digital signatures.
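
Anders's suggestion (sign the bytes actually transferred, carry the LD part as a hash, and make checking the canonical form optional) could be sketched roughly as below. Several stand-ins are assumptions made to keep it runnable with the standard library: JSON replaces CBOR, HMAC replaces a real public-key signature, sorted-key JSON replaces LD canonicalization, and the names KEY, ldHash, seal and verify are invented.

```python
import hashlib
import hmac
import json
from typing import Tuple

KEY = b"demo-key"  # hypothetical shared secret; a real system would use a key pair


def canonical_hash(doc: dict) -> str:
    """Stand-in for a hash over the LD canonical form (not URDNA2015)."""
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def seal(doc: dict) -> Tuple[bytes, bytes]:
    """Embed the canonical hash, then sign the exact bytes to be sent."""
    envelope = {"ldHash": canonical_hash(doc), "payload": doc}
    wire = json.dumps(envelope).encode("utf-8")
    tag = hmac.new(KEY, wire, hashlib.sha256).digest()
    return wire, tag


def verify(wire: bytes, tag: bytes, check_ld: bool = False) -> bool:
    """Always check the byte-level signature; check the LD hash on request."""
    expected = hmac.new(KEY, wire, hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        return False
    if check_ld:
        envelope = json.loads(wire)
        return envelope["ldHash"] == canonical_hash(envelope["payload"])
    return True


wire, tag = seal({"name": "Alice", "degree": "BSc"})
print(verify(wire, tag))                 # byte-level check only
print(verify(wire, tag, check_ld=True))  # plus the optional LD hash check
print(verify(wire + b" ", tag))          # any change to the bytes fails
```

The last call shows the trade-off Manu describes: because the signature covers the wire bytes, re-encoding the same data in another format invalidates it.
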
> There is now some work happening on YAML-LD[1], which is going to be able
> to take advantage of all the benefits/specs behind JSON-LD or CBOR-LD once
> it becomes a thing in the world.
>
> This means that people can losslessly convert graph-based data (Verifiable
> Credentials, social networks, supply chain dependencies, etc.) between
> JSON, CBOR, and YAML... including all the digital signatures w/o having to
> re-sign the data. That's the real power of canonicalized forms.
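
A small Python sketch of the "convert without re-signing" property follows. It is an illustration under loud assumptions, not the Data Integrity algorithm: HMAC stands in for a real signature, sorted-key JSON stands in for JCS/URDNA2015, and a binary property list (plistlib, chosen only because it ships with Python) stands in for CBOR or YAML. The credential contents and the names KEY, sign and verify are invented.

```python
import hashlib
import hmac
import json
import plistlib

KEY = b"demo-key"  # hypothetical secret; HMAC stands in for a real signature


def canonical_bytes(doc: dict) -> bytes:
    """Stand-in canonical form: sorted keys, no insignificant whitespace."""
    return json.dumps(doc, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(doc: dict) -> bytes:
    """Sign the canonical form of the data, not any particular encoding."""
    return hmac.new(KEY, canonical_bytes(doc), hashlib.sha256).digest()


def verify(doc: dict, tag: bytes) -> bool:
    return hmac.compare_digest(sign(doc), tag)


credential = {"name": "Alice", "degree": {"type": "BSc", "year": 2020}}
tag = sign(credential)  # signed once

# Ship the same data in two unrelated wire formats.
as_json = json.dumps(credential, indent=2).encode("utf-8")
as_binary = plistlib.dumps(credential, fmt=plistlib.FMT_BINARY)

# Decode each format back to data and check the original signature.
from_json = json.loads(as_json)
from_binary = plistlib.loads(as_binary)

print(as_json != as_binary)      # the bytes on the wire differ
print(verify(from_json, tag))    # the signature still verifies
print(verify(from_binary, tag))  # and again, with no re-signing
```
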
> -- manu
> [1] https://github.com/ietf-wg-httpapi/mediatypes/issues/8
> --
> Manu Sporny - https://www.linkedin.com/in/manusporny/
> Founder/CEO - Digital Bazaar, Inc.
> News: Digital Bazaar Announces New Case Studies (2021)
> https://www.digitalbazaar.com/

Chief Technical Officer

Received on Tuesday, 15 February 2022 15:08:47 UTC
