Re: RDF Dataset Canonicalization - Formal Proof from Anders Rundgren on 2021-03-30 (public-credentials@w3.org from March 2021)

From: Anders Rundgren <anders.rundgren.net@gmail.com>
Date: Tue, 30 Mar 2021 05:03:03 +0200
To: Orie Steele <orie@transmute.industries>
Cc: "W3C Credentials CG (Public List)" <public-credentials@w3.org>
Message-ID: <5f989fd5-d84f-5d3d-96b1-5d87a14a0879@gmail.com>
On 2021-03-29 20:25, Orie Steele wrote:
<snip>
> We should be similarly concerned about RDF Canonicalization.
> 
> There is a cost to it, but it's less so on storage, and more so on computation.
> 
> Alternatives such as JCS have been raised before, but they are not really doing the same thing; JCS is, in a sense, just like JOSE without the 33% bloat of base64url encoding.

The primary motivation behind JCS (RFC 8785) was to enable signatures to be added to JSON objects directly, rather than burying the objects in dedicated signature containers.

Here is a recent proposal to W3C's most recent quest (building a payment application into the browser itself!) which uses JCS in multiple places: https://fido-web-pay.github.io/

It is not clear how you can combine a JS-API delivering unsigned payment data with outgoing signatures without having some kind of canonicalizer.
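
To make the point concrete, here is a minimal sketch of a signature carried inside the JSON object itself. It rests on several assumptions: the function names and the "signature" property are hypothetical, HMAC-SHA256 stands in for a real public-key signature, and json.dumps with sorted keys only approximates RFC 8785 (which additionally requires ECMAScript number serialization and UTF-16 code-unit key ordering).

```python
import hashlib
import hmac
import json


def canonicalize(obj) -> bytes:
    # Approximation of JCS: sorted keys, no insignificant whitespace, UTF-8.
    # NOT a compliant RFC 8785 implementation (number formatting and key
    # ordering rules differ in corner cases).
    text = json.dumps(obj, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def sign(obj: dict, key: bytes) -> dict:
    # Sign everything except the (hypothetical) "signature" property,
    # then return the object with the signature added as a property.
    unsigned = {k: v for k, v in obj.items() if k != "signature"}
    tag = hmac.new(key, canonicalize(unsigned), hashlib.sha256).hexdigest()
    return {**unsigned, "signature": tag}


def verify(obj: dict, key: bytes) -> bool:
    # Recompute the tag over the canonical form of the unsigned properties.
    unsigned = {k: v for k, v in obj.items() if k != "signature"}
    expected = hmac.new(key, canonicalize(unsigned), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, str(obj.get("signature", "")))
```

Because the canonical form is independent of property order and whitespace, the receiver can parse and re-serialize the object and the signature still verifies, which is what lets the data stay plain JSON.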

Regards
Anders
> 
> Ultimately this is a question of the most efficient computation and storage mechanism for semantically unambiguous cryptographically verifiable information.
> 
> I think URDNA2015 does a great job, and is worth formalizing, however I want to peer into the future and ask what is our ideal solution to this problem?
> 
> I dream of a binary replacement for URDNA2015 where semantics and encodings can be retained, while graph representations can be minimized unambiguously.
> 
> perhaps some form of nquads + multicodec + cbor-ld.
> 
> Regards,
> 
> OS
> 
> 
> 
> 
> 
> On Mon, Mar 29, 2021 at 12:16 PM Dave Longley <dlongley@digitalbazaar.com> wrote:
> 
> 
>     Tobias,
> 
>     One idea with the LD signature redaction suite that never took off was
>     to have the issuer generate an HMAC key and use that to generate the
>     salts -- and then give the holder the HMAC key so they can do the same
>     when sharing.
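
A rough sketch of the HMAC-derived salts idea quoted above, under stated assumptions: the quoted text does not say what is fed into the HMAC, so deriving each salt from the statement's index is my own choice for illustration, and the function names are hypothetical.

```python
import hashlib
import hmac


def derive_salt(hmac_key: bytes, index: int) -> bytes:
    # Assumption: the salt for statement number `index` is
    # HMAC-SHA256(key, index as a 4-byte big-endian integer).
    return hmac.new(hmac_key, index.to_bytes(4, "big"), hashlib.sha256).digest()


def derive_salts(hmac_key: bytes, count: int) -> list:
    # Issuer and holder both call this with the same key and get the
    # same salts, so the proof need not carry a long array of salts.
    return [derive_salt(hmac_key, i) for i in range(count)]
```

The design trade-off is that a single key replaces one salt per statement, but anyone holding the key can regenerate every salt, so the key is shared with the holder only, not with verifiers.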
> 
>     On 3/28/21 4:33 PM, Steve Capell wrote:
>      > Hi Tobias
>      >
>      > Good questions - which I’ve forwarded to the Singapore team for an
>      > authoritative answer
>      >
>      > Here’s my non-authoritative attempt
>      > - salts are an array of uuids I think -
>      > see https://edi3.org/specs/edi3-notary/develop/#611-salting-the-data
>      > - signature correlation - not sure but I’d mention that all use cases
>      > for this approach so far are for cross border trade documents where the
>      > subject is a public identifier such as a business number.  The design
>      > intent is that the identity is correlatable.
>      > - we haven’t noticed performance issues of any significance but we are
>      > talking volumes of only a few million per year
>      >
>      > Steven Capell
>      > Mob: 0410 437854
>      >
>      >> On 28 Mar 2021, at 2:53 pm, Tobias Looker <tobias.looker@mattr.global>
>      >> wrote:
>      >>
>      >> 
>      >> > I’m a big fan of this approach, a form of redaction distinct from zk
>      >> forms of selective disclosure.
>      >>
>      >> > There was an attempt to spec one here in the CCG three-four years
>      >> ago, but it died on the vine.
>      >>
>      >> I'm also interested in learning more about this approach. The
>      >> questions I had last time were:
>      >>
>      >> 1. How the salt for each redactable statement would be managed in a
>      >> way that would not leak the abstraction that "Linked Data Proofs" sets
>      >> up. For example would the attached proof block have to have a long
>      >> array of salts?
>      >> 2. Proof sizes, having to have a salt per-statement signed as a part
>      >> of the proof would significantly increase the size of the proofs
>      >> representation.
>      >> 3. Signature correlation, perhaps not important in this scheme, but I
>      >> think the approach would require revealing a fixed signature
>      >> regardless of which parts are redacted from the original proof?
>      >> 4. Performance? Also perhaps a non-issue but if anyone has
>      >> info/benchmarks around how the scheme might scale with the size of the
>      >> data graph signed, that would be great to look at?
>      >>
>      >> Thanks,
>      >> *Tobias Looker*
>      >> Mattr
>      >> +64 (0) 27 378 0461
>      >> tobias.looker@mattr.global
>      >> Mattr website <https://mattr.global>
>      >> Mattr on LinkedIn <https://www.linkedin.com/company/mattrglobal>
>      >> Mattr on Twitter <https://twitter.com/mattrglobal>
>      >> Mattr on Github <https://github.com/mattrglobal>
>      >>
>      >>
>      >> This communication, including any attachments, is confidential. If you
>      >> are not the intended recipient, you should not read it - please
>      >> contact me immediately, destroy it, and do not copy or use any part of
>      >> this communication or disclose anything about it. Thank you. Please
>      >> note that this communication does not designate an information system
>      >> for the purposes of the Electronic Transactions Act 2002.
>      >>
>      >>
>      >> On Sun, Mar 28, 2021 at 3:49 PM Christopher Allen
>      >> <ChristopherA@lifewithalacrity.com> wrote:
>      >>
>      >>     On Sat, Mar 27, 2021 at 7:22 PM Steve Capell
>      >>     <steve.capell@gmail.com> wrote:
>      >>
>      >>         The Singapore government https://www.openattestation.com/ does
>      >>         this already. Version 3 is W3C VC data model compliant.
>      >>
>      >>         Each element is hashed (with salt I think) and then the hash
>      >>         of the hashes is the document hash that is notarised
>      >>
>      >>         The main rationale is selective redaction (because the root
>      >>         hash is unchanged when some clear text is hidden). But I
>      >>         suppose it simplifies canonicalisation too...
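
The scheme described above can be sketched roughly as follows. This is not OpenAttestation's actual algorithm, just an illustration under assumptions: flat string fields, random 16-byte salts, SHA-256, and a root computed over the sorted list of field hashes. All names are hypothetical.

```python
import hashlib
import json
import os


def leaf_hash(salt_hex: str, key: str, value) -> str:
    # Hash one salted field; a JSON array gives an unambiguous encoding.
    encoded = json.dumps([salt_hex, key, value], separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def root_hash(leaf_hashes) -> str:
    # Sorting the hashes makes the root independent of field order,
    # which is where the canonicalisation burden largely disappears.
    return hashlib.sha256("".join(sorted(leaf_hashes)).encode("ascii")).hexdigest()


def build(document: dict):
    # Returns (disclosed fields with salts, all field hashes, root hash).
    disclosed = {k: (os.urandom(16).hex(), v) for k, v in document.items()}
    hashes = [leaf_hash(salt, k, v) for k, (salt, v) in disclosed.items()]
    return disclosed, hashes, root_hash(hashes)


def redact(disclosed: dict, key: str) -> dict:
    # Drop the clear text and salt; the field's hash stays in the hash list.
    return {k: sv for k, sv in disclosed.items() if k != key}


def verify(disclosed: dict, hashes: list, root: str) -> bool:
    # Each disclosed field must match a listed hash, and the list must
    # reproduce the notarised root.
    fields_ok = all(leaf_hash(salt, k, v) in hashes for k, (salt, v) in disclosed.items())
    return fields_ok and root_hash(hashes) == root
```

Redacting a field removes its value and salt but leaves its hash in the list, so the root, and therefore whatever signature or notarisation covers the root, is unchanged.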
>      >>
>      >>
>      >>     I’m a big fan of this approach, a form of redaction distinct from
>      >>     zk forms of selective disclosure.
>      >>
>      >>     There was an attempt to spec one here in the CCG three-four years
>      >>     ago, but it died on the vine.
>      >>
>      >>     I’d be interested in seeing this spec & implementation. Any links?
>      >>
>      >>     — Christopher Allen [via iPhone]
>      >>
> 
> 
>     -- 
>     Dave Longley
>     CTO
>     Digital Bazaar, Inc.
> 
> 
> 
> -- 
> *ORIE STEELE*
> Chief Technical Officer
> www.transmute.industries
Received on Tuesday, 30 March 2021 03:04:21 UTC