RE: RDF Dataset Canonicalization - Formal Proof

1. How the salt for each redactable statement would be managed in a way that would not leak the abstraction that "Linked Data Proofs" sets up. For example would the attached proof block have to have a long array of salts?
We currently have two versions of OpenAttestation (OA), one W3C-compatible and one not:

  *   Non-W3C-compatible version: salts are not leaked into the proof. The credential is transformed so that it self-contains the data needed to enable data obfuscation. Redacted data is moved into the signature (in the privacy object). The process is documented here: https://github.com/Open-Attestation/adr/blob/master/selective_disclosure.md

  *   W3C-compatible version (still WIP): the salts are added to the proof; however, each salt is removed along with its data when that data is redacted. The process is similar to the one above; only the place where the information is stored differs. By removing the salt when the data is redacted, we ensure no leakage. (We don’t support proof chains / proof sets so far.)
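The mechanism behind both versions can be sketched roughly as follows. This is a minimal illustration, not the actual OpenAttestation code: all function and field names are hypothetical, and SHA-256 is assumed for the hash. Each field gets its own salt, and the target hash is derived from the set of salted field hashes, so a field can be redacted by keeping only its hash and dropping its salt.

```typescript
import { createHash, randomUUID } from "crypto";

type Fields = Record<string, string>;

// One unique salt per field, so identical values hash differently
// across fields and across documents. (Illustrative helper, not the OA API.)
function makeSalts(fields: Fields): Fields {
  const salts: Fields = {};
  for (const key of Object.keys(fields)) salts[key] = randomUUID();
  return salts;
}

// Hash a single field together with its salt. Without the salt,
// the hash reveals nothing practical about the hidden value.
function fieldHash(key: string, value: string, salt: string): string {
  return createHash("sha256").update(`${salt}:${key}:${value}`).digest("hex");
}

// The target hash is computed over the sorted field hashes, so it does not
// depend on which fields are later disclosed or redacted.
function targetHash(fieldHashes: string[]): string {
  return createHash("sha256")
    .update([...fieldHashes].sort().join(""))
    .digest("hex");
}
```

A verifier recomputes the hashes of the disclosed fields (using their salts), combines them with the hashes kept for the redacted fields, and checks the result against the notarised target hash.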

2. Proof sizes, having to have a salt per-statement signed as a part of the proof would significantly increase the size of the proofs representation.
The single document identifier (the target hash) does not grow in size. However, each piece of data needs a unique salt, so the overall proof does indeed grow.

3. Signature correlation, perhaps not important in this scheme, but I think the approach would require revealing a fixed signature regardless of which parts are redacted from the original proof?
If we take the full signature object into account, it actually changes slightly, for instance when salts are removed or the hashes of redacted data are added. If we only consider the signature value (the final hash, a.k.a. the target hash), it remains the same.
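To make this concrete, here is a sketch of a proof object before and after redaction. The shapes and field names (`targetHash`, `salts`, `redactedHashes`) are assumptions for illustration, not the actual OpenAttestation proof format: the object changes, but the target hash does not.

```typescript
import { createHash } from "crypto";

const sha256 = (s: string) => createHash("sha256").update(s).digest("hex");

// Hypothetical field values; salt:key:value hashing as in the sketch above.
const nameHash = sha256("salt-1:name:Alice");
const gradeHash = sha256("salt-2:grade:A");
const target = sha256([nameHash, gradeHash].sort().join(""));

// Full proof: every salt is present alongside the target hash.
const fullProof = {
  targetHash: target,
  salts: { name: "salt-1", grade: "salt-2" },
  redactedHashes: [] as string[],
};

// After redacting "grade": its salt leaves with the data; its bare hash
// stays so the verifier can still recompute the (unchanged) target hash.
const redactedProof = {
  targetHash: target,
  salts: { name: "salt-1" },
  redactedHashes: [gradeHash],
};
```

So the proof object is correlatable only up to the target hash; the salts and hashes it carries vary with what is disclosed.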

4. Performance? Also perhaps a non-issue but if anyone has info/benchmarks around how the scheme might scale with the size of the data graph signed, that would be great to look at?
We did performance testing a while ago. Unfortunately we didn’t record anything (shame on us) and I don’t clearly recall the results, but it scaled reasonably well. So far we haven’t received complaints about performance (of the hashing process), but as mentioned, the volume is probably not high enough to tell.
It’s worth mentioning that we never really focused on performance and that OA is written in pure JS; there is room for improvement.



From: Rui Jie Chow <chow.ruijie@gmail.com>
Sent: Monday, 29 March 2021 12:46 PM
To: Maillet LAURENT (GOVTECH) <Maillet_LAURENT@tech.gov.sg>
Subject: Fwd: RDF Dataset Canonicalization - Formal Proof

Hi Laurent,

you're probably the best situated person to respond to this

Cheers,
RJ

---------- Forwarded message ---------
From: Steve Capell <steve.capell@gmail.com<mailto:steve.capell@gmail.com>>
Date: Mon, Mar 29, 2021 at 4:35 AM
Subject: Re: RDF Dataset Canonicalization - Formal Proof
To: Tobias Looker <tobias.looker@mattr.global<mailto:tobias.looker@mattr.global>>
Cc: Christopher Allen <ChristopherA@lifewithalacrity.com<mailto:ChristopherA@lifewithalacrity.com>>, Adrian Gropper <agropper@healthurl.com<mailto:agropper@healthurl.com>>, Alan Karp <alanhkarp@gmail.com<mailto:alanhkarp@gmail.com>>, Drummond Reed <drummond.reed@evernym.com<mailto:drummond.reed@evernym.com>>, Manu Sporny <msporny@digitalbazaar.com<mailto:msporny@digitalbazaar.com>>, W3C Credentials CG (Public List) <public-credentials@w3.org<mailto:public-credentials@w3.org>>

Hi Tobias

Good questions - which I’ve forwarded to the Singapore team for an authoritative answer

Here’s my non-authoritative attempt
- salts are an array of UUIDs, I think - see https://edi3.org/specs/edi3-notary/develop/#611-salting-the-data

- signature correlation - not sure but I’d mention that all use cases for this approach so far are for cross border trade documents where the subject is a public identifier such as a business number.  The design intent is that the identity is correlatable.
- we haven’t noticed performance issues of any significance but we are talking volumes of only a few million per year

Steven Capell
Mob: 0410 437854


On 28 Mar 2021, at 2:53 pm, Tobias Looker <tobias.looker@mattr.global<mailto:tobias.looker@mattr.global>> wrote:

> I’m a big fan of this approach, a form of redaction distinct from zk forms of selective disclosure.

> There was an attempt to spec one here in the CCG three-four years ago, but it died on the vine.

I'm also interested in learning more about this approach; the questions I had last time were:

1. How the salt for each redactable statement would be managed in a way that would not leak the abstraction that "Linked Data Proofs" sets up. For example would the attached proof block have to have a long array of salts?
2. Proof sizes, having to have a salt per-statement signed as a part of the proof would significantly increase the size of the proofs representation.
3. Signature correlation, perhaps not important in this scheme, but I think the approach would require revealing a fixed signature regardless of which parts are redacted from the original proof?
4. Performance? Also perhaps a non-issue but if anyone has info/benchmarks around how the scheme might scale with the size of the data graph signed, that would be great to look at?

Thanks,



Tobias Looker

Mattr

+64 (0) 27 378 0461
tobias.looker@mattr.global<mailto:tobias.looker@mattr.global>

[Mattr website]<https://mattr.global/>





This communication, including any attachments, is confidential. If you are not the intended recipient, you should not read it - please contact me immediately, destroy it, and do not copy or use any part of this communication or disclose anything about it. Thank you. Please note that this communication does not designate an information system for the purposes of the Electronic Transactions Act 2002.


On Sun, Mar 28, 2021 at 3:49 PM Christopher Allen <ChristopherA@lifewithalacrity.com<mailto:ChristopherA@lifewithalacrity.com>> wrote:
On Sat, Mar 27, 2021 at 7:22 PM Steve Capell <steve.capell@gmail.com<mailto:steve.capell@gmail.com>> wrote:
The Singapore government https://www.openattestation.com/ does this already . Version 3 is W3C VC data model compliant

Each element is hashed (with a salt, I think), and then the hash of the hashes is the document hash that is notarised

The main rationale is selective redaction (because the root hash is unchanged when some clear text is hidden). But I suppose it simplifies canonicalisation too...

I’m a big fan of this approach, a form of redaction distinct from zk forms of selective disclosure.

There was an attempt to spec one here in the CCG three-four years ago, but it died on the vine.

I’d be interested in seeing this spec & implementation. Any links?

— Christopher Allen [via iPhone]




Received on Monday, 29 March 2021 06:47:41 UTC