Re: Selective Disclosure for W3C Data Integrity

Hi Greg, you beat me to it - I had started sketching this out. A couple of
tiny details to add, probably largely irrelevant...

(I'm going from memory here, so someone pull me up if I'm wrong.)
1. The sorted hash array is first concatenated into a single string.
2. The salts (currently) are a base64-encoded string representing a
mapping from field to salt. I don't think that the OA obfuscation process
pulls out the obfuscated salts.
3. Tree structure is relevant when "wrapping" (OA terminology for computing
the Merkle root) a batch of documents. Each document contains a unique
"target hash" - the one you described above - plus the list of all target
hashes in the batch (except its own), and all documents share the same
Merkle root value. To verify, OA then concatenates three lists: the
revealed keys, the array of target hash values for the batch, and the
obfuscated hashes.

I'm not sure whether the batching happens anymore though; I think it was
largely because the earlier versions notarised the root hash on an Ethereum
smart contract, so batching made that cheaper.
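For what it's worth, the batch "wrapping" I described could look something like this generic Merkle construction (hypothetical sketch: OA's actual leaf ordering and byte encoding may well differ):

```python
import hashlib

def merkle_root(target_hashes):
    # Pairwise-hash hex digests level by level until one root remains.
    # Every document in the batch ends up with this same root value.
    level = sorted(target_hashes)
    while len(level) > 1:
        if len(level) % 2:
            level.append(level[-1])  # duplicate the last node on odd levels
        level = [hashlib.sha256((level[i] + level[i + 1]).encode()).hexdigest()
                 for i in range(0, len(level), 2)]
    return level[0]
```

A batch of one degenerates to the document's own target hash.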


On Wed, 7 Jun. 2023, 11:29 am Ren Yuh KAY (IMDA), <KAY_Ren_Yuh@imda.gov.sg>
wrote:

> Adding Hendry
>
>
>
> Please kindly include him in future correspondence.
>
>
>
> *Warmest Regards,*
>
>
>
> *Ren Yuh KAY (Mr)*
>
> Assistant Director, TradeTrust, Digital Utilities, Sectoral Transformation
> Group
>
> *D* (+65) 6211 0550   *M* (+65) 9199 6628
>
> *This e-mail (including any attachments) may contain confidential or
> legally privileged information. Any unauthorised use, retention,*
>
> *reproduction or disclosure is prohibited and may attract civil and
> criminal penalties. If this e-mail has been sent to you in error, *
>
> *please delete it and notify us immediately. Please consider the
> environment before you print this email.*
>
>
>
>
>
> *From:* Greg Bernstein gregb@grotto-networking.com
> *Sent:* Wednesday, June 7, 2023 7:23 AM
> *To:* steve capell steve.capell@gmail.com
> *Cc:* Manu Sporny msporny@digitalbazaar.com; Dave Longley
> dlongley@digitalbazaar.com; John, Anil <anil.john@hq.dhs.gov>; W3C
> Credentials CG <public-credentials@w3.org>; Richard Spellman <
> richard.spellman@gosource.com.au>; Sin Yong LOH (IMDA) <
> LOH_Sin_Yong@imda.gov.sg>; Ren Yuh KAY (IMDA) <KAY_Ren_Yuh@imda.gov.sg>
> *Subject:* Re: Selective Disclosure for W3C Data Integrity
>
>
>
> Thanks Steve, let me know if my pseudo algorithm captures the essence of
> your approach and whether the overheads seem right. Also, the reason for
> salts seems different from the SD-JWT case (where they provide
> unlinkability).
>
> Steps for a straightforward Merkle tree approach to Selective Disclosure
> based on *Open Attestation*. Parameters: hash algorithm for the tree,
> signature algorithm, salt size, and generation approach.
>
> *Signed Document Creation*:
>
>    1. Input is a JSON object
>    2. “Flatten” the JSON object into a list of individual properties and
>    values (the library they use to do this is reversible)
>    3. For each (property, value) tuple from above, add a different *salt*
>    value. The purpose of this salt is to prevent inferring the (property,
>    value) tuple from the hashed value in cases where the value has a
>    limited range, e.g. it may only take on a small set of values.
>    4. The hash of each (property, value, salt) tuple is taken and put
>    into a *sorted* list.
>    5. A hash of the *entire* sorted list is taken. This is the value that
>    is then signed with a signature algorithm.
>    6. The list of triples (property, value, salt) is sent along with the
>    signature. Note that salts expand the size of the information a bit.
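The creation steps above could be sketched roughly as follows (a hypothetical sketch: the `flatten` helper, SHA-256, and the JSON encoding of tuples are my assumptions, not Open Attestation's actual library choices):

```python
import hashlib
import json
import secrets

def flatten(obj, prefix=""):
    # Flatten nested JSON into (property-path, value) pairs; the real
    # library is reversible, which this simple sketch ignores.
    items = []
    for key, value in obj.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            items.extend(flatten(value, path))
        else:
            items.append((path, value))
    return items

def hash_triple(prop, value, salt):
    # Hash one (property, value, salt) tuple (encoding is illustrative).
    return hashlib.sha256(json.dumps([prop, value, salt]).encode()).hexdigest()

def create_signed_document(doc, sign):
    # Steps 2-6: flatten, salt each tuple, hash, sort, concatenate,
    # hash the concatenation, and sign that final digest.
    triples = [(p, v, secrets.token_hex(16)) for p, v in flatten(doc)]
    sorted_hashes = sorted(hash_triple(p, v, s) for p, v, s in triples)
    digest = hashlib.sha256("".join(sorted_hashes).encode()).hexdigest()
    return triples, sign(digest)
```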
>
> *Selectively Disclosed Document Creation*:
>
>    1. Input received list of triples (property, value, salt) and signature
>    2. Create an empty list for “Obfuscated Data”.
>    3. For each triple (property, value, salt) that is to be *elided*,
>    (not disclosed) compute its hash and add that hash value to the “Obfuscated
>    Data” list.
>    4. Send only the disclosed triples, (property, value, salt), the
>    “Obfuscated Data” list, and the signature to the verifier.
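The redaction steps can be sketched in the same spirit (the JSON/SHA-256 tuple encoding is my assumption, not OA's exact format):

```python
import hashlib
import json

def hash_triple(prop, value, salt):
    # Illustrative tuple hash; the real encoding may differ.
    return hashlib.sha256(json.dumps([prop, value, salt]).encode()).hexdigest()

def selectively_disclose(triples, signature, elided_props):
    # Keep disclosed triples in the clear; replace each elided triple
    # with its hash in the "Obfuscated Data" list (steps 2-4).
    disclosed = [t for t in triples if t[0] not in elided_props]
    obfuscated = [hash_triple(*t) for t in triples if t[0] in elided_props]
    return disclosed, obfuscated, signature
```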
>
> *Verifying Selectively Disclosed Document*:
>
>    1. For each disclosed (property, value, salt) triple compute the hash
>    and place it in a list.
>    2. Add the hashes from the “Obfuscated Data” list to the above list.
>    3. Sort the combined list from above. Take the hash of this sorted
>    list.
>    4. Verify the above hash against the signature using the verification
>    algorithm.
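The verification steps can be sketched like so (the tuple hash must match whatever encoding the issuer used; the JSON/SHA-256 choice here is my assumption):

```python
import hashlib
import json

def hash_triple(prop, value, salt):
    # Must match the issuer's tuple encoding (illustrative here).
    return hashlib.sha256(json.dumps([prop, value, salt]).encode()).hexdigest()

def verify_disclosed(disclosed, obfuscated, signature, verify_sig):
    # Steps 1-4: hash disclosed triples, merge with obfuscated hashes,
    # sort, hash the concatenation, and check it against the signature.
    hashes = sorted([hash_triple(*t) for t in disclosed] + list(obfuscated))
    digest = hashlib.sha256("".join(hashes).encode()).hexdigest()
    return verify_sig(digest, signature)
```

Note that the verifier cannot tell which properties the obfuscated hashes correspond to; it only needs them to reconstruct the sorted list.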
>
> Notes:
>
>    - This approach has fairly low additional overhead: a salt is added
>    for each (property, value) pair by the issuer and included with all
>    revealed tuples, plus a hash value for each non-disclosed tuple.
>    - Note that the concept of a tree is implicit here and not used to
>    further advantage. However, the structure of the approach is very
>    straightforward and can be applied in different settings.
>    - The salts are needed primarily for confidentiality of the elided
>    data. However, if a new set of salts is used every time a signature is
>    generated, this also prevents linking when we have *verifier* to
>    *verifier* collusion.
>
> On 6/5/2023 1:39 PM, steve capell wrote:
>
> There’s a bit on the salted hash approach on this page
> https://www.openattestation.com/docs/docs-section/how-does-it-work/document-integrity.
>  Written more from a developer/user perspective than from a standards
> specification perspective - although I believe the Singapore team are
> writing it up as a specification.  Kay?  Is there a link for a draft
> specification on this?
>
>
>
> On 6 Jun 2023, at 3:39 am, Greg Bernstein <gregb@grotto-networking.com>
> wrote:
>
>
>
> I’ve seen the salted hash approach in SD-JWT to prevent “verifier to
> verifier” collusion (tracking) with fairly arbitrary signature algorithms.
> If we are just interested in ECDSA then we should be able to use the
> “random version of ECDSA” rather than the “Deterministic ECDSA” to achieve
> the same functionality without the need for a salt.
>
> Was just writing up a PR on “security considerations” for ECDSA
> Cryptosuite v2019 <https://github.com/w3c/vc-di-ecdsa> and while
> recommending Deterministic ECDSA left the option for random ECDSA.
>
> Is there a reference for the “salted hash tree” approach?
>
> Cheers
>
> Greg B. Grotto Networking <https://www.grotto-networking.com/>
>
>
>
> On 6/3/2023 6:48 PM, Steve Capell wrote:
>
>
>
>
>
> Thanks Manu
>
>
>
> Happy to participate in these tests and calculations
>
>
>
> I can see how ecdsa-sd could be sufficiently efficient (pending test results).  How would we address the requirement for any holder along the supply chain to redact? Can you see a way to blend the salted hash tree model with ecdsa-sd?
>
>
>
> I agree with Richard’s observation that when we stop trying to copy the paper then there’s potentially a lot less need for redaction - but I suspect we’re in for a longish transition period, particularly for supply chain documents like invoices, waybills, and conformity certificates
>
>
>
> Steven Capell
>
> Mob: 0410 437854
>
>
>
> On 4 Jun 2023, at 1:41 am, Manu Sporny <msporny@digitalbazaar.com> wrote:
>
>
>
> On Wed, May 31, 2023 at 4:48 AM Steve Capell <steve.capell@gmail.com> wrote:
>
> Regarding the size / cost volumetrics I don’t have concrete metrics but I’ll say it’s not uncommon for trade documents like invoices and waybills to have dozens or even hundreds of lines.
>
> The reason I asked is because it would be nice if we could run some
> tests w/ ecdsa-sd and your supply chain use cases. Here are some
> situations where Data Integrity for Selective Disclosure (ecdsa-2023)
> will work out well:
>
> * You have a large document with many claims (100+) that must be
> mandatorily disclosed (these are all lumped into a single hash in
> ecdsa-sd and so cost little), and only a few (1-30) that you want to
> be selectively disclosed (and only a few of those are disclosed at a
> time -- this costs about 66 bytes per revealed claim).
>
> * You have a small document with a handful of fields (1-30) that you
> want to be selectively disclosed (and only a few of those, 1-5, are
> disclosed at a time -- again, 66 bytes per revealed claim).
>
> For the Data Integrity for Selective Disclosure work, we are working
> on a Google Sheet that allows you to input the total number of
> statements, total number of mandatory disclosure claims, total number
> of selective disclosure claims, and total number of objects without
> identifiers, and it will spit out the initial proof size and then the
> selective disclosure proof size (based on how much you're disclosing).
> Having something like that for your merkle-based mechanism, SD-JWT,
> and BBS would be useful to the community. We'd prefer if each
> community provided the calculations, but if that doesn't happen, we
> might just put something out there and see how well we did at
> analysing the cryptographic variables. We're happy to be told we're
> wrong in order to get to more accurate numbers for the ecosystem to
> compare/contrast.
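As a back-of-envelope illustration of the sizing described above (the ~66 bytes per revealed claim is from this thread; the base overhead is a placeholder parameter, not an actual ecdsa-sd constant):

```python
def disclosed_proof_size(base_bytes, revealed_sd_claims, per_claim_bytes=66):
    # base_bytes is a placeholder for the fixed overhead (signature plus
    # the single hash covering all mandatory claims); each revealed
    # selectively-disclosable claim adds roughly 66 bytes per the thread.
    return base_bytes + per_claim_bytes * revealed_sd_claims
```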
>
>
>
> -- manu
>
>
>
> --
>
> Manu Sporny - https://www.linkedin.com/in/manusporny/
>
> Founder/CEO - Digital Bazaar, Inc.
>
> https://www.digitalbazaar.com/
>
>
>
>
>
>
>

-- 


---
The content of this email and attachments are considered 
confidential. If you are not the intended recipient, please delete the 
email and any copies, and notify the sender immediately.  The information 
in this email must only be used, reproduced, copied, or disclosed for the 
purposes for which it was supplied.

Received on Wednesday, 7 June 2023 04:20:23 UTC