Re: FW: Selective Disclosure for W3C Data Integrity

On Fri, Jun 16, 2023 at 4:47 AM Luca Boldrin <luca.boldrin@infocert.it> wrote:
> Greg’s example adds to my feeling that the RDF “atomic claim” approach is a clean way of dealing with selective disclosure requirements, when each claim is individually signed (i.e., single-claim credentials). Each credential would then attest a link in the graph.

Yes, that is one of the benefits of a graph-based data model and is
why JSON-LD has been utilized for VCs. IIRC, ACDCs also utilize a
graph-based data model and achieve similar outcomes when it comes to
binding attributes to subject identifiers.
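
To make that concrete, here is a minimal, purely illustrative credential
in that style; the "InventoryCredential" type, the "holdsItem" property,
and the example.org context are made up, but the pattern of binding
each claim to the subject identifier in "credentialSubject.id" is the
important bit:

  {
    "@context": [
      "https://www.w3.org/2018/credentials/v1",
      "https://example.org/inventory/v1"
    ],
    "type": ["VerifiableCredential", "InventoryCredential"],
    "issuer": "did:example:issuer",
    "credentialSubject": {
      "id": "did:example:warehouse-7",
      "holdsItem": {
        "type": "Board",
        "manufactureYear": "2022"
      }
    }
  }

Each statement in that graph (the type, the manufacture year, etc.) is
a candidate for individual signing or selective disclosure.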

> Conceptually, I see no reason to group claims and then sign, only to be able to produce selectively disclosed claims later.

As Steve pointed out, there are legitimate use cases where that
approach might be better than the one proposed by ecdsa-sd. We don't
have to pick one or the other; this isn't a zero-sum decision. The
design of Data Integrity is such that both 1) the individually signed
statements approach and 2) the sign-a-group-of-statements approach can
co-exist on the same VC data payload (via parallel signatures).
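
As a rough sketch of what that can look like on the wire (proof values
elided, and pairing ecdsa-rdfc-2019 with ecdsa-sd-2023 is just one
possible combination), the VC carries a set of parallel proofs:

  "proof": [{
    "type": "DataIntegrityProof",
    "cryptosuite": "ecdsa-rdfc-2019",
    "verificationMethod": "did:example:issuer#key-1",
    "proofPurpose": "assertionMethod",
    "proofValue": "z..."
  }, {
    "type": "DataIntegrityProof",
    "cryptosuite": "ecdsa-sd-2023",
    "verificationMethod": "did:example:issuer#key-1",
    "proofPurpose": "assertionMethod",
    "proofValue": "u..."
  }]

A verifier that only supports the plain ECDSA suite uses the first
proof; one that wants selective disclosure uses the second.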

This was the breakthrough that I mentioned a week or so ago with the
AnonCreds v2 work. We finally have a mechanism where digital
signatures don't have to be zero-sum AND we can simultaneously meet
NIST requirements while enabling next generation cryptography.

> Practically, this approach may introduce some issues which I am interested to understand. I can think of:
>
> The mechanism for requesting specific claims from a verifier may be more complex (e.g., assume a verifier wants to know: “prove to me that your inventory includes a board from 2022”: how should this query be asked?). Later, the verifier needs to carefully compose different claims in order to retrieve the desired information

For simple cases, we've found the Verifiable Presentation Request[1]
spec's Query by Example (QBE) to be adequate. For more complex use
cases, I expect extensions will need to be made or entirely new query
languages created. For example, QBE doesn't support range proofs,
which are needed for AnonCreds v2.
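
For Luca's "board from 2022" question, a QBE request could look
roughly like the following (reusing the hypothetical
InventoryCredential type from the example above; see [1] for the
authoritative shape):

  {
    "query": [{
      "type": "QueryByExample",
      "credentialQuery": {
        "reason": "We need to see that your inventory includes a board from 2022.",
        "example": {
          "@context": [
            "https://www.w3.org/2018/credentials/v1",
            "https://example.org/inventory/v1"
          ],
          "type": "InventoryCredential"
        }
      }
    }]
  }

The holder's software then selects (and selectively discloses) the
claims that satisfy the request.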

It's important, though, to separate the cryptographic suite from the
credential query language. If the cryptographic suite has the core
primitives that we need, then multiple query languages could be used
against the suite (QBE, PEv2, AnonCreds Presentation Request, etc.).

This is an area of active research and I don't expect the advanced use
cases to be solved with a single query language at any point in the
near future (next 1-5 years).

> There is the need to deal with a much larger set of credential schemas, revocation registries...

Hmm, I don't know if I follow this. ecdsa-sd doesn't require any
credential schemas to selectively disclose. Zero. Zilch. Nada. Your
experience in the space might be influenced by mechanisms that do
require a credential definition/schema in order to selectively
disclose? An explicit design goal for Selective Disclosure for Data
Integrity was to not require credential schema definitions.

I don't understand what revocation registries have to do with this, so
perhaps you can elaborate more on what you mean there?

> There is the necessity to have universally unique identifiers also for intermediate nodes in the graph.

Yes, but only if you want to combine selectively disclosed VCs. There
are many use cases where you don't need to do this. That said, not
/every/ use case can get away with a single selectively disclosed VC.
Again, it's going to take years for these use cases to play out and
for us to figure out how best to use this technology. I expect the
time horizon for figuring out how to do some of this stuff is in the
5-10 year range (which isn't bad; remember that this group has existed
for almost a decade at this point and has made significant progress in
the problem space).
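
To make the intermediate-node point concrete: if the nested "board"
object in the earlier example is left as a blank node, two VCs that
each disclose part of it can't be reliably joined. Giving it a
globally unique identifier (a urn:uuid here, purely illustrative)
makes the join possible:

  "credentialSubject": {
    "id": "did:example:warehouse-7",
    "holdsItem": {
      "id": "urn:uuid:6f5a...",
      "type": "Board",
      "manufactureYear": "2022"
    }
  }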

Orie Steele wrote:
> Good luck explaining this to business stakeholders.

Why do business stakeholders need to be exposed to this level of
detail? We don't expose them to the dangers of uninitialized C
variables, buffer overflows, or the details of a TLS 1.3 negotiation.
Writing a selective disclosure linter that warns when a developer
signs something that is not bound to a subject ID is a deterministic
process... and it's not clear that binding everything to a subject ID
is always a desirable thing to do.

As I said above, the intricacies of selectively disclosing multiple
VCs to a Verifier and then verifying that the selectively disclosed
graph doesn't leave the Issuer or the Verifier vulnerable will be an
area of active discussion over the next decade or so. We do need to
address those concerns, but we don't do that by presuming that this is
not possible to convey to business stakeholders.

There was a time when the pushback on Verifiable Credentials was:
"Business leaders are not going to understand any VC that goes beyond
an ID card and a few fields." Fast forward to today and we have stuff
like the GS1 vocabulary and the Traceability vocabulary, which are
quite comprehensive (in a good way) and whose benefit is clear to
business leaders.

Being able to explain some variation of the above to business leaders
(if that's even necessary, which I doubt) is well within the
wheelhouse of this community. I know many CCG'ers who are successfully
doing that in the market today (explaining to business leaders how VCs
can benefit their operations).

Hope that helps answer some of your questions, Luca. Let me know if I
missed anything.

-- manu

[1] https://w3c-ccg.github.io/vp-request-spec/#query-by-example

-- 
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
https://www.digitalbazaar.com/

Received on Saturday, 17 June 2023 16:31:43 UTC