- From: Dave Longley <dlongley@digitalbazaar.com>
- Date: Thu, 27 Jul 2017 17:47:08 -0400
- To: Tristan Hoy <tristan.hoy@gmail.com>, public-vc-wg@w3.org
On 07/27/2017 02:48 PM, Tristan Hoy wrote:
> The current draft architecture for Verifiable Claims describes a
> single point of privacy failure: the identifier registry.

Perhaps "registry" is a poor name. Perhaps failing to pluralize it is also an issue.

>
> Here is a set of cascading requirements, assuming the registry is
> some kind of "server": 1) The registry MUST be resilient to denial of
> service attacks 2) The registry MUST be able to discriminate between
> high-volume inspectors (e.g. Walmart, government agencies) and DDoS
> attackers 3) Therefore, the registry MUST authenticate inspectors
>
> And this gives the registry access to all of the metadata concerning
> who an identity holder is interacting with. While this is perfect in
> a government or corporate environment where every interaction will be
> logged regardless, it is not good for privacy.

There does not have to be a single identifier registry. There could be government or corporate environments where using a server as an identifier registry is appropriate. For other environments, it would not be, as you've indicated.

>
> Unless of course, the registry is a blockchain, and each inspector is
> running their own node. There is some very specific jargon that
> indicates that this may be the not-so-opaque intention of the
> architecture: "The registry MUST manage identifiers in a
> self-sovereign way"
>
> However this is speculation.

Using a blockchain as a registry is one possibility, yes.

>
> What isn't speculation is that this architecture cannot possibly
> support interaction privacy unless the identity registry is a
> decentralized name service or some other flavour of blockchain.

This is not true, but I can see why you'd think so given the name "registry". Identifiers in the architecture are simply URIs. Having an identifier registry in the way you've conceptualized isn't actually a hard requirement; it is just an enabler for certain use cases.
For example, identifiers could be based on public keys. Here there is no actual "registry" but there are rules or a "namespace" for identifiers. Proof of possession (when sharing verifiable claims) could depend on a digital signature from the private key holder. This has a number of disadvantages (potentially no key rotation) for long term verifiable claims, but it would work perfectly well for a number of other use cases. Consider the case, for example, where issuers are highly-available and are able to dynamically sign pairwise identifiers from relying parties (where proof of possession is performed via digital signature). It's hard to see that use case with the "registry" terminology though. We should figure out a way to address this in the architecture documentation. We welcome any specific text that you think would be helpful.

>
> The working group charter states: "The Working Group will
> not...attempt to lead the creation of a specific style of supporting
> infrastructure"
>
> But that's exactly what's happening: the registry is a required
> component, and if you want interaction privacy, the registry has to
> be a blockchain.
>
> And this comes with potentially fatal drawbacks: decentralized name
> services critically lack the ability to block or revoke fake, hacked,
> spam and lost identities in the same way that SSL/DNSSEC/Estonia
> e-residency do. And because the blockchain is public, any
> implementation flaw (which is probable considering the high
> complexity) will permanently break privacy for all users.

For general purpose use it is true that a blockchain would need to be public. There may be some communities that use a blockchain that isn't public. That being said, the architecture makes no assumptions about how proof of control, revocation, etc. are implemented by the identifier registry.
There is a DID specification related to this space here:

https://opencreds.github.io/did-spec/

I think the statement "any implementation flaw will permanently break privacy for all users" seems hyperbolic. Can you provide some more concrete examples where you're concerned?

>
> If the architecture was agnostic about the issuance and verification
> of authenticating subject identifiers, then you could have privacy
> without a blockchain.

Well, the architecture is agnostic, but we've failed to communicate this. We should find a way to make it clearer that the architecture supports identifier registries but that they aren't required -- or rename "registry" to "namespace" or similar.

>
> The FAQ states in the answer to Q7: "The proposed data model and
> syntaxes are designed to be storage system and transaction protocol
> agnostic"
>
> But it's not transaction protocol agnostic. The use of a registry
> implies a specific transaction protocol that is either technology
> specific or privacy violating.
>
> Buried in the details, the data model recommends the use of
> short-lived or single-use bearer tokens (e.g. a public-key signed
> JWT) for high-privacy applications. These bearer tokens would not
> require a central registry, although this is not stated.
>
> Another alternative is to simply use per-claim public/private
> keypairs, which are self-sovereign, self-authenticating and stateless
> (no central store required). Upon presenting a claim, the claim
> holder can sign a challenge issued by the claim inspector to verify
> ownership (rather than just possession) of the claim.

Yes, this is what I was alluding to in my response above. So it seems that most of the problem here is the document's failure to communicate the optional nature of an identifier registry -- or alternatively, the failure to communicate that an "identifier registry" is an abstract concept that could include "rules for generating identifiers". It's poorly named for that purpose.
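The registry-free approach discussed above (an identifier derived from a public key, with the holder signing a challenge from the inspector) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration and does not come from the Verifiable Claims drafts: the "urn:example:key:" identifier format, the function names, and the choice of a Lamport one-time signature, which is used only because it needs nothing beyond the Python standard library. A real deployment would more likely use a standard scheme such as Ed25519, and a Lamport key must sign only one challenge.

```python
import hashlib
import secrets


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def generate_keypair():
    # Lamport one-time key: 256 pairs of random secrets (private)
    # and the hashes of those secrets (public).
    private = [(secrets.token_bytes(32), secrets.token_bytes(32)) for _ in range(256)]
    public = [(_h(a), _h(b)) for a, b in private]
    return private, public


def _message_bits(message: bytes):
    digest = _h(message)
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def sign(private, message: bytes):
    # Reveal one secret per bit of the message digest.
    return [private[i][bit] for i, bit in enumerate(_message_bits(message))]


def verify(public, message: bytes, signature) -> bool:
    bits = _message_bits(message)
    if len(signature) != len(bits):
        return False
    return all(_h(s) == public[i][b] for i, (b, s) in enumerate(zip(bits, signature)))


def identifier_from_public_key(public) -> str:
    # Hypothetical "namespace" rule: the identifier is a URI built from
    # a hash of the public key, so no registry lookup is needed.
    flat = b"".join(a + b for a, b in public)
    return "urn:example:key:" + _h(flat).hex()


def check_control(subject_id: str, public, challenge: bytes, signature) -> bool:
    # Inspector side: the key must match the identifier in the claim,
    # and the signature must cover the inspector's fresh challenge.
    return identifier_from_public_key(public) == subject_id and verify(
        public, challenge, signature
    )


if __name__ == "__main__":
    holder_private, holder_public = generate_keypair()
    subject_id = identifier_from_public_key(holder_public)
    challenge = secrets.token_bytes(32)          # issued by the inspector
    response = sign(holder_private, challenge)   # produced by the holder
    print(subject_id, check_control(subject_id, holder_public, challenge, response))
```

In this sketch the inspector contacts no third party: the identifier is checked against the presented public key, and the fresh challenge prevents replay of an earlier response. The trade-off mentioned in this thread still applies, since an identifier bound to a key in this way offers no obvious path to key rotation.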
>
> But why isn't a high privacy option - e.g. bearer tokens, public keys
> - the default configuration?

Because the requirements, such as highly available infrastructure, are too demanding. A high-privacy default also doesn't support a number of use cases, such as long-term credentials that were issued by educational institutions that no longer exist. The current architecture has a broader approach that supports more use cases with less complexity -- and it allows for high privacy options that have stronger infrastructure demands to be layered on top.

>
> Why does the front-and-centre diagram include an identity registry,
> that is either technology specific or privacy violating?
>
> Why does it state, nowhere, that the registry is optional?

The "identifier registry" is considered a "namespace". So it isn't necessarily "optional" in that sense. I agree that we should make a better effort to explain the concept or make the concept more concretely a "registry" and indicate it is optional. So, the answer is: "We're just not communicating effectively on that point, so thank you for the feedback and we'll try and address it." If you have specific text that would be helpful, please send it to the list or use GitHub.

>
> Why does it seem like the spec places the needs of specific
> stakeholder groups above the absolute need for privacy?

"Above" meaning what specifically? The positioning of the text in the spec? Are you asking this based on an explicit statement from the spec or implications that you've derived from your reading of it or the architecture design? Different stakeholders reading the spec may have different opinions on this matter. The goal of the spec was to capture the various stakeholders in light of use cases, not to put them into a particular order. Also, not every use case involves privacy -- many of the use cases involve modeling credentials like those placed publicly on websites or LinkedIn.
The space is much larger than, for example, the highly pseudonymous transfer of credentials that assert atomic claims like "X is 18+ years old".

>
> Recommendations for the working group: 1) The "brochure" version of
> the spec is the most important - and should place zero-registry,
> high-privacy options first and foremost to encourage privacy-first
> adoption

There are nuanced opinions on this matter in the group. I'll let others speak for themselves.

> 2) The high-level architecture draft and interaction diagrams use
> singular language when referring to identifiers, indicating that a
> claims holder has only one identifier - this should be pluralized and
> indicate multiple identifiers by default to encourage privacy-first
> adoption

I believe the group agrees we should do this and we've been trying to do so in other specifications moving forward. We also have a lot of additional privacy and anti-correlation work to do in the data model spec:

https://w3c.github.io/vc-data-model

> 3) Fix the link on your proposal to point to the current home of the
> data model

I'll talk to someone about making that happen.

Thanks for your feedback! I hope my response has been helpful.

--
Dave Longley
CTO
Digital Bazaar, Inc.
http://digitalbazaar.com
Received on Thursday, 27 July 2017 21:47:32 UTC