Re: Comments on VCTF Report


There are some censorship considerations that appear inherent in
verifiable claims, which I shall briefly discuss towards the end of 
this mail.  These should not be ignored, but I shall focus primarily
on more technical privacy issues here.

I believe the Verifiable Claims Task Force should adopt the view that:

 Verifiable Claims should not leak information about private
 individuals beyond what the individual unambiguously recognizes
 that the claim communicates.

As an example, a verifiable claim for "proof of age" should not leak
either the user's nationality or their mother tongue.  These properties
have legal protections under some national laws, so revealing them could
create legal liabilities not merely for web site operators, but even for
browser vendors.

There are two particularly problematic technical scenarios that arise 
from this consideration. 

1.  Unacceptable non-determinism in many signature algorithms

Verifiable Claims should not use cryptographic signature algorithms
that contain extra non-deterministic entropy, such as nonces or
random padding, which creates a tracking identifier for the user.

For example, if a user proves their age to a website on two separate
occasions, then neither the website nor a third-party verifier should
be able to correlate the two signatures.

1.1.  Specific problematic algorithms:

Schnorr- or ElGamal-based signature algorithms employ a nonce k that
usually uniquely identifies that particular signature, and thus that
user.

In particular, ECDSA should be prohibited.

EdDSA is an example of a Schnorr-based signature algorithm that does not
necessarily violate this principle, because its nonce is deterministically
constructed from the (hashed) secret key and the message.
EdDSA would be okay so long as the signing authority employs the same
signing key every time they sign a verifiable claim.  It should probably
be excluded if this condition cannot be met. (1)
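
For concreteness, here is a toy Schnorr sketch in Python (the group
parameters, keys, and claim string are purely illustrative and far too
small for real use) showing why a random nonce acts as a
per-presentation tracking identifier while an EdDSA-style deterministic
nonce does not:

```python
import hashlib
import secrets

# Toy Schnorr signatures.  Parameters are illustrative only -- far
# too small and simplistic for real use.
p = 2**127 - 1          # Mersenne prime modulus
q = p - 1               # exponent arithmetic works mod p - 1
g = 3                   # generator of a subgroup mod p

sk = 123456789          # signer's secret key (toy value)
pk = pow(g, sk, p)

def h(*parts) -> int:
    data = b"".join(str(x).encode() for x in parts)
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % q

def sign(msg: str, k: int):
    r = pow(g, k, p)
    e = h(r, msg)
    s = (k + sk * e) % q
    return (e, s)

def verify(msg: str, sig) -> bool:
    e, s = sig
    r = (pow(g, s, p) * pow(pk, -e, p)) % p   # g^s * pk^(-e) = g^k
    return h(r, msg) == e

msg = "age >= 18"

# Random nonces: every signature is unique, so the pair (e, s) acts
# as a tracking identifier for each presentation.
sig_a = sign(msg, secrets.randbelow(q - 1) + 1)
sig_b = sign(msg, secrets.randbelow(q - 1) + 1)
assert verify(msg, sig_a) and verify(msg, sig_b)
assert sig_a != sig_b     # correlatable across presentations

# Deterministic nonce (EdDSA-style): k derived from the secret key
# and the message, so re-signing the same claim is byte-identical.
det_a = sign(msg, h(sk, msg))
det_b = sign(msg, h(sk, msg))
assert det_a == det_b
```

With random nonces, each presentation carries fresh correlatable
entropy; the deterministic variant removes it, provided the authority
keeps using one signing key as noted above.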

In principle, RSA itself should be okay, as one simply exponentiates the
signed message.
The strongest RSA signature schemes, like RSA-PSS, add entropy
with their padding, however.  We need the padding to be completely
deterministic, which points to older deterministic schemes like
PKCS #1 v1.5 or FDH.
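
A minimal sketch of such a deterministic "hash then exponentiate" RSA
signature, with a toy key (a real FDH would hash into the full range
[0, n) rather than reducing mod n as done here):

```python
import hashlib

# Toy FDH-style RSA signature.  The key is far too small for real
# use; it only demonstrates that signing has no random input.
p, q = 104729, 1299709
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def fdh(msg: bytes) -> int:
    # Mod-reduction stands in for a proper full-domain hash here.
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

def sign(msg: bytes) -> int:
    # No nonce and no random padding: the signature depends only on
    # the key and the message.
    return pow(fdh(msg), d, n)

def verify(msg: bytes, sig: int) -> bool:
    return pow(sig, e, n) == fdh(msg)

claim = b"year-of-birth: 1998"
assert sign(claim) == sign(claim)   # identical every time
assert verify(claim, sign(claim))
```

Two presentations of the same signed claim are byte-identical here, so
the signature itself adds no tracking surface beyond the claim's
content.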

Macaroons have properties that might be interesting for a user wishing
to delegate access to a resource, but they're entirely based upon a
nonce that the user cannot transform.  
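
To illustrate (a toy sketch of the standard HMAC-chain construction;
all keys and caveats below are illustrative): the macaroon's initial
nonce is fixed at mint time and travels with the macaroon, so it links
every presentation.

```python
import hashlib
import hmac
import secrets

# Toy macaroon in the HMAC-chain style.
root_key = secrets.token_bytes(32)   # held by the issuing service
nonce = secrets.token_bytes(16)      # public identifier on the macaroon

sig = hmac.new(root_key, nonce, hashlib.sha256).digest()
for caveat in [b"age >= 18", b"expires 2016-12-31"]:
    # Each caveat extends the chain: sig' = HMAC(sig, caveat).
    sig = hmac.new(sig, caveat, hashlib.sha256).digest()

# The holder presents (nonce, caveats, sig).  Any verifier seeing the
# same macaroon twice links the presentations by its fixed nonce, and
# the holder cannot re-randomize the nonce without the root key.
```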

1.2.  Specific acceptable algorithms:

Rerandomizable signatures appear to address this issue.

Single-use RSA blind signatures work.  They present an engineering
challenge in that the signing authority must issue a cache of
single-use tokens to the user, but that also helps limit the damage
from claim/credential theft.  A naive approach cannot sign complex
data, however.
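
A sketch of the Chaum-style blinding round trip, again with a toy key
(all values illustrative), shows why the authority never sees the token
it signs:

```python
import hashlib
import secrets
from math import gcd

# Toy Chaum-style blind RSA signature.  The key below is far too
# small for real use; it only illustrates the protocol flow.
p, q = 104729, 1299709
n, e = p * q, 65537
d = pow(e, -1, (p - 1) * (q - 1))

def h(msg: bytes) -> int:
    return int.from_bytes(hashlib.sha256(msg).digest(), "big") % n

# User: hash a fresh single-use token and blind it with random r.
token = b"single-use age token " + secrets.token_bytes(8)
m = h(token)
while True:
    r = secrets.randbelow(n - 2) + 2
    if gcd(r, n) == 1:
        break
blinded = (m * pow(r, e, n)) % n

# Signing authority: signs the blinded value without learning `token`.
blinded_sig = pow(blinded, d, n)

# User: unblind.  (m * r^e)^d = m^d * r mod n, so divide out r.
sig = (blinded_sig * pow(r, -1, n)) % n

# The result verifies as an ordinary RSA signature on the token.
assert pow(sig, e, n) == m
```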

Group signature schemes in which the signing authority delegates the
ability to sign information to the user work fine, as they anonymize
members of the group.  Again, a naive approach cannot sign complex
data.

Zero-knowledge proofs likely offer another avenue here.

None of these options are everyday vanilla cryptography.

2. Dangers inherent in certificate chains

Verifiable Claims should not employ chains of trust like X.509
without explicitly telling the user what they reveal to the verifier.
In other words, the user agent should expose to the user the chain
of signing authorities who issued that signature, and when their
certificates were signed, and should clarify that this information
gets revealed to the verifier.

For example, there was a discussion of users proving their age using a
government-issued credential, but restricting it to revealing only the
year of their birth.  In this case, the credential still shows the
state or locality where they reside, so the user agent should make this
clear to the user.
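
As a sketch of what such disclosure might look like in a user agent
(the field names and issuers below are hypothetical, not taken from any
actual credential format):

```python
# Hypothetical user-agent prompt summarizing what the certificate
# chain itself discloses before the claim is presented.
chain = [
    {"issuer": "Example State DMV", "signed": "2015-06-01"},
    {"issuer": "Example National Root CA", "signed": "2014-01-15"},
]

def chain_disclosure(chain):
    # Build a human-readable summary of every issuer in the chain.
    lines = ["Presenting this claim also reveals:"]
    for cert in chain:
        lines.append(f"  - issued by {cert['issuer']} on {cert['signed']}")
    return "\n".join(lines)

print(chain_disclosure(chain))
```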

There are two natural alternatives to chains of trust:

First, group signature schemes provide exactly a one-hop alternative
with the appropriate privacy properties, but expanding that to multiple
hops sounds like a research problem.  A major drawback is that these
employ pairing-based cryptography.

Second, zero-knowledge proofs might provide a solution here too.  Again,
the known constructions handle one hop, and expanding that sounds like a
research problem.

3. Solutions

In summary, there are a number of interesting research problems around
doing verifiable claims in a way that communicates nothing inappropriate
about the user's real identity.  At present, these "research problems"
sound like anathema to standardization, though.

I'd therefore suggest the Verifiable Claims Task Force look into simply
backing a cryptographic competition to address the issue of
privacy-preserving certificate chains.  I'd expect the nonce problem
would naturally be covered by that sort of analysis.

Of course, there is always a "crypto lite" approach in which claims
adhere strictly to the browser's same-origin policy and the standard
utilizes only existing allowable cross-site interactions, like top-level
GET requests.  I think this does not resolve the above concerns as
cleanly as one might imagine.

Also, there was previously a discussion about the correct venue for
doing verifiable claims.  I think the seriousness of these privacy
concerns indicates that the optimal venue might be the Privacy Working
Group.

4.  Ethical considerations

We could potentially address the above technical privacy concerns;
however, that alone does not necessarily make verifiable claims a
good idea.  There are profound ethical considerations as well.  

In particular, we should ensure that new standards cannot be used to
deploy any sort of general-purpose "internet passport", as this would
tend to enable censorship, including harm to interoperability and online
free expression.

Imagine, for example, that a site like YouTube or Twitter were convinced
to use Verifiable Claims for age verification.  At present, these sites
are actively censored in China for both political reasons and economic
protectionism.  It seems clear the Chinese government could ensure
that any Chinese deployments of verifiable claims would fail to work
correctly with foreign media that represented either differing political
viewpoints or competition for protected domestic providers.

Apologies for tacking this ethics bit on at the end; these are not
merely academic considerations, and the threat of censorship deserves
to be a more prominent consideration than I have made it.

As an example, the U.S. State Department spends millions funding tools
like CGIProxy and Tor Browser specifically to disrupt web censorship.
We should avoid creating standards that ultimately make such efforts
more difficult.

Aside from discussions with the W3C Privacy Working Group, I would
suggest the Verifiable Claims Task Force reach out to human rights
organizations to obtain better comments on ethical considerations.
The EFF and EPIC are obvious groups to contact; the Tor Project is
another.

Apologies for this mail growing so long.  
Best wishes,
Jeff Burdges

p.s.  In addition to legal risks, I suspect verifiable claims create
longer-term "legislative risks" for browser vendors:

We should naturally expect verifiable claims to be regulated under the
E.U. Data Protection Directive, just like cookies, flash storage, etc.  
At present, these storage mechanisms are provided by the browser, but
the notification requirements fall upon websites.  

One could imagine those notification requirements being applied to
browser vendors directly, however.  This sounds less far-fetched if the
browser vendors have themselves already defined a user interface for some
privacy-sensitive information, such as verifiable claims.

-------- Forwarded Message --------
From: Ian Jacobs <>
To: Manu Sporny <>, Dave Longley
Cc: Web Payments IG <>
Subject: Comments on VCTF Report
Date: Tue, 16 Feb 2016 20:59:32 -0600

Dear Members of the VCTF [0],

Thank you for preparing a report [1] on your activities for discussion
at the upcoming face-to-face meeting. I read the report and the
minutes of all the interviews. I have not read the use cases [2].

I have several observations and questions that I'd like to share
in advance of the face-to-face meeting. I look forward to the
discussion in San Francisco. I will continue to think about
topics like "questions for the FTF meeting" and "ideas for next
steps".

* First, thank you for conducting the interviews. I appreciate the
time that went into them, and you managed to elicit comments from an
interesting group of people.

* In my view, the ideal outcome from the task force's interviews would
have been this: By focusing on a problem statement in conversations
with skeptics, areas of shared interest would emerge and suggest
promising avenues for standardization with buy-in from a larger
community than those who have been participating in the Credentials
Community Group.

* With that in mind, I think the results are mixed:

     - The interviews included valuable feedback that I believe can be
     useful to focusing discussion of next steps. For example,
     compiling a list of concerns about the project is very useful.

     - I believe the report does not do justice to this useful
       feedback.

* Here is why I believe the report does not do justice to the
  interviews: it includes information that I don't believe was part of
  the task force's work, which clouds what the report could most
  usefully communicate. Specifically:

    - The survey in 5.1 was not part of the task force's work [0].

    - While documenting use cases [2] is valuable, I did not read
      in the interviewer's comments that they had considered the
      use cases. It would have been interesting, for example, for
      the interviewees to have considered the use cases, and to
      determine whether there was a small number of them where
      there was clear consensus that it was important to address
      them. But without connecting the interview comments to the
      use cases, I believe they only cloud this report.

      Thus, I find confusing the assertion in 6.4 that
      a "point of consensus" is that there are use cases. That
      may be the consensus of the Credentials CG that produced
      them, but it is not clear to me from reading the minutes
      that there is consensus among the interviewees on the
      use cases. Similarly, section 3 (Summary of Research Findings)
      goes beyond the work of this task force to include the use
      cases.

* While there were a lot of valuable comments in the interviews, it would
  not be cost-effective to paste them all here. Here are a few synopses:

  - It sounded like people acknowledged the problem statement
    and also that this is a hard problem to solve.

  - Many people emphasized the opportunity to improve security and privacy.
    One opportunity that was mentioned had to do with user-friendly key
    management (which made me think of SCAI).

  - There is a high cost to setting up an ecosystem, and so the
    business incentives must be carefully considered and
    documented. (This is covered in 7.3 of the report.)

  - I found Brad Hill's comments particularly helpful.

  - A number of comments seemed to me to suggest a strategy for
    starting work:

    * Start small.
    * Start by addressing the requirements of one industry and build from
      there. I heard two suggestions for "Education", and explicit advice
      against starting with health care or financial services.
    * Be pragmatic.
    * Reuse existing standards (a point you mention in section 3 of the report).

* I don't understand the role of section 4 ("Requirements Identified
  by Research Findings"). This is not listed as a deliverable of the
  task force [0] and it does not seem to me to be derived from the
  interviews. The bullets don't really say "Here is the problem
  that needs to be solved." I think the use cases come closer, and
  we need more information about business stories as mentioned above.
  Talking about things like software agents helping people store
  claims feels like a different level of discussion.

* In section 6 "Areas of Consensus":

  - "Current technologies are not readily solving the problem."

    I don't think that's the consensus point. I think that formulation
    suggests too strongly "and thus new technologies are needed."

    I think the following headline phrase is more accurate: "Reuse
    widely deployed technology to the extent possible." You do say
    something close to that in the paragraph that follows, and
    again in 7.8.

  - "Minimum First Step is to Establish a Way to Express Verifiable
    Claims"

    (Also covered in a bullet in section 4.)

    First of all, I did not reach that result from reading the
    interviews.  Second, the very sentences in the paragraphs that
    follow suggest there is no consensus. Namely:

    * "Many of the interviewers suggested that having a data model and
      syntax for the expression of verifiable claims AS ONLY PART OF
      THE SOLUTION." (This suggests they may not agree that "expression"
      is a minimal first step and that MORE is required in a first step.)

    * "Some of the interviewers asserted that the technology already
      exists to do this and that W3C should focus on vocabulary
      development." (So this is a recommendation to do vocabulary work.)

    * "Others asserted that vocabulary development is already
      happening in focused communities (such as the Badge Alliance,
      the Credentials Transparency Initiative)." (This doesn't say
      anything about what W3C should do; perhaps this sentence could
      be attached to the previous one instead.)

    * "Many of the interviewers suggested that the desirable outcome
      of standardization work is not only a data model and syntax for
      the expression of verifiable claims, but a protocol for the
      issuing, storage, and retrieval of those claims, but
      acknowledged that it may be difficult to convince W3C member
      companies to undertake all of that work in a single Working
      Group charter. " (This sounds like a repeat of the first bullet.)

    * "In the end, consensus around the question what kind of W3C
      charter would garner the most support seemed to settle on the
      creation of a data model and one or more expression syntaxes for
      verifiable claims."

    Basically, I do not think there is a consensus to do that among
    the interviewees. In detail, here’s what I read:

        - Brad Hill: "I don't know"
        - Christopher Allen: (I don't see any comment)
        - Drummond Reed: "user-side control of key management"
        - John Tibbetts: "document what a credential looks like
                          (perhaps either a data model or ontology)
                          plus a graphical diagram"
        - Bob Sheets: "I have a hard time addressing that question,
                       whatever it takes to get your group started and
                       on the map and doing work the better."
        - David Chadwick: (I don't see any comment)
        - Mike Schwartz: (I don't see any comment)
        - Dick Hardt: (I don't see any comment)
        - Jeff Hodges: (I don't see any comment)
        - Harry Halpin: "Another option is to scope down and aim at a
                         particular problem domain, for example a
                         uniform vocabulary for educational
                         credentials."
        - David Singer: (I don't see any comment)

* I found interesting the section on "areas of concern" (along with
  Brad Hill's comments). It might be possible to categorize the
  concerns like this:

  a) Social issues
     7.2 scalability of trust
     7.3 business models and economics
     7.4 business model for infrastructure
     7.7 liability; fraud and abuse

  b) Design issues
     7.5 slow evolution of agent-centric designs
     7.6 risks associated with identifiers, keys, revocation
     7.8 reusing existing work

  c) Communication
     7.1 communicate vision / big picture
       (BTW, I agree, but this does not imply it belongs in a charter).

  - Scalability of trust is very interesting. I think I agree it's
    good to have an architecture that supports diverse business
    models, trust models, etc.

  - On business models and economics: "it is yet unknown if
    kickstarting the market will be enough to build a strong economic
    incentive feedback loop." It might be easier to find an answer
    by adopting the above strategy points about starting small and
    picking one market.

* Please list the editors of the report. Also, if possible, please list in an
  acknowledgments section of the report the participants in the task force.

Ian Jacobs <>
Tel: +1 718 260 9447

Received on Monday, 14 March 2016 21:41:44 UTC