Re: On JSON-LD with DIDs and VCs from Daniel Hardman on 2020-01-09 (public-credentials@w3.org from January 2020)

From: Daniel Hardman <daniel.hardman@evernym.com>
Date: Wed, 8 Jan 2020 17:09:36 -0700
To: Joe Andrieu <joe@legreq.com>
Cc: Credentials Community Group <public-credentials@w3.org>
Message-ID: <CAFBYrUqMoGDPszbNrQ0JWUv2xsfFN9JPH+9H_Px9L92EtFQGVQ@mail.gmail.com>
+1 to all of Joe's comments.

On Wed, Jan 8, 2020 at 4:19 PM Joe Andrieu <joe@legreq.com> wrote:

> My apologies for bringing in AI. My point was that it would take magical
> superpowers for resolving the context at verification time to make any
> sense. Because there aren't magical superpowers, resolving the context at
> verification time makes no sense.
>
> ALSO, we seem to be conflating concerns about VCs with concerns about DID
> Documents. In particular, VCs are a formal recommendation. They use
> JSON-LD. That debate was resolved long ago.
>
> However, an issue under consideration is whether or not JSON-LD is an
> appropriate mechanism for DID Documents or if JSON is sufficient. This is
> absolutely in scope for the current working group (and that part of this
> discussion should probably move to the DID WG).
>
> To that end, Oliver, is there a version of this concern that applies to
> how a DID Document's @context is an attack vector? I would be better able
> to respond to that concern if you could describe how that could happen.
>
> If someone wants to propose that VCs move away from JSON-LD, we can have a
> conversation about that. However, I expect that would be considered beyond
> "maintenance" and require a full (re)chartering of a working group.
>
> To respond to the VC question in the context of the community group, I
> believe the attack vector as described is almost certainly a non-issue, for
> at least two reasons.
>
> First, if the verifier has not seen a specific context before--as a
> lexical match on the entire URL string or JSON compare of an inline
> context--it SHOULD treat it as an unknown context and simply stop
> processing. Only developers are going to actually resolve any context URLs.
> The verifier should just be confirming the contexts are something already
> programmed for, which means the context documents should also already be
> cached if you need them for your parser. (As others have mentioned, hashes
> can confirm that the intended context is the same as the cached version.)
> So the attack vector described only applies if the verifier is intentionally
> treating distinct URLs as identical because they are ignoring query
> parameters or the like. IMO, this treatment of distinct URLs as identical
> should be explicitly out of conformance. We may need to add language to
> clarify that (which would be a suitable erratum).
>
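
A minimal sketch of the check described above, assuming a verifier that only accepts context URLs it was programmed for and confirms each cached document against a pinned hash. The URL `https://example.org/contexts/v1`, the placeholder document bytes, and the names `KNOWN_CONTEXTS`, `CONTEXT_CACHE`, and `check_contexts` are all hypothetical. For brevity the sketch rejects inline contexts outright rather than doing the JSON compare mentioned above.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of the given bytes."""
    return hashlib.sha256(data).hexdigest()


# Stand-in bytes for illustration only; NOT a real context document.
_PLACEHOLDER_DOC = b'{"@context": {"example": "https://example.org/vocab#example"}}'

# Hypothetical allow-list: exact context URL -> pinned hash of its document.
KNOWN_CONTEXTS = {
    "https://example.org/contexts/v1": sha256_hex(_PLACEHOLDER_DOC),
}

# Hypothetical local cache, shipped with the verifier; nothing is fetched.
CONTEXT_CACHE = {
    "https://example.org/contexts/v1": _PLACEHOLDER_DOC,
}


def check_contexts(credential: dict) -> bool:
    """Accept only contexts that match the allow-list as exact strings."""
    contexts = credential.get("@context")
    if isinstance(contexts, str):
        contexts = [contexts]
    if not isinstance(contexts, list) or not contexts:
        return False
    for ctx in contexts:
        # Simplification: inline (object) contexts are rejected here.
        if not isinstance(ctx, str):
            return False
        # Exact lexical match: a query string or trailing component
        # makes the URL a different, unknown context.
        expected_hash = KNOWN_CONTEXTS.get(ctx)
        if expected_hash is None:
            return False
        cached = CONTEXT_CACHE.get(ctx)
        if cached is None or sha256_hex(cached) != expected_hash:
            return False
    return True
```

With this shape, `https://example.org/contexts/v1?id=123` is treated as unknown and processing stops, so no request ever goes out at verification time.
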
> Second, if the issuer and verifier both want correlatable identifiers, we
> can't stop it. There are at least three ways I can think of off the top of
> my head:
>
> 1. They could use a trailing component in the @context URL *and* the
> verifier accepts such contexts despite the proposed conformance requirement
> in the previous paragraph. This, IMO, should be treated as a non-conformant
> implementation (I don't believe the spec currently addresses this).
>
> 2. If we have any extensibility at all, the issuer can just add a property
> with a UUID or even a "phoneHome URL" that embeds an ID. Colluding
> verifiers can resolve this URL at verification time just as they would the
> context URL. If the verifier WANTS to phone home, we can't stop that.
>
> 3. The issuer can ALREADY do this attack through the credentialStatus
> property without extending anything. That property is designed to use
> non-correlatable means for checking a status, but there's no way to prevent
> an issuer from setting up a status mechanism that is a correlatable request
> back to the issuer. In fact, we can expect that this WILL happen. It would
> be good to win the narrative that this is a privacy anti-pattern, but like
> option #1, the VC spec is silent on this. We *could* require that the URL
> be non-correlatable and to my mind that would be a useful erratum.
>
> So... this does highlight two potential errata for updating the VC spec to
> move #1 and #3 to violations of normative requirements. That would be a
> nice outcome from this thread.
>
> However, #2 is ONLY resolvable if you don't allow ANY extensibility,
> which I don't believe anyone is arguing for. You have this problem with
> both JSON and JSON-LD. In either one, if there is any extensibility,
> issuers can ALWAYS add properties that include correlatable identifiers and
> phone home service endpoints. Full stop. It has nothing to do with whether
> or not a context property allows that.
>
> In short, if you want to avoid these kinds of attacks, we can't have
> extensibility. Or more rigorously, **I** don't know of an extensibility
> approach that would let issuers extend a VC without also allowing
> properties **we** don't like. That's the purpose of extensibility, to
> enable customizations that have not gone through a standardization
> process. That's a reasonable argument against extensibility, but it has
> nothing to do with the @context property and JSON-LD.
>
> -j
>
> On Wed, Jan 8, 2020, at 8:56 AM, Oliver Terbu wrote:
>
> I guess I have to clarify a few things because apparently the whole AI
> thing was misunderstood. Find my comments below.
>
> On Wed, Jan 8, 2020 at 5:08 PM Manu Sporny <msporny@digitalbazaar.com>
> wrote:
>
> On 1/8/20 6:05 AM, Oliver Terbu wrote:
> > On the other hand, I now understand that to solve the namespace
> > problem people are happy to sacrifice security and privacy for
> > extensibility.
>
> No, that's not what's being said at all. I don't think you understand
> what people are saying in this thread.
>
> People are saying: We don't have to sacrifice anything -- you can get
> security, privacy, AND extensibility with Verifiable Credentials as
> designed.
>
>
> I do think that having JSON-LD-enabled for verifiers has security and
> privacy implications as described in my previous emails. These security and
> privacy considerations could have been completely mitigated by not using
> JSON-LD. Note, I am NOT saying the VC spec is flawed. The spec allows
> verifiers to decide whether to make use of JSON-LD:
>
> "Though this specification requires that a @context property be present,
> it is not required that the value of the @context property be processed
> using JSON-LD. This is to support processing using plain JSON libraries,
> such as those that might be used when the verifiable credential is encoded
> as a JWT."
>
> However, my question then was, why should a verifier use a JSON-LD library
> at all?
>
>
>
> > I am very glad that Joe pointed out that there is no AI that
> > would allow applications to process any variation of credentials just
> > because JSON-LD is used. This is exactly what I always said.
>
> Yes, but no one that knows what they're talking about has ever said
> that JSON-LD is a magic bullet that will solve that problem. You seem to
> be presenting a strawman argument.
>
> I also reject the notion that AI can solve this problem... let's not
> even talk about that as an option. Every time someone alludes to "AI"
> solving anything, I just replace "AI" with "magic". Please stop, AI is
> magic. Generative Adversarial Neural Networks specifically tuned to a
> particular problem space are not magic, and again, are not a silver
> bullet. :)
>
> We don't need any of that magic for Verifiable Credentials to operate as
> designed.
>
> You seem to be asserting that someone has stated that by using JSON-LD
> that they'll be able to *safely* process Verifiable Credentials where
> they don't understand the semantics of the credential. If someone has
> stated that, they are completely and absolutely wrong. That is magic.
>
>
> Absolutely agree! Please, let's not talk about AI. I never said anything
> else, or it was just misinterpreted. :)
>
>
> JSON-LD isn't magic that enables you to understand semantics that you
> had previously not understood. Software always needs to be written to
> understand the semantics -- for the next decade or more, by a human
> being that understands the semantics. What JSON-LD gives us is the
> ability to precisely identify semantic concepts in a decentralized
> manner such that every market vertical on the planet isn't forced
> through some slow W3C/IETF/OASIS standards setting process just so that
> they can have the Verifiable Credential that their market vertical needs.
>
>
> Yes, I do understand that. This is what I referred to as "extensibility"
> which I do see as a benefit but which I don't see as a legitimate reason to
> accept any tradeoffs on security and privacy. That is why I'm arguing for
> JSON-only verifiers.
>
>
>
> That is, JSON-LD gives us the ability for people to innovate at the
> edges with the types of Verifiable Credentials that are produced and
> consumed. JSON-LD *does not* give a computer the ability to magically
> understand semantics that it isn't programmed to understand.
>
>
> Fully agree and I have never said anything else.
>
>
>
> If you think the latter, you fundamentally misunderstand JSON-LD. If you
> think the JSON-LD community is espousing the latter, you fundamentally
> misunderstand the community's mental model.
>
> > However, the problem that I described is not about an arbitrary
> > context, it is about the same context under a different URL, or
> > having just an additional meaningless context that serves as a
> > tracking cookie. The JSON-LD spec still allows the retrieval of
> > references to a remote context. Note, the validation checks in the VC
> > spec are non-normative, so technically malicious issuers are able to
> > abuse that behaviour without producing invalid VCs.
>
> The Verifiable Credentials spec uses a restricted form of JSON-LD. The
> discussion in this thread is about best practices and implementations.
> If we find that we all agree on the best practice, then we can update
> the spec to contain the limitations we're discussing right now. To put
> it another way:
>
>
> I would support that and introduce some normative requirements for JSON-LD
> verifiers for validation checks.
>
>
>
> C, Rust, Java, Python, Javascript, TLS, and JSON parsers all give you a
> thousand ways to blow your foot off. That doesn't mean that those specs
> are wrong -- they're flexible by design. What separates good programs
> that use those technologies from bad ones is that the bad ones blow your
> foot off when you don't expect them to, and the good ones protect all
> your toes.
>
>
> JSON does not have these issues, it is a data interchange format and
> nothing else. JSON-LD on the other hand defines a lot of characteristics
> that are not needed. Retrieving a remote context is one of them, and that
> is the reason why we are having this discussion.
>
>
>
> We have text that talks about this in the spec, namely:
>
> https://w3c.github.io/vc-data-model/#extensibility
>
> """
> Though this specification requires that a @context property be present,
> it is not required that the value of the @context property be processed
> using JSON-LD. This is to support processing using plain JSON libraries,
> such as those that might be used when the verifiable credential is
> encoded as a JWT. All libraries or processors MUST ensure that the order
> of the values in the @context property is what is expected for the
> specific application. Libraries or processors that support JSON-LD can
> process the @context property using full JSON-LD processing as expected.
>
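
A minimal sketch of the plain-JSON path the quoted text allows, assuming a verifier that never invokes a JSON-LD processor and only checks that `@context` holds exactly the expected values in the expected order. The second URL and the names `EXPECTED_CONTEXT` and `context_order_ok` are hypothetical.

```python
import json

# The first entry is the VC data model v1 context URL; the second is a
# hypothetical application-specific context used only for illustration.
EXPECTED_CONTEXT = [
    "https://www.w3.org/2018/credentials/v1",
    "https://example.org/contexts/v1",
]


def context_order_ok(raw_credential: str) -> bool:
    """Parse with a plain JSON library and compare @context as a list.

    List equality in Python is order-sensitive, so a reordered or
    extended @context fails the check.
    """
    doc = json.loads(raw_credential)
    if not isinstance(doc, dict):
        return False
    return doc.get("@context") == EXPECTED_CONTEXT
```

Nothing here dereferences a URL; the context values are treated as opaque strings.
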
>
> I'm aware of this language in the spec and I'm quite ok with that. My
> point is: why should anyone do anything else as a verifier? Because
> ignoring JSON-LD as a verifier would result in a more interoperable, more
> secure and more efficient solution.
>
>
>
> ...
>
> A dynamic extensibility model such as this does increase the
> implementation burden. Software written for such a system has to
> determine whether verifiable credentials with extensions are acceptable
> based on the risk profile of the application. Some applications might
> accept only certain extensions while highly secure environments might
> not accept any extensions. These decisions are up to the developers of
> these applications and are specifically not the domain of this
> specification.
>
> Developers are urged to ensure that extension JSON-LD contexts are
> highly available. Implementations that cannot fetch a context will
> produce an error. Strategies for ensuring that extension JSON-LD
> contexts are always available include using content-addressed URLs for
> contexts, bundling context documents with implementations, or enabling
> aggressive caching of contexts.
> """
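
One of the strategies named in the quoted spec text, bundling context documents with the implementation, can be sketched as a loader that serves only from a local table and raises instead of going to the network. This is an illustration under stated assumptions, not the API of any particular JSON-LD library; `BUNDLED_CONTEXTS`, `UnknownContextError`, `load_context`, and the example URL and document are hypothetical.

```python
import copy


class UnknownContextError(Exception):
    """Raised when a context URL is not bundled with the implementation."""


# Hypothetical bundled table; the document is a placeholder, not a real context.
BUNDLED_CONTEXTS = {
    "https://example.org/contexts/v1": {
        "@context": {"example": "https://example.org/vocab#example"},
    },
}


def load_context(url: str) -> dict:
    """Return a copy of the bundled context for an exact URL match.

    There is deliberately no fallback that fetches the URL: an unknown
    context is an error, which is the behavior the quoted text describes
    for implementations that cannot fetch a context.
    """
    try:
        bundled = BUNDLED_CONTEXTS[url]
    except KeyError:
        raise UnknownContextError(url) from None
    # Hand out a deep copy so callers cannot mutate the bundled document.
    return copy.deepcopy(bundled)
```

A loader of this kind makes the failure mode explicit: a context that was not shipped with the verifier stops processing rather than triggering a request.
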
>
> If that's not good enough, we can improve that text in the future once
> the 2020 VCWG spins back up. We can add text to elaborate on this in the
> implementation guidance.
>
>
> Yes, I would support that.
>
>
>
> > If the only thing you need is to identify that a response is a
> > certain object, then there are of course simpler solutions even based
> > on the current W3C VC spec.
>
> Simpler solutions, like?
>
>
> I'm not against the context in general. For example, you could still use
> the information provided in the context for that. Note, I was
> concerned about JSON-LD verifiers and I was not worried about providing a
> valid context in the VC.
>
>
>
> -- manu
>
> --
> Manu Sporny (skype: msporny, twitter: manusporny)
> Founder/CEO - Digital Bazaar, Inc.
> blog: Veres One Decentralized Identifier Blockchain Launches
> https://tinyurl.com/veres-one-launches
>
>
> --
> Joe Andrieu, PMP
> joe@legreq.com
> LEGENDARY REQUIREMENTS
> +1(805)705-8651
> Do what matters.
> http://legreq.com <http://www.legendaryrequirements.com>
>
>
>
Received on Thursday, 9 January 2020 00:09:55 UTC