Re: Signing and Verifying RDF Datasets for Dummies (like Me!) from Eric Prud'hommeaux on 2021-06-12 (semantic-web@w3.org from June 2021)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Sun, 13 Jun 2021 00:11:49 +0200
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: semantic-web@w3.org
Message-ID: <20210612221149.GF8464@w3.org>
On Fri, Jun 11, 2021 at 07:37:57AM -0400, Peter F. Patel-Schneider wrote:
> On 6/11/21 3:33 AM, Eric Prud'hommeaux wrote:
> > On Wed, Jun 09, 2021 at 01:45:10PM -0400, Peter Patel-Schneider wrote:
> 
> [...]
> 
> > > But the receiver needs to perform two expansions.  The first happens
> > > when the receiver runs the verify algorithm.  The second happens when
> > > the receiver uses the transmitted document to construct the RDF
> > > dataset.  These two operations can be separated by an arbitrary amount
> > > of time and can be done in different environments with the result that
> > > the receiver constructs a different dataset from what the verify
> > > algorithm verified.
> > >
> > > And remote contexts and other environmental concerns can be constructed
> > > and manipulated so that the verify algorithm sees what the originator
> > > signed and thus returns true but the receiver constructs something
> > > different.  This can happen by accident, such as when a remote context
> > > is updated, or by an opponent, for example by modifying a transmitted
> > > document to inject a remote context that is modified between
> > > verification and construction.
> >
> > Some libraries probably double-expand by default, and you're right,
> > that's neither efficient nor safe. That deserves some text in the spec
> > and a test endpoint that tries to exploit it (e.g. it maps
> > `p`=>`http://a.example/p1` on the first GET,
> > `p`=>`http://a.example/p2` on the 2nd, etc).
> >
> > If someone finds it much easier to implement their library with
> > double-expansion, it can still be safe to double-expand if the
> > documentLoader overrides cache pragmas for the duration of a
> > verification. By default, the Digital Bazaar stack works with local
> > copies anyways so you have to go to some effort to create Manu's
> > "footgun".
> 
> The point here is that the algorithms in
> https://w3c-ccg.github.io/ld-proofs/ require double expansion so any library
> that uses these algorithms to both verify and expand will have to do double
> expansion.  And to prevent manipulation all this has to be done within the
> trust boundary.  And time and space have to be suspended.

A couple messages back:
[[
Date: Wed, 9 Jun 2021 13:10:28 +0200
From: Eric Prud'hommeaux <eric@w3.org>
To: Peter Patel-Schneider <pfpschneider@gmail.com>
Cc: semantic-web@w3.org
Subject: Re: Signing and Verifying RDF Datasets for Dummies (like Me!)
Message-ID: <20210609111028.GA6976@w3.org>
]]
, I stepped through the algorithm and showed that it takes as input an
RDF graph. Given that, I don't see the justification for your
assertion that it requires double expansion.
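
For a library that does double-expand, the cache-pinning idea quoted
above could look roughly like the sketch below. It is a toy model, not
the API of any real JSON-LD library: MutatingContextEndpoint,
PinnedLoader and CONTEXT_URL are hypothetical names, and the endpoint
simply imitates the proposed test endpoint that maps `p` to a new IRI
on every GET.

```python
CONTEXT_URL = "http://a.example/ctx"  # hypothetical remote context URL


class MutatingContextEndpoint:
    """Hypothetical hostile endpoint: maps `p` to a new IRI on every GET."""

    def __init__(self):
        self.hits = 0

    def get(self, url):
        self.hits += 1
        return {"p": f"http://a.example/p{self.hits}"}


class PinnedLoader:
    """Toy document loader that fetches each context URL at most once.

    It ignores any cache pragmas, so every expansion done through the
    same loader instance sees the same context.
    """

    def __init__(self, fetch):
        self._fetch = fetch
        self._pinned = {}

    def load(self, url):
        if url not in self._pinned:
            self._pinned[url] = self._fetch(url)
        return self._pinned[url]


endpoint = MutatingContextEndpoint()
loader = PinnedLoader(endpoint.get)

first = loader.load(CONTEXT_URL)   # used by the verification expansion
second = loader.load(CONTEXT_URL)  # used by the later expansion
unpinned = endpoint.get(CONTEXT_URL)  # what a fresh GET would return
```

Both loads through the pinned loader return the same mapping, while
the direct GET already sees a different one. The protection only holds
if the same loader instance serves both expansions.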


> Why require all this extra effort?  If the use of a document format that has
> a unique expansion to an RDF graph or dataset is required then none of this
> is necessary.

I think "unique expansion" means no BNodes. If so, that's a pretty
small subset of the RDF in common use.


>                 The trust boundary can enclose a much smaller area.  The
> document itself is signed so the validation is much closer to the standard
> validation for documents.  Recipients can expand at their leisure.  (In any
> case some recipients will expand at their leisure, trusting that this is
> allowable because the validation succeeded.  The only way to prevent this
> would be to encrypt the message so that only trusted libraries can expand
> it.)

No spec can prevent bad implementations; the best they can do is work
with a community to see how to express the spec in a way that enables
good implementations.

It's trivial for an implementation to expand a JSON-LD document to
RDF, pass it to a canonicalizer, then hash it and verify the
signature. If the sig is good, the app has the expanded doc to use
however it sees fit. That document was expanded once in this process.
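
A minimal sketch of that single-expansion pipeline, under loud
assumptions: expand() is a toy stand-in for JSON-LD expansion,
canonicalize() just sorts N-Triples-like lines and does not handle
BNodes, and an HMAC stands in for a real digital signature. All
function names here are illustrative and not taken from any spec or
library.

```python
import hashlib
import hmac


def expand(doc, context):
    """Toy stand-in for JSON-LD expansion: map each term to an IRI."""
    subject = doc["@id"]
    return [
        (subject, context[term], value)
        for term, value in doc.items()
        if term != "@id"
    ]


def canonicalize(triples):
    """Toy canonicalization: sorted N-Triples-like lines, no BNode handling."""
    lines = sorted(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)
    return "\n".join(lines) + "\n"


def sign(triples, key):
    """Hash the canonical form, then MAC the hash (stand-in for a signature)."""
    digest = hashlib.sha256(canonicalize(triples).encode("utf-8")).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()


def verify_and_return(doc, context, signature, key):
    """Expand once, verify, and hand back the very triples that were verified."""
    triples = expand(doc, context)  # the only expansion in the pipeline
    if not hmac.compare_digest(sign(triples, key), signature):
        raise ValueError("signature does not match the expanded graph")
    return triples


KEY = b"demo-key"
CONTEXT = {"p": "http://a.example/p1"}
DOC = {"@id": "http://a.example/s", "p": "v"}

SIGNATURE = sign(expand(DOC, CONTEXT), KEY)
verified = verify_and_return(DOC, CONTEXT, SIGNATURE, KEY)
```

The app then works with `verified` directly, so there is no second
expansion for a changed context to interfere with.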


> > This issue is further evidence that a WG product would increase
> > security and community understanding around security issues. Most of
> > the obvious ways to sign JSON-LD introduce this sort of
> > vulnerability. No WG leads to any of:
> >
> > 0. No action: most folks won't consider dereferenced evaluation
> >    vulnerabilities present in JSON-LD pipelines that don't include
> >    some verification.
> >
> > 1. standard JWS over JSON-LD doc: this signs the JSON tree but not the
> >    RDF expansion.
> >
> > 2. Homegrown signature stacks: likely to include atomic operations
> >    that separate verification from expansion (e.g. for populating a
> >    store) and are therefore subject to your timing attack.
> >
> > A WG product can raise awareness of these issues across all JSON-LD
> > pipelines (or any dereferenced evaluation pipelines) and provide
> > recipes and tools for securing them.
> 
> If you want to send a JSON-LD document, send it as a signed document.

Do you mean (here and following) to just sign the bytes of the JSON-LD
document? You were just arguing that the double expansion of a JSON-LD
document is a flaw in RDF Signatures. Now it appears you're suggesting
that folks sign the JSON, which takes zero precaution against someone
changing the context at any point. It appears you have much looser
requirements for the do-nothing proposal than for the charter.
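
To make the concern concrete, here is a toy sketch of byte-level
signing. The context URL, the two context versions, and expand() are
all hypothetical stand-ins; a real pipeline would dereference the
context over HTTP and use a real JSON-LD processor and signature
scheme rather than a bare hash.

```python
import hashlib
import json

# Hypothetical remote context as served at signing time and again later.
CONTEXT_URL = "http://a.example/ctx"
context_at_signing = {CONTEXT_URL: {"p": "http://a.example/p1"}}
context_at_use = {CONTEXT_URL: {"p": "http://a.example/p2"}}


def expand(doc, remote):
    """Toy stand-in for JSON-LD expansion against a dereferenced context."""
    context = remote[doc["@context"]]
    return [
        (doc["@id"], context[term], value)
        for term, value in doc.items()
        if not term.startswith("@")
    ]


doc_bytes = (
    b'{"@context": "http://a.example/ctx", '
    b'"@id": "http://a.example/s", "p": "v"}'
)
# The digest of the bytes is all that a byte-level signature covers.
signed_digest = hashlib.sha256(doc_bytes).hexdigest()

doc = json.loads(doc_bytes)
graph_at_signing = expand(doc, context_at_signing)
graph_at_use = expand(doc, context_at_use)

bytes_still_verify = hashlib.sha256(doc_bytes).hexdigest() == signed_digest
graphs_match = graph_at_signing == graph_at_use
```

The bytes still verify after the context changes, yet the two
expansions differ in their predicate, which is the gap between signing
the JSON tree and signing the RDF.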

>                                                                         If
> you want to send an RDFa document, send it as a signed document.  If you
> want to send a Turtle document, send it as a signed document.  If you want
> to send an RDF graph or dataset send it as a signed N-Triples or N-Quads
> document.  Don't send JSON-LD or RDFa or Turtle and some other stuff.  If
> you want to canonicalize an RDF graph or dataset, canonicalize it.  If you
> want to canonicalize and send, send a signed canonicalized N-Triples or
> N-Quads document.   There is enough here for a WG.

I suspect you're restricting the charter here but I'm not sure I
understand the proposal. You'll have to be more explicit about what
you intend to rule out.


> If you want to create a vocabulary for proofs and other verification data
> create a vocabulary.   There is enough here for another WG.
> 
> peter
> 
Received on Saturday, 12 June 2021 22:12:12 UTC