Re: Signing and Verifying RDF Datasets for Dummies (like Me!) from Dan Brickley on 2021-06-13 (semantic-web@w3.org from June 2021)

From: Dan Brickley <danbri@danbri.org>
Date: Sun, 13 Jun 2021 09:06:34 +0100
To: "Eric Prud'hommeaux" <eric@w3.org>
Cc: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, semantic-web@w3.org
Message-ID: <CAFfrAFoks-HACO+gw9gBHA3x5kY=fiDfHvTQwJVcAf4isuhuOA@mail.gmail.com>
On Sat, 12 Jun 2021 at 23:18, Eric Prud'hommeaux <eric@w3.org> wrote:

> On Fri, Jun 11, 2021 at 07:37:57AM -0400, Peter F. Patel-Schneider wrote:
> > On 6/11/21 3:33 AM, Eric Prud'hommeaux wrote:
> > > On Wed, Jun 09, 2021 at 01:45:10PM -0400, Peter Patel-Schneider wrote:
> >
> > [...]
> >
> > > > But the receiver needs to perform two expansions.  The first happens
> > > > when the receiver runs the verify algorithm.  The second happens when
> > > > the receiver uses the transmitted document to construct the RDF
> > > > dataset.  These two operations can be separated by an arbitrary
> > > > amount of time and can be done in different environments with the
> > > > result that the receiver constructs a different dataset from what
> > > > the verify algorithm verified.
> > > >
> > > > And remote contexts and other environmental concerns can be
> > > > constructed and manipulated so that the verify algorithm sees what
> > > > the originator signed and thus returns true but the receiver
> > > > constructs something different.  This can happen by accident, such
> > > > as when a remote context is updated, or by an opponent, for example
> > > > by modifying a transmitted document to inject a remote context that
> > > > is modified between verification and construction.
> > >
> > > Some libraries probably double-expand by default, and you're right,
> > > that's neither efficient nor safe. That deserves some text in the spec
> > > and a test endpoint that tries to exploit it (e.g. it maps
> > > `p`=>`http://a.example/p1` on the first GET,
> > > `p`=>`http://a.example/p2` on the 2nd, etc).
> > >
> > > If someone finds it much easier to implement their library with
> > > double-expansion, it can still be safe to double-expand if the
> > > documentLoader overrides cache pragmas for the duration of a
> > > verification. By default, the Digital Bazaar stack works with local
> > > copies anyways so you have to go to some effort to create Manu's
> > > "footgun".
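
A minimal sketch of the exploit endpoint and the pinned loader described above, assuming a toy `expand` that only maps terms to IRIs. `FlippingContextServer`, `PinnedLoader` and the context URL `http://a.example/ctx` are hypothetical names for illustration, not part of any JSON-LD library.

```python
import itertools


class FlippingContextServer:
    """Hypothetical remote context endpoint: maps `p` to a new IRI on every GET."""

    def __init__(self):
        self._count = itertools.count(1)

    def get(self, url):
        n = next(self._count)
        return {"p": f"http://a.example/p{n}"}


class PinnedLoader:
    """Hypothetical document loader that keeps the first copy of each context,
    ignoring whatever cache pragmas the server sends."""

    def __init__(self, fetch):
        self._fetch = fetch
        self._cache = {}

    def __call__(self, url):
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]


def expand(doc, load_context):
    """Toy expansion (an assumption): replace each term with its IRI."""
    ctx = load_context(doc["@context"])
    return {ctx[k]: v for k, v in doc.items() if k != "@context"}


doc = {"@context": "http://a.example/ctx", "p": "v"}

# Double expansion against the live endpoint: the two results differ.
server = FlippingContextServer()
first = expand(doc, server.get)    # what the verifier would see
second = expand(doc, server.get)   # what the application would load

# Double expansion through a pinned loader: both results agree.
pinned = PinnedLoader(FlippingContextServer().get)
verified = expand(doc, pinned)
used = expand(doc, pinned)
```

With the live endpoint, `first` uses `http://a.example/p1` and `second` uses `http://a.example/p2`. With the pinned loader, both expansions see the same context, so double expansion stays consistent for as long as the loader instance lives.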
> >
> > The point here is that the algorithms in
> > https://w3c-ccg.github.io/ld-proofs/ require double expansion so any
> > library that uses these algorithms to both verify and expand will have
> > to do double expansion.  And to prevent manipulation all this has to be
> > done within the trust boundary.  And time and space have to be suspended.
>
> A couple messages back:
> [[
> Date: Wed, 9 Jun 2021 13:10:28 +0200
> From: Eric Prud'hommeaux <eric@w3.org>
> To: Peter Patel-Schneider <pfpschneider@gmail.com>
> Cc: semantic-web@w3.org
> Subject: Re: Signing and Verifying RDF Datasets for Dummies (like Me!)
> Message-ID: <20210609111028.GA6976@w3.org>
> ]]
> , I stepped through the algorithm and showed that it takes as input an
> RDF graph. Given that, I don't see the justification for your
> assertion that it requires double expansion.
>
>
> > Why require all this extra effort?  If the use of a document format that
> > has a unique expansion to an RDF graph or dataset is required then none
> > of this is necessary.
>
> I think "unique expansion" means no BNodes. If so, that's a pretty
> small subset of the RDF in common use.
>
>
> >                 The trust boundary can enclose a much smaller area.  The
> > document itself is signed so the validation is much closer to the
> > standard validation for documents.  Recipients can expand at their
> > leisure.  (In any case some recipients will expand at their leisure,
> > trusting that this is allowable because the validation succeeded.  The
> > only way to prevent this would be to encrypt the message so that only
> > trusted libraries can expand it.)
>
> No spec can prevent bad implementations; the best they can do is work
> with a community to see how to express the spec in a way that enables
> good implementations.
>
> It's trivial for an implementation to expand a JSON-LD document to
> RDF, pass it to a canonicalizer, then hash it and verify the
> signature. If the sig is good, the app has the expanded doc to use
> however it sees fit. That document was expanded once in this process.
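
A rough sketch of that single-expansion pipeline, under loud assumptions: the context is a local copy, the graph has no blank nodes (so sorting N-Triples-style lines stands in for a real canonicalization algorithm such as URDNA2015), and an HMAC stands in for a public-key signature. All function names here are hypothetical.

```python
import hashlib
import hmac

# Assumption: a local copy of the context, so no network fetch is involved.
CONTEXT = {"name": "http://schema.org/name"}


def expand_to_triples(doc, context):
    """Toy expansion: one triple per non-keyword key of the document."""
    subject = doc["@id"]
    return [(subject, context[k], v) for k, v in doc.items() if not k.startswith("@")]


def canonicalize(triples):
    """Stand-in canonicalizer: sorted lines, valid only without blank nodes.
    Literal escaping is ignored for brevity."""
    lines = sorted(f'<{s}> <{p}> "{o}" .' for s, p, o in triples)
    return "\n".join(lines) + "\n"


def sign(triples, key):
    """Hash the canonical form, then MAC the hash (stand-in for a signature)."""
    digest = hashlib.sha256(canonicalize(triples).encode("utf-8")).digest()
    return hmac.new(key, digest, hashlib.sha256).hexdigest()


def verify_and_return(doc, context, key, signature):
    """Expand exactly once; hand the verified triples back to the caller."""
    triples = expand_to_triples(doc, context)  # the only expansion
    if not hmac.compare_digest(sign(triples, key), signature):
        raise ValueError("signature mismatch")
    return triples


key = b"demo-key"
doc = {"@id": "http://a.example/s", "name": "Alice"}
sig = sign(expand_to_triples(doc, CONTEXT), key)
triples = verify_and_return(doc, CONTEXT, key, sig)
```

The application then works from `triples`, the same object that was hashed and checked, rather than expanding the document a second time.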
>
>
> > > This issue is further evidence that a WG product would increase
> > > security and community understanding around security issues. Most of
> > > the obvious ways to sign JSON-LD introduce this sort of
> > > vulnerability. No WG leads to any of:
> > >
> > > 0. No action: most folks won't consider dereferenced evaluation
> > >    vulnerabilities present in JSON-LD pipelines that don't include
> > >    some verification.
> > >
> > > 1. standard JWS over JSON-LD doc: this signs the JSON tree but not the
> > >    RDF expansion.
> > >
> > > 2. Homegrown signature stacks: likely to include atomic operations
> > >    that separate verification from expansion (e.g. for populating a
> > >    store) and so are subject to your timing attack.
> > >
> > > A WG product can raise awareness of these issues across all JSON-LD
> > > pipelines (or any dereferenced evaluation pipelines) and provide
> > > recipes and tools for securing them.
> >
> > If you want to send a JSON-LD document, send it as a signed document.
>
> Do you mean (here and following) to just sign the bytes of the JSON-LD
> document? You were just arguing that the double expansion of a JSON-LD
> document is a flaw in RDF Signatures. Now it appears you're suggesting
> that folks sign the JSON, which takes zero precaution against someone
> changing the context at any point. It appears you have much looser
> requirements for the do-nothing proposal than for the charter.
>

Signing the surface form does at least make clear exactly what
human-facing content was signed, including its external dependencies.
The same goes for signing raw RDFa, Microdata, etc.
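
A small illustration of what a signature over the surface bytes does and does not cover, assuming an HMAC as a stand-in for a real signature and a toy `expand`. The key, the context URL and both context versions are made up for the example.

```python
import hashlib
import hmac
import json

KEY = b"demo-key"  # hypothetical shared secret, for illustration only


def sign_bytes(data):
    """Sign the exact bytes of the document (HMAC as a stand-in)."""
    return hmac.new(KEY, data, hashlib.sha256).hexdigest()


def expand(data, context):
    """Toy expansion (an assumption): replace each term with its IRI."""
    doc = json.loads(data)
    return {context[k]: v for k, v in doc.items() if k != "@context"}


# The signed bytes name the external dependency (the context URL)...
surface = b'{"@context": "http://a.example/ctx", "p": "v"}'
signature = sign_bytes(surface)

# ...but not the content served at that URL, which may change later.
ctx_at_signing = {"p": "http://a.example/p1"}
ctx_later = {"p": "http://a.example/p2"}

still_valid = hmac.compare_digest(sign_bytes(surface), signature)
same_graph = expand(surface, ctx_at_signing) == expand(surface, ctx_later)
```

Here `still_valid` is true while `same_graph` is false: the signature pins down the document and the name of its dependency, and a reader can see that, but it says nothing about the graph a later expansion will yield.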

Dan





> > If you want to send an RDFa document, send it as a signed document.  If
> > you want to send a Turtle document, send it as a signed document.  If you
> > want to send an RDF graph or dataset send it as a signed N-Triples or
> > N-Quads document.  Don't send JSON-LD or RDFa or Turtle and some other
> > stuff.  If you want to canonicalize an RDF graph or dataset, canonicalize
> > it.  If you want to canonicalize and send, send a signed canonicalized
> > N-Triples or N-Quads document.   There is enough here for a WG.
>
> I suspect you're restricting the charter here but I'm not sure I
> understand the proposal. You'll have to be more explicit about what
> you intend to rule out.
>
>
> > If you want to create a vocabulary for proofs and other verification data
> > create a vocabulary.   There is enough here for another WG.
> >
> > peter
>
>
Received on Sunday, 13 June 2021 08:07:20 UTC