Re: Signing and Verifying RDF Datasets for Dummies (like Me!)

On Sun, Jun 13, 2021 at 09:06:34AM +0100, Dan Brickley wrote:
> On Sat, 12 Jun 2021 at 23:18, Eric Prud'hommeaux <eric@w3.org> wrote:
> 
> > On Fri, Jun 11, 2021 at 07:37:57AM -0400, Peter F. Patel-Schneider wrote:
> > > On 6/11/21 3:33 AM, Eric Prud'hommeaux wrote:
> > > > On Wed, Jun 09, 2021 at 01:45:10PM -0400, Peter Patel-Schneider wrote:
> > >
> > > [...]
> > >
> > > > > But the receiver needs to perform two expansions.  The first happens
> > > > > when the receiver runs the verify algorithm.  The second happens when
> > > > > the receiver uses the transmitted document to construct the RDF
> > > > > dataset.  These two operations can be separated by an arbitrary
> > > > > amount of time and can be done in different environments with the
> > > > > result that the receiver constructs a different dataset from what
> > > > > the verify algorithm verified.
> > > > >
> > > > > And remote contexts and other environmental concerns can be
> > > > > constructed and manipulated so that the verify algorithm sees what
> > > > > the originator signed and thus returns true but the receiver
> > > > > constructs something different.  This can happen by accident, such
> > > > > as when a remote context is updated, or by an opponent, for example
> > > > > by modifying a transmitted document to inject a remote context that
> > > > > is modified between verification and construction.
> > > >
> > > > Some libraries probably double-expand by default, and you're right,
> > > > that's neither efficient nor safe. That deserves some text in the spec
> > > > and a test endpoint that tries to exploit it (e.g. it maps
> > > > `p`=>`http://a.example/p1` on the first GET,
> > > > `p`=>`http://a.example/p2` on the 2nd, etc).
> > > >
> > > > If someone finds it much easier to implement their library with
> > > > double-expansion, it can still be safe to double-expand if the
> > > > documentLoader overrides cache pragmas for the duration of a
> > > > verification. By default, the Digital Bazaar stack works with local
> > > > copies anyway, so you have to go to some effort to create Manu's
> > > > "footgun".
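
Both ideas above can be sketched in a few lines of Python. This is a
hypothetical illustration, not any real JSON-LD library's API: a
flip-flopping context source standing in for the hostile test endpoint,
and a pinning wrapper standing in for a documentLoader that overrides
cache pragmas for the duration of a verification.

```python
import itertools

class FlipFlopContextServer:
    """Simulates the hostile test endpoint described above: each GET of
    the same context URL maps the term 'p' to a different IRI."""
    def __init__(self):
        self._iris = itertools.cycle(
            ["http://a.example/p1", "http://a.example/p2"])

    def fetch(self, url):
        # A real endpoint would serve this as application/ld+json.
        return {"@context": {"p": {"@id": next(self._iris)}}}

class PinningLoader:
    """documentLoader wrapper that caches each context for the life of
    one verification, so a double expansion sees identical contexts."""
    def __init__(self, fetch):
        self._fetch, self._cache = fetch, {}

    def __call__(self, url):
        if url not in self._cache:
            self._cache[url] = self._fetch(url)
        return self._cache[url]
```

With a bare loader, two expansions of the same document disagree on
what `p` denotes; behind the pinning wrapper they agree.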
> > >
> > > The point here is that the algorithms in
> > > https://w3c-ccg.github.io/ld-proofs/ require double expansion so any
> > > library that uses these algorithms to both verify and expand will have
> > > to do double expansion.  And to prevent manipulation all this has to
> > > be done within the trust boundary.  And time and space have to be
> > > suspended.
> >
> > A couple messages back:
> > [[
> > Date: Wed, 9 Jun 2021 13:10:28 +0200
> > From: Eric Prud'hommeaux <eric@w3.org>
> > To: Peter Patel-Schneider <pfpschneider@gmail.com>
> > Cc: semantic-web@w3.org
> > Subject: Re: Signing and Verifying RDF Datasets for Dummies (like Me!)
> > Message-ID: <20210609111028.GA6976@w3.org>
> > ]]
> > , I stepped through the algorithm and showed that it takes as input an
> > RDF graph. Given that, I don't see the justification for your
> > assertion that it requires double expansion.
> >
> >
> > > Why require all this extra effort?  If the use of a document format
> > > that has a unique expansion to an RDF graph or dataset is required
> > > then none of this is necessary.
> >
> > I think "unique expansion" means no BNodes. If so, that's a pretty
> > small subset of the RDF in common use.
> >
> >
> > >                 The trust boundary can enclose a much smaller area.  The
> > > document itself is signed so the validation is much closer to the
> > > standard validation for documents.  Recipients can expand at their
> > > leisure.  (In any case some recipients will expand at their leisure,
> > > trusting that this is allowable because the validation succeeded.  The
> > > only way to prevent this would be to encrypt the message so that only
> > > trusted libraries can expand it.)
> >
> > No spec can prevent bad implementations; the best they can do is work
> > with a community to see how to express the spec in a way that enables
> > good implementations.
> >
> > It's trivial for an implementation to expand a JSON-LD document to
> > RDF, pass it to a canonicalizer, then hash it and verify the
> > signature. If the sig is good, the app has the expanded doc to use
> > however it sees fit. That document was expanded once in this process.
> >
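
That single-expansion flow can be sketched as follows. The sort-based
canonicalization and the HMAC signature here are toy stand-ins for
URDNA2015 and a real LD-proof suite, chosen only to keep the sketch
self-contained.

```python
import hashlib
import hmac

def canonicalize(nquads_lines):
    """Toy stand-in for URDNA2015: real canonicalization also relabels
    blank nodes; sorting only works here because there are none."""
    return "\n".join(sorted(nquads_lines)).encode()

def sign(nquads_lines, key):
    """Sign the hash of the canonical form (HMAC as a toy scheme)."""
    digest = hashlib.sha256(canonicalize(nquads_lines)).digest()
    return hmac.new(key, digest, hashlib.sha256).digest()

def verify(nquads_lines, key, signature):
    # The caller expanded the JSON-LD to these quads exactly once;
    # the same quads are what the application goes on to use.
    return hmac.compare_digest(sign(nquads_lines, key), signature)
```

Because the signature covers the canonical form, reordering the quads
still verifies, while tampering with any quad fails.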
> >
> > > > This issue is further evidence that a WG product would increase
> > > > security and community understanding around security issues. Most of
> > > > the obvious ways to sign JSON-LD introduce this sort of
> > > > vulnerability. No WG leads to any of:
> > > >
> > > > 0. No action: most folks won't consider dereferenced evaluation
> > > >    vulnerabilities present in JSON-LD pipelines that don't include
> > > >    some verification.
> > > >
> > > > 1. Standard JWS over a JSON-LD doc: this signs the JSON tree but
> > > >    not the RDF expansion.
> > > >
> > > > 2. Homegrown signature stacks: likely to include atomic operations
> > > >    that separate verification from expansion (e.g. for populating a
> > > >    store) and are subject to your timing attack.
> > > >
> > > > A WG product can raise awareness of these issues across all JSON-LD
> > > > pipelines (or any dereferenced evaluation pipelines) and provide
> > > > recipes and tools for securing them.
> > >
> > > If you want to send a JSON-LD document, send it as a signed document.
> >
> > Do you mean (here and following) to just sign the bytes of the JSON-LD
> > document? You were just arguing that the double expansion of a JSON-LD
> > document is a flaw in RDF Signatures. Now it appears you're suggesting
> > that folks sign the JSON, which takes zero precaution against someone
> > changing the context at any point. It appears you have much looser
> > requirements for the do-nothing proposal than for the charter.
> >
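
For contrast, a byte-level signature over the surface form (a toy HMAC
standing in here for e.g. a detached JWS) never inspects what the remote
context resolves to, so a later change to that context leaves the
signature valid while changing the derived RDF:

```python
import hashlib
import hmac
import json

def sign_surface(doc_bytes, key):
    """Toy byte-level signature over the serialized JSON-LD document
    (a hypothetical stand-in for a real JWS)."""
    return hmac.new(key, doc_bytes, hashlib.sha256).digest()

key = b"shared-secret"
doc = json.dumps(
    {"@context": "http://a.example/ctx", "p": "v"},
    sort_keys=True).encode()
sig = sign_surface(doc, key)

# The signed bytes mention only the context URL, never the IRI that
# 'p' expands to, so the server behind that URL can flip 'p' from
# http://a.example/p1 to http://a.example/p2 without invalidating sig.
assert hmac.compare_digest(sign_surface(doc, key), sig)
assert b"a.example/p1" not in doc
```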
> 
> Signing the surface form does at least make clear exactly what the
> human-facing content being signed was, including its external
> dependencies. Same goes for signing raw RDFa, microdata, etc.

Sure, but it's exactly counter to PFPS's desire to prevent changes to
an @context from skewing the RDF assertions. If you care about that,
how could throwing away that use case be better?

RDF Signatures don't solve the user presentation problem. That's
probably solvable, at least to a degree that could provide evidence
in court, but data signature is much simpler and achievable with the
tools on hand.


> Dan
> 
> > >
> > > If you want to send an RDFa document, send it as a signed document.
> > > If you want to send a Turtle document, send it as a signed document.
> > > If you want to send an RDF graph or dataset send it as a signed
> > > N-Triples or N-Quads document.  Don't send JSON-LD or RDFa or Turtle
> > > and some other stuff.  If you want to canonicalize an RDF graph or
> > > dataset, canonicalize it.  If you want to canonicalize and send, send
> > > a signed canonicalized N-Triples or N-Quads document.  There is
> > > enough here for a WG.
> >
> > I suspect you're restricting the charter here but I'm not sure I
> > understand the proposal. You'll have to be more explicit about what
> > you intend to rule out.
> >
> >
> > > If you want to create a vocabulary for proofs and other verification data
> > > create a vocabulary.   There is enough here for another WG.
> > >
> > > peter
> > >
> >
> >

Received on Sunday, 13 June 2021 09:14:24 UTC