Re: Signing and Verifying RDF Datasets for Dummies (like Me!) from Eric Prud'hommeaux on 2021-06-14 (semantic-web@w3.org from June 2021)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Mon, 14 Jun 2021 23:57:45 +0200
To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
Cc: semantic-web@w3.org
Message-ID: <20210614215745.GM8464@w3.org>
On Sun, Jun 13, 2021 at 04:26:57PM -0400, Peter F. Patel-Schneider wrote:
> 
> On 6/13/21 10:45 AM, Eric Prud'hommeaux wrote:
> > On Sun, Jun 13, 2021 at 08:55:34AM -0400, Peter F. Patel-Schneider wrote:
> > > On 6/12/21 6:11 PM, Eric Prud'hommeaux wrote:
> > > > On Fri, Jun 11, 2021 at 07:37:57AM -0400, Peter F. Patel-Schneider wrote:
> > > > > On 6/11/21 3:33 AM, Eric Prud'hommeaux wrote:
> > > > > > On Wed, Jun 09, 2021 at 01:45:10PM -0400, Peter Patel-Schneider wrote:
> > > > > [...]
> > > > > No spec can prevent bad implementations; the best they can do is work
> > > > > with a community to see how to express the spec in a way that enables
> > > > > good implementations.
> > > > > 
> > > > > It's trivial for an implementation to expand a JSON-LD document to
> > > > > RDF, pass it to canonicalizer, then hash it and verify the
> > > > > signature. If the sig is good, the app has the expanded doc to use
> > > > > however it sees fit. That document was expanded once in this process.
> > > > > 
> > > > > 
> > > But this isn't how the algorithms in https://w3c-ccg.github.io/ld-proofs/
> > > work.   Perhaps this change would alleviate some of the problems I have
> > > identified.
> > Can you state where the algorithm dictates double-expansion? My reading
> > of the algorithm is that it starts with an RDF graph and never hints
> > at any sort of expansion after that.
> 
> 
> The proof verification algorithm takes a signed linked data document and
> produces a boolean result.   Internally there is an expansion.   For the
> recipient to get the RDF graph or dataset requires a second expansion.

Ahh, it does indeed say that above the bullet list. I'd derived from
<https://json-ld.github.io/rdf-dataset-canonicalization/spec/#canonicalization>
that it took as input an "input dataset", i.e. "The abstract RDF
dataset that is provided as input to the algorithm."

At any rate, this is pretty trivially addressed by making it return a
dataset or an error.
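
As a minimal sketch of what "return a dataset or an error" could look
like (not the ld-proofs algorithm): the names verify_and_return,
sign_dataset and VerificationError are hypothetical, HMAC stands in for
a real signature scheme, and sorting N-Quads lines stands in for real
canonicalization, which would also have to label blank nodes.

```python
import hashlib
import hmac
from typing import Callable, FrozenSet, Iterable


class VerificationError(Exception):
    """Raised when the signature does not match the expanded dataset."""


def canonical_bytes(dataset: Iterable[str]) -> bytes:
    # Toy canonicalization: de-duplicate and sort N-Quads lines.
    # Assumes no blank nodes; a real canonicalizer must label them.
    lines = sorted({quad.rstrip("\n") + "\n" for quad in dataset})
    return "".join(lines).encode("utf-8")


def sign_dataset(dataset: Iterable[str], key: bytes) -> str:
    # HMAC-SHA256 is a stand-in for a real public-key signature.
    return hmac.new(key, canonical_bytes(dataset), hashlib.sha256).hexdigest()


def verify_and_return(
    document: str,
    signature: str,
    key: bytes,
    expand: Callable[[str], Iterable[str]],
) -> FrozenSet[str]:
    # Expand exactly once, then hand that same dataset back to the caller.
    dataset = frozenset(expand(document))
    expected = sign_dataset(dataset, key)
    if not hmac.compare_digest(expected, signature):
        raise VerificationError("signature does not match expanded dataset")
    return dataset
```

The caller then uses the returned dataset directly, so nothing can
change between verification and use.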


> > > > > > This issue is further evidence that a WG product would increase
> > > > > > security and community understanding around security issues. Most of
> > > > > > > the obvious ways to sign JSON-LD introduce this sort of
> > > > > > vulnerability. No WG leads to any of:
> > > > > > 
> > > > > > 0. No action: most folks won't consider dereferenced evaluation
> > > > > >       vulnerabilities present in JSON-LD pipelines that don't include
> > > > > >       some verification.
> > > > > > 
> > > > > > 1. standard JWS over JSON-LD doc: this signs the JSON tree but not the
> > > > > >       RDF expansion.
> > > > > > 
> > > > > > > 2. Homegrown signature stacks: likely to include atomic operations
> > > > > > >       that separate verification from expansion (for e.g. populating a
> > > > > > >       store) and so be subject to your timing attack.
> > > > > > 
> > > > > > > A WG product can raise awareness of these issues
> > > > > > across all JSON-LD pipelines (or any dereferenced evaluation
> > > > > > pipelines) and provide recipes and tools for securing them.
> > > > > If you want to send a JSON-LD document, send it as a signed document.
> > > > Do you mean (here and following) to just sign the bytes of the JSON-LD
> > > > document? You were just arguing that the double expansion of a JSON-LD
> > > > document is a flaw in RDF Signatures. Now it appears you're suggesting
> > > > that folks sign the JSON, which takes zero precaution against someone
> > > > changing the context at any point. It appears you have much looser
> > > > requirements for the do-nothing proposal than for the charter.
> > > If the use case requires sending JSON-LD documents around, then send signed
> > > JSON-LD documents and accept the issues with expansion to an RDF graph or
> > > dataset.
> > Just to be clear; your objections to the charter include an attack
> > based on changing the context in between a verification and acting on
> > the expansion. Instead, you advocate offering no signature of the
> > expansion.
> No.  If you want to securely transmit an RDF graph or dataset send a signed
> N-Triples or N-Quads document.  If you want to do something else you can
> send a different kind of document, signed if you prefer.

OK, I think I understand that you want to restrict RDF Signatures from
being transmitted over any format which requires dereferencing some
other resource to produce an RDF dataset.
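
The trade-off between signing the octets and signing the expansion can
be sketched as follows. Everything here is a toy under stated
assumptions: toy_expand is a hypothetical stand-in for JSON-LD
expansion, the contexts are plain dicts rather than dereferenced
resources, and HMAC stands in for a real signature.

```python
import hashlib
import hmac
import json


def toy_expand(doc: dict, contexts: dict) -> list:
    # Hypothetical expansion: map each term to an IRI via the context
    # that the document references by URL.
    ctx = contexts[doc["@context"]]
    subject = doc["@id"]
    return sorted(
        f'<{subject}> <{ctx[term]}> "{value}" .'
        for term, value in doc.items()
        if not term.startswith("@")
    )


def sign_bytes(data: bytes, key: bytes) -> str:
    return hmac.new(key, data, hashlib.sha256).hexdigest()


key = b"demo-key"
doc_bytes = json.dumps(
    {"@context": "http://ctx.example/v1",
     "@id": "http://a.example/alice",
     "name": "Alice"},
    sort_keys=True,
).encode("utf-8")

# The same context URL serves different content before and after the change.
original = {"http://ctx.example/v1": {"name": "http://schema.example/name"}}
altered = {"http://ctx.example/v1": {"name": "http://schema.example/nickname"}}

doc = json.loads(doc_bytes)
before = toy_expand(doc, original)
after = toy_expand(doc, altered)

byte_sig = sign_bytes(doc_bytes, key)
dataset_sig = sign_bytes("\n".join(before).encode("utf-8"), key)

# A signature over the octets cannot notice the context change ...
bytes_still_verify = hmac.compare_digest(byte_sig, sign_bytes(doc_bytes, key))
# ... while a signature over the expansion fails once the triples differ.
dataset_still_verifies = hmac.compare_digest(
    dataset_sig, sign_bytes("\n".join(after).encode("utf-8"), key)
)
```

In this sketch the byte signature keeps verifying although the triples
changed, and the dataset signature does not.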


> > > > > If you want to send an RDFa document, send it as a signed document.  If you
> > > > > want to send a Turtle document, send it as a signed document.  If you want
> > > > > to send an RDF graph or dataset send it as a signed N-Triples or N-Quads
> > > > > document.  Don't send JSON-LD or RDFa or Turtle and some other stuff.  If
> > > > > you want to canonicalize an RDF graph or dataset, canonicalize it.  If you
> > > > > want to canonicalize and send, send a signed canonicalized N-Triples or
> > > > > N-Quads document.   There is enough here for a WG.
> > > > I suspect you're restricting the charter here but I'm not sure I
> > > > understand the proposal. You'll have to be more explicit about what
> > > > you intend to rule out.
> > > I'm ruling out complex schemes where a document is sent unprotected along
> > > with a signature of something that the document sometimes turns into, for
> > > example sending JSON-LD along with a signature of the RDF graph or dataset
> > > that it expands to when the signer expands it.   This, to me, ends up with a
> > > complex verification system that has a large attack space.
> > If I convince you that the algorithm does not dictate
> > double-expansion, or if you convince me that it does and we get it
> > fixed so it doesn't, would that change your position?
> There are other problems with sending a document and a signature of what it
> might expand to.  Senders could disclaim that they sent the document by
> changing the remote context, for example.   The signature depends on a
> complex process.   My view is that it is much better to sign a document by
> signing the document's octet sequence, not something else.

Since JSON-LD can have multiple contexts, one could practically
protect against repudiation by including the claim-specific
context inline and referencing the standard sec: context by
URL. Digital Bazaar's implementations already support that requirement
with their staticDocumentLoader.
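
To illustrate the idea of a static loader (this is not Digital
Bazaar's API; make_static_loader and UnknownContextError are
hypothetical names), a sketch that only ever serves contexts pinned
ahead of time and never touches the network:

```python
import copy


class UnknownContextError(KeyError):
    """Raised when a context URL was not pinned in advance."""


def make_static_loader(pinned: dict):
    # Deep-copy so later changes to the caller's dict cannot alter
    # what the loader serves.
    frozen = copy.deepcopy(pinned)

    def load(url: str) -> dict:
        if url not in frozen:
            # Refuse rather than dereference an unpinned URL.
            raise UnknownContextError(url)
        return copy.deepcopy(frozen[url])

    return load


# Hypothetical pinned context; the URL and term are illustrative only.
pinned_contexts = {
    "http://ctx.example/sec/v1": {"proof": "http://ctx.example/sec#proof"},
}
load_context = make_static_loader(pinned_contexts)
```

With the well-known context pinned and the claim-specific context
inline, nothing the verifier expands depends on a resource the sender
can change later.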


> > If your objection is that a cryptographic signature should not be
> > appended to a readable document, you've ruled out most signature
> > systems I can think of (XML DSig, JWS, PGP Mime). That's just how they
> > work.
> > 
> > [following are broken out to be easy to reply to]
> > 
> > If you're OK with appending a signature to a document, how about
> > appending a signature to an N-Quads representation of a graph?
> Oh, yes, this is fine.   N-Quads has a unique expansion to triples. (Modulo
> case normalization of language tags.)

Crap, I missed a step here. I wanted to explicitly ask about adding a
signature to the RDF graph (à la the examples in
<https://janeirodigital.github.io/rdf-sig-playground/index?manifestURL=examples/toy.yaml>
). I think your answer about JSON-LD below says you're OK with
signatures in the graph.


> > If you're OK with appending a sig to an N-Quads graph, how about other
> > representations of the same graph?
> As long as the expansion always results in an isomorphic RDF graph or dataset.
> > 
> > What if those other representations include JSON-LD?
> 
> 
> The expansion of JSON-LD to an RDF graph or dataset does not always result
> in an isomorphic graph or dataset.   Aside from remote contexts there are
> local file system contexts, relative IRIs, and issues having to do with JSON
> itself.
> 
> [...]
> 
> 
> peter
> 
>
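
The relative-IRI case mentioned above can be shown with a small
sketch. toy_expand and the example.* IRIs are hypothetical; the only
real behaviour relied on is relative reference resolution via
urllib.parse.urljoin.

```python
from urllib.parse import urljoin


def toy_expand(relative_id: str, base: str) -> str:
    # Resolve the relative subject against whatever base IRI the
    # processor happens to use, then emit a single N-Triples line.
    subject = urljoin(base, relative_id)
    return f'<{subject}> <http://schema.example/name> "Alice" .'


# The same document content, expanded against two different base IRIs.
at_sender = toy_expand("alice", "http://sender.example/docs/")
at_receiver = toy_expand("alice", "http://receiver.example/inbox/")

# The subjects differ, so the two graphs are not isomorphic.
same_graph = at_sender == at_receiver
```

A signature computed over the sender's expansion would therefore not
match what the receiver expands unless both agree on the base IRI.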
Received on Monday, 14 June 2021 21:58:00 UTC