Re: Signing and Verifying RDF Datasets for Dummies (like Me!)

On Tue, 2021-06-08 at 23:13 +0200, Eric Prud'hommeaux wrote:
> On Mon, Jun 07, 2021 at 08:31:17PM -0400, Peter F. Patel-Schneider wrote:
> > On Mon, 2021-06-07 at 22:49 +0200, Eric Prud'hommeaux wrote:
> > > On Mon, Jun 07, 2021 at 03:37:44PM -0400, Peter Patel-Schneider
> > > wrote:
> > 
> > [..]
> > 
> > > A third related to changing the meaning of JSON-LD documents by
> > > changing the @context. This isn't related to signatures, and if
> > > anything, signatures give you a tool to prevent that because you've
> > > signed the resulting document and if someone changes the
> > > @context under you, you can't verify the signature.
> > >
> > > Those were, afaict, the only substantial critiques. Most were of the
> > > form "if you change X, the hash changes and the signature breaks" to
> > > which the reply is "by design".
> > 
> > Remote contexts are indeed problematic for JSON-LD documents.  They can
> > cause failures in both directions.  If the remote context is changed the
> > deserialization of the document may change, invalidating signatures of
> > documents that use the remote context.  But I believe that attackers can
> > also use remote contexts to change signed JSON-LD documents in a way that
> > validation by recipients will succeed but when the recipient deserializes
> > the document they end up with an RDF dataset that is not isomorphic to the
> > dataset signed by the originator.  I believe that this is the case even if
> > the original signed JSON-LD document did not use remote contexts.
> 
> Do you agree that in order to do so, they'd have to expand the
> document more than once, and carry the conclusion of a valid signature
> over from the first expansion?

For the receiver to see a different graph from the one the sender
signed, this double expansion is needed.  However, double expansion is
required given the algorithms in https://w3c-ccg.github.io/ld-proofs/
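
To make the double-expansion concern concrete, here is a toy Python
sketch.  It is not the ld-proofs algorithm: expand, canonical_hash, and
the context URL are hypothetical stand-ins, a sorted list of triple
strings takes the place of real RDF canonicalization, and an HMAC takes
the place of a real signature.  It only illustrates that a receiver who
expands once to check the signature and again to consume the data can
end up with two different graphs if the remote context changes in
between.

```python
import hashlib
import hmac

# Hypothetical remote context store; the URL is made up for illustration.
CONTEXTS = {
    "https://example.org/ctx": {"name": "http://schema.org/name"},
}

def expand(doc):
    """Toy stand-in for JSON-LD expansion: map terms to IRIs via the context."""
    ctx = CONTEXTS[doc["@context"]]
    return sorted(
        f'<{doc["@id"]}> <{ctx[k]}> "{v}" .'
        for k, v in doc.items()
        if not k.startswith("@")
    )

def canonical_hash(triples):
    """Toy stand-in for canonicalize-then-hash (not a real RDF canonicalization)."""
    return hashlib.sha256("\n".join(triples).encode()).digest()

def sign(key, doc):
    return hmac.new(key, canonical_hash(expand(doc)), hashlib.sha256).hexdigest()

def verify(key, doc, sig):
    return hmac.compare_digest(sign(key, doc), sig)

key = b"demo-key"
doc = {
    "@context": "https://example.org/ctx",
    "@id": "http://example.org/a",
    "name": "Alice",
}

signed_graph = expand(doc)
sig = sign(key, doc)

# First expansion: the receiver verifies while the context is unchanged.
verified = verify(key, doc, sig)

# The remote context changes after verification.
CONTEXTS["https://example.org/ctx"] = {"name": "http://example.org/alias"}

# Second expansion: the receiver deserializes again to use the data.
consumed_graph = expand(doc)

print(verified)                        # the signature check passed
print(consumed_graph == signed_graph)  # is the consumed graph what was signed?
print(verify(key, doc, sig))           # would re-verifying now still pass?
```

In this toy the first check passes, the consumed graph differs from the
signed one, and a second verification would fail, which is why carrying
the "valid" conclusion over from the first expansion is the weak point.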

> The main thing RDF signatures is doing is canonicalizing and hashing
> pairs of a document and a proof. The hashing technology is standard
> fare used in lots of tech today. The canonicalization could only be
> attacked if it produced the same result from different, non-isomorphic
> graphs. The focus here seems to be based on tricking people into
> signing something different from what they thought they were signing,
> or presenting data different from what was signed.
> 
> I don't think one could say these are different in kind from either:
> 1. any other use of JSON-LD
> 2. the use of any tech where a remote doc tweaks the semantics (DTD).

Indeed, several of the problems I have pointed out involve manipulating
the environment so that the receiver ends up with an RDF dataset that
is different from what was verified.  This is somewhat similar to the
issues you point out, but having the problem affect cryptographic
signatures raises it to a much higher level.  Consider, for example,
what would happen if what was signed were not a separate representation
of the dataset but the actual JSON-LD document itself.  This would not
get you canonical signatures, but the signatures would be verifiable
even if the remote resources changed or the document used relative IRIs
that ended up being based on the document's location.
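
A toy Python sketch of that trade-off follows.  The key, the context
URL, and the helper names sign_bytes and verify_bytes are hypothetical,
and an HMAC stands in for a real signature scheme.  The point is only
that a signature over the serialized document never dereferences the
@context or resolves relative IRIs, at the price of not surviving an
equivalent re-serialization.

```python
import hashlib
import hmac
import json

KEY = b"demo-key"  # hypothetical shared key; HMAC stands in for a real signature

def sign_bytes(serialized: str) -> str:
    """Sign the exact serialized document, with no expansion or canonicalization."""
    return hmac.new(KEY, serialized.encode(), hashlib.sha256).hexdigest()

def verify_bytes(serialized: str, sig: str) -> bool:
    return hmac.compare_digest(sign_bytes(serialized), sig)

# The relative @id and the remote @context are never resolved below.
doc = {"@context": "https://example.org/ctx", "@id": "a", "name": "Alice"}
original = json.dumps(doc)
sig = sign_bytes(original)

# Verification depends only on the bytes, so changes to the remote context
# or to the document's location cannot affect it.
still_valid = verify_bytes(original, sig)

# The cost: an equivalent re-serialization (different whitespace and key
# order) no longer verifies, because the signature is not canonical.
reserialized = json.dumps(doc, sort_keys=True, indent=2)
same_data = json.loads(reserialized) == json.loads(original)
reserialized_valid = verify_bytes(reserialized, sig)

print(still_valid, same_data, reserialized_valid)
```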

[...]


> > The Web GUI you put up at
> > https://janeirodigital.github.io/rdf-sig-playground/index was useful but it
> > doesn't take JSON-LD and appears to produce quite different output.
> 
> The default manifest file loads an example which creates a VC. RDF
> Signatures are, as indicated in the proposed charter, a framework for
> creating protocols like VCs and like what Manu signed. I stuck a
> <select/> at the top to make it generate a proof like Manu's.
> 
> With this manifest:
> 
> https://janeirodigital.github.io/rdf-sig-playground/index?manifestURL=examples/toy.yaml
> 
> you should be able to reproduce (and step through) Manu's example
> without ever dipping into JSON-LD. It doesn't accept his key pair
> because key formats are beyond my ken (Manu, a PR would be welcomed).
> In principle, if it did, you'd see a proofValue of
> 
> 'z4oey5q2M3XKaxup3tmzN4DRFTLVqpLMweBrSxMY2xHX5XTYVQeVbY8nQAVHMrXFkXJpmEcqdoDwLWxaqA3Q1geV6'
> instead of my
> 'z5BK4yiC7Ee85EFjDYG3qSnRGrW7DrcmmMaJwEULMJknAN7ZmxTCcGZVthe71UMKreKaKbVx9rBWV3BkiWhxNpCXp'

One of the reasons to step into JSON-LD is to examine the problems
caused by the use of JSON-LD.

Is it the case that your system is supposed to conform to the
algorithms in https://w3c-ccg.github.io/ld-proofs/ with the proof
triples corresponding to the proof options?  I guess I can see this but
it does seem odd to allow any triples in the signature block.

When I use the output of your implementation as the input to it, i.e.,
when I try to sign an RDF graph that already contains a signature, I get
a failure, as I expected, but not for the reason I expected.  Your
implementation appears to have some explicit check for multiple
signatures, but I don't see why in principle multiple signatures are not
allowed.  The algorithms in https://w3c-ccg.github.io/ld-proofs/ don't
have exclusions for RDF datasets that already contain signatures.
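
As a toy illustration of why nothing in principle forbids this, here is
a Python sketch of a signer that treats an existing proof triple as
ordinary data.  It is not the ld-proofs algorithm or the playground's
code: the proofValue predicate IRI, the key, and the sign and verify
helpers are hypothetical, sorting stands in for canonicalization, and an
HMAC stands in for a real signature.

```python
import hashlib
import hmac

KEY = b"demo-key"  # hypothetical key; HMAC stands in for a real signature
PROOF_VALUE = "<https://example.org/vocab#proofValue>"  # made-up predicate

def digest(triples):
    """Toy stand-in for canonicalize-then-hash: sort the triples and hash them."""
    return hashlib.sha256("\n".join(sorted(triples)).encode()).digest()

def sign(triples, proof_node):
    """Return the input triples plus one proof triple covering all of them."""
    value = hmac.new(KEY, digest(triples), hashlib.sha256).hexdigest()
    return set(triples) | {f'{proof_node} {PROOF_VALUE} "{value}" .'}

def verify(triples, proof_node):
    """Remove the named proof triple, then check it against the remaining triples."""
    prefix = f"{proof_node} {PROOF_VALUE} "
    proof = [t for t in triples if t.startswith(prefix)]
    rest = set(triples) - set(proof)
    return len(proof) == 1 and sign(rest, proof_node) == set(triples)

data = {'<http://example.org/a> <http://schema.org/name> "Alice" .'}
once = sign(data, "_:proof1")
twice = sign(once, "_:proof2")  # the first proof is just more data to sign

print(verify(once, "_:proof1"), verify(twice, "_:proof2"))
```

Stripping the outer proof from the twice-signed set gives back the
once-signed set, so the inner signature can still be checked afterwards.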


peter

Received on Wednesday, 9 June 2021 00:22:15 UTC