Re: Chartering work has started for a Linked Data Signature Working Group @W3C from Eric Prud'hommeaux on 2021-05-23 (semantic-web@w3.org from May 2021)

From: Eric Prud'hommeaux <eric@w3.org>
Date: Sun, 23 May 2021 15:06:49 +0200
To: Peter Patel-Schneider <pfpschneider@gmail.com>
Cc: Dan Brickley <danbri@danbri.org>, Aidan Hogan <aidhog@gmail.com>, semantic-web@w3.org
Message-ID: <20210523130649.GC23721@w3.org>
On Fri, May 21, 2021 at 08:21:59AM -0400, Peter Patel-Schneider wrote:
> On Fri, 2021-05-21 at 13:48 +0200, Eric Prud'hommeaux wrote:
> > On Fri, May 21, 2021 at 07:01:02AM -0400, Peter Patel-Schneider wrote:
> > > On Fri, 2021-05-21 at 09:20 +0100, Dan Brickley wrote:
> > > > 
> > > > 
> > > > > On Fri, 21 May 2021 at 00:34, Peter Patel-Schneider <pfpschneider@gmail.com> wrote:
> > > > > On Thu, 2021-05-20 at 18:58 -0400, Aidan Hogan wrote:
> > > > > > [...]
> > > > > > 
> > > > > > > RDF Dataset canonicalisation has indeed undergone review by
> > > > > > > trained mathematicians as mentioned before, but to the best of
> > > > > > > my knowledge, the people involved (those findable from the
> > > > > > > explainer) are not security or cryptography experts. Which
> > > > > > > security and cryptography engineers have reviewed which parts?
> > > > > > > It would be good to see input from such experts regarding (2)
> > > > > > > and particularly (3).
> > > > > > 
> > > > > 
> > > > > > Indeed.  As far as I know [3], i.e., the idea of augmenting
> > > > > > graphs while signing and removing the augmentations while
> > > > > > verifying isn't a standard part of security and cryptography.
> > > > > > Which experts have signed off on this?
> > > > > 
> > > > 
> > > > 
> > > > On this detail, does it recurse reliably? 
> > > > 
> > > > > If Ale writes some RDF, Brin signs it to assure basic integrity of
> > > > > the communication, publishes the result, and then a couple days
> > > > > later Cary signs it to indicate institutional endorsement of the
> > > > > original claims, etc. Are there any cases where manipulating an
> > > > > additional signing could mess with embedded earlier signings, to
> > > > > malicious ends?
> > > > 
> > > > Dan
> > > 
> > > > Indeed, my reading of
> > > > https://w3c-ccg.github.io/ld-proofs/#algorithms
> > > > leads me to believe that recursively signed graphs cannot be
> > > > verified. I think the intent of recursive signing is slightly
> > > > different than your gloss - the second signer is not signing the
> > > > original graph but is signing the signed graph, perhaps to lend
> > > > their approval of the first signing.
> > > 
> > > Ale writes G.
> > > Brin signs G and adds its own proof triples, resulting in G'.
> > > Cary takes G', removes the proof triples in it to get G, and uses
> > > Brin's proof triples to verify that Brin signed G.
> > > Cary takes G' and adds its own proof triples, resulting in G''.
> > > > Dave takes G'', removes the proof triples in G'' to get G, and
> > > > tries to use Cary's proof triples to verify that Cary signed G.
> > > > But Cary did not sign G so the verification fails!
> > > 
> > > > I believe that the described process for manipulation of the graph
> > > > permits an opponent to inject unsigned content into signed graphs
> > > > and still have the verification succeed.
> > 
> > Of course, this is just a limitation of how you can sign a graph.
> > Nothing prevents you from creating a second graph which references
> > the first, and signing that second graph, etc. I can sign your
> > signature; I just can't sign it in the same document as your
> > signature.
> 
> Signing remote objects has its own problems, as far as I can tell. 
> What happens if the remote object is modified?

You break the signature, exactly as you should.

 
>                                                 What happens if the
> remote object is reverted to an earlier version?

That sounds suspiciously like modification.


>                                                   You might be able to
> include a signing of the remote object, but that's not part of the
> current algorithms as far as I can tell.

If I c14nize/hash Doc1 to get Hash1, I can compose Sig1 with PrivKey1
(which has exactly one proof verifying that the holder of PrivKey1
signed a wee little graph with Hash1, some purpose, and some hints to
help a recipient verify rather than stumbling around in the dark until
stumbling on the recipe to produce that signature). You can then
c14nize/hash Sig1 to get Hash2 and combine that with PrivKey2 to
produce Sig2.
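
A rough sketch of that chaining, under loud assumptions: sorting
lines stands in for real canonicalization (URDNA2015 is what you
would need once bnodes are involved), HMAC-SHA256 stands in for an
asymmetric signature so the example runs on the standard library
alone, and the names c14n_hash, sign and verify plus the example.org
predicates are hypothetical, not part of any spec.

```python
import hashlib
import hmac

def c14n_hash(lines):
    """Stand-in for canonicalize-then-hash: sort and dedupe lines, then SHA-256."""
    canonical = "\n".join(sorted(set(lines))) + "\n"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def sign(doc_lines, key, purpose):
    """Return a small signature document (a list of lines) over doc_lines."""
    doc_hash = c14n_hash(doc_lines)
    # HMAC stands in for a real asymmetric signature made with a private key.
    payload = f"{doc_hash}|{purpose}".encode("utf-8")
    value = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return [
        f'_:sig <http://example.org/hash> "{doc_hash}" .',
        f'_:sig <http://example.org/purpose> "{purpose}" .',
        f'_:sig <http://example.org/value> "{value}" .',
    ]

def verify(doc_lines, sig_lines, key, purpose):
    """Recompute the signature document and compare (symmetric stand-in only)."""
    return sorted(sign(doc_lines, key, purpose)) == sorted(sig_lines)

doc1 = ['<http://example.org/moon> <http://example.org/madeOf> "green cheese" .']
sig1 = sign(doc1, b"PrivKey1", "assertion")    # Hash1 + PrivKey1 -> Sig1
sig2 = sign(sig1, b"PrivKey2", "endorsement")  # Hash2 = hash(Sig1), + PrivKey2 -> Sig2
```

Sig2 verifies against Sig1, not against Doc1, and any change to Doc1
makes Sig1 stop verifying. A real verifier would of course check with
a public key rather than recompute with the secret.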

I admit that signing functionality could, instead of saying it throws
away an existing proof, outright reject anything with an existing
proof, but that's a question of tuning the API to produce least
surprise. There's no cryptographic difference between the signing
function ensuring that the signature's hash excludes a proof vs.
the caller disposing of it beforehand.
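
To make that equivalence concrete, a minimal sketch, assuming a
hypothetical example.org predicate marks the proof statements and
that sorted lines are an adequate stand-in for canonicalization. The
function name signing_hash and its on_existing_proof switch are made
up for illustration; they are not the API of any library.

```python
import hashlib

PROOF = "<http://example.org/proof>"  # hypothetical predicate marking proof statements

def signing_hash(lines, on_existing_proof="strip"):
    """Hash the non-proof statements; optionally refuse input that already has a proof."""
    proofs = {line for line in lines if PROOF in line}
    if proofs and on_existing_proof == "reject":
        raise ValueError("input already carries a proof")
    content = sorted(set(lines) - proofs)
    return hashlib.sha256(("\n".join(content) + "\n").encode("utf-8")).hexdigest()

g = ['<http://example.org/moon> <http://example.org/madeOf> "green cheese" .']
g_signed = g + ['<http://example.org/moon> <http://example.org/proof> "brin-signature-value" .']

# The signer stripping the proof and the caller stripping it first hash the same bytes.
same = signing_hash(g_signed) == signing_hash(g)
```

The "reject" mode is the least-surprise variant: it changes what the
caller is told, not what gets hashed. The "strip" mode is also why a
second signature added to the same document ends up covering G rather
than G'.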

I don't understand why a Proof Chain isn't sufficient for signing a
signature, but let's say it's because you want to predicate that nth
signature with some contextualizing graph (e.g. Claire thinks Bob's
crazy because he endorses Alice's claim that the moon is made of green
cheese). With existing Linked Data Proofs (+1 to RDF Proofs), you can
do that in 6 docs (1:MoonIsCheeze, 2:AlicezSig(1), 3:BobEndorses(2),
4:BobzSig(3), 5:BobzCrazCause(4), 6:ClairesSig(5)), though you may get
that down to 3 by nesting 2 in 1, 4 in 3, and 6 in 5.
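
The six-document layout can be sketched as a chain of hash
references. This is only an illustration: JSON with sorted keys
stands in for c14n, the signature values themselves are left out, and
the field names ("over", "about") and the helper names are
hypothetical.

```python
import hashlib
import json

def digest(doc):
    """Deterministic hash of a JSON-serializable document (stand-in for c14n + hash)."""
    return hashlib.sha256(json.dumps(doc, sort_keys=True).encode("utf-8")).hexdigest()

def build_chain():
    """Build docs 1-6, each one pinning its predecessor by hash."""
    docs = {}
    docs[1] = {"claim": "the moon is made of green cheese"}
    docs[2] = {"type": "signature", "signer": "Alice", "over": digest(docs[1])}
    docs[3] = {"claim": "Bob endorses doc 2", "about": digest(docs[2])}
    docs[4] = {"type": "signature", "signer": "Bob", "over": digest(docs[3])}
    docs[5] = {"claim": "Bob is crazy because of doc 4", "about": digest(docs[4])}
    docs[6] = {"type": "signature", "signer": "Claire", "over": digest(docs[5])}
    return docs

def broken_links(docs):
    """Return the numbers of documents whose reference no longer matches its predecessor."""
    bad = []
    for n in sorted(docs):
        if n == 1:
            continue  # doc 1 refers to nothing
        ref = docs[n].get("over") or docs[n].get("about")
        if ref != digest(docs[n - 1]):
            bad.append(n)
    return bad
```

Changing doc 1 after the fact, including swapping in some other
version of it, shows up as a broken link at doc 2; the later links
still hold because docs 2 through 5 themselves are unchanged.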


> You could sign named graphs in the same dataset, of course, and I think
> that that would work reasonably well (as it is quite similar to signing
> parts of an email message), except that I think you again run into
> problems when you want to do recursive signing because RDF datasets
> don't have named datasets.  But, again, this isn't part of the
> algorithm in Linked Data Proofs 1.0.

This tool chain already solves a bunch of problems. My purposes are
met by simply signing external docs. Do I need that standardized? It
would certainly help.

Could we go further and emulate this nesting (1-6 above) by minting
fresh bnode graph labels in datasets? Probably. Do we need to solve
that before signing external docs can become a standard? No. Would it
be nice to look around a few corners to see if the spec we create
today will meet tomorrow's needs? Absolutely, but there's plenty of
existing practice and adoption to allow the WG to confidently
standardize the first part and pretty confidently undertake the
second.


> So I'm waiting for some security expert sign-off on the entirety of the
> proof algorithms in Linked Data Proofs 1.0, and also for an open-source
> reference implementation of the algorithms.   I don't think that the WG
> should start until both of these have been made available.

I'm not sure what you mean by "reference implementation". Do you mean
that if it differs from the spec that the reference implementation
defines behavior (i.e. everyone must be bug-compatible)?

Voilà, the licenses for Manu's list of URDNA implementations:
[[
JS - https://github.com/digitalbazaar/rdf-canonize - new BSD
C++ - https://github.com/digitalbazaar/rdf-canonize-native - new BSD
Java - https://github.com/setl/rdf-urdna - Apache
Go - https://github.com/piprate/json-gold - Apache
Python - https://github.com/digitalbazaar/pyld - new BSD
Rust - https://github.com/digitalbazaar/rdf-canonize-rs - BSD
Ruby - https://github.com/ruby-rdf/rdf-normalize - "free and unencumbered"
]]


https://github.com/digitalbazaar/jsonld-signatures has "BSD 3-Clause"
(so many licenses).  That code leans on
https://github.com/digitalbazaar/crypto-ld (BSD). (Signing algorithms
are mostly concerned with base64-encoding the right ascii-ified byte
arrays.)


> peter
>
>
Received on Sunday, 23 May 2021 13:06:57 UTC