Re: Signing and Verifying RDF Datasets for Dummies (like Me!) from Peter F. Patel-Schneider on 2021-06-13 (semantic-web@w3.org from June 2021)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Sun, 13 Jun 2021 16:26:57 -0400
To: Eric Prud'hommeaux <eric@w3.org>
Cc: semantic-web@w3.org
Message-ID: <e0bc7814-0b8e-9fd1-95ac-92ed3d3b84d9@gmail.com>
On 6/13/21 10:45 AM, Eric Prud'hommeaux wrote:
> On Sun, Jun 13, 2021 at 08:55:34AM -0400, Peter F. Patel-Schneider wrote:
>> On 6/12/21 6:11 PM, Eric Prud'hommeaux wrote:
>>> On Fri, Jun 11, 2021 at 07:37:57AM -0400, Peter F. Patel-Schneider wrote:
>>>> On 6/11/21 3:33 AM, Eric Prud'hommeaux wrote:
>>>>> On Wed, Jun 09, 2021 at 01:45:10PM -0400, Peter Patel-Schneider wrote:
>>>> [...]
>>>> No spec can prevent bad implementations; the best they can do is work
>>>> with a community to see how to express the spec in a way that enables
>>>> good implementations.
>>>>
>>>> It's trivial for an implementation to expand a JSON-LD document to
>>>> RDF, pass it to canonicalizer, then hash it and verify the
>>>> signature. If the sig is good, the app has the expanded doc to use
>>>> however it sees fit. That document was expanded once in this process.
>>>>
>>>>
>> But this isn't how the algorithms in https://w3c-ccg.github.io/ld-proofs/
>> work.   Perhaps this change would alleviate some of the problems I have
>> identified.
> Can you state where the algorithm dictates double-expansion? My reading
> of the algorithm is that it starts with an RDF graph and never hints
> at any sort of expansion after that.


The proof verification algorithm takes a signed linked data document and 
produces a boolean result.   Internally there is an expansion.   For the 
recipient to get the RDF graph or dataset requires a second expansion.
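
To illustrate the double expansion, here is a minimal Python sketch. It is not
the ld-proofs algorithm or any library's API: expand() is a toy stand-in for
JSON-LD expansion, the contexts are plain dicts rather than remote documents,
an HMAC with a shared key stands in for a real signature, and all names are
hypothetical.

```python
import hashlib
import hmac

KEY = b"demo-shared-key"  # assumption: HMAC stands in for a real signature scheme


def expand(doc, contexts):
    """Toy stand-in for JSON-LD expansion: map terms to IRIs via a looked-up context."""
    ctx = contexts[doc["@context"]]
    subject = doc["@id"]
    lines = [
        f'<{subject}> <{ctx[term]}> "{value}" .'
        for term, value in doc.items()
        if not term.startswith("@")
    ]
    return "\n".join(sorted(lines)) + "\n"


def sign(statements):
    return hmac.new(KEY, statements.encode("utf-8"), hashlib.sha256).hexdigest()


def verify(doc, signature, contexts):
    """Boolean-only verification: the expansion happens inside and is discarded."""
    return hmac.compare_digest(sign(expand(doc, contexts)), signature)


def verify_and_expand(doc, signature, contexts):
    """Single-expansion variant: return the verified statements, or None."""
    statements = expand(doc, contexts)
    return statements if hmac.compare_digest(sign(statements), signature) else None


trusted = {"https://example.org/ctx": {"name": "http://schema.org/name"}}
tampered = {"https://example.org/ctx": {"name": "http://example.org/enemyOf"}}
doc = {"@context": "https://example.org/ctx",
       "@id": "http://example.org/alice",
       "name": "Alice"}
sig = sign(expand(doc, trusted))

checked = verify(doc, sig, trusted)          # expansion 1, under the original context
acted_on = expand(doc, tampered)             # expansion 2, after the context changed
safe = verify_and_expand(doc, sig, trusted)  # one expansion, result is what was verified
```

In this sketch `checked` is true, yet `acted_on` holds statements that were never
signed, because the recipient expanded a second time. `safe` avoids that only
because the caller uses exactly the statements that were hashed.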

>
>>>>> This issue is further evidence that a WG product would increase
>>>>> security and community understanding around security issues. Most of
>>>>> the obvious ways to sign JSON-LD introduce this sort of
>>>>> vulnerability. No WG leads to any of:
>>>>>
>>>>> 0. No action: most folks won't consider dereferenced evaluation
>>>>>       vulnerabilities present in JSON-LD pipelines that don't include
>>>>>       some verification.
>>>>>
>>>>> 1. standard JWS over JSON-LD doc: this signs the JSON tree but not the
>>>>>       RDF expansion.
>>>>>
>>>>> 2. Homegrown signature stacks: likely to include atomic operations
>>>>>       that separate verification from expansion (e.g. for populating a
>>>>>       store), which are subject to your timing attack.
>>>>>
>>>>> A WG product can raise awareness of these issues across all JSON-LD
>>>>> pipelines (or any dereferenced evaluation pipelines) and provide
>>>>> recipes and tools for securing them.
>>>> If you want to send a JSON-LD document, send it as a signed document.
>>> Do you mean (here and following) to just sign the bytes of the JSON-LD
>>> document? You were just arguing that the double expansion of a JSON-LD
>>> document is a flaw in RDF Signatures. Now it appears you're suggesting
>>> that folks sign the JSON, which takes zero precaution against someone
>>> changing the context at any point. It appears you have much looser
>>> requirements for the do-nothing proposal than for the charter.
>> If the use case requires sending JSON-LD documents around, then send signed
>> JSON-LD documents and accept the issues with expansion to an RDF graph or
>> dataset.
> Just to be clear; your objections to the charter include an attack
> based on changing the context in between a verification and acting on
> the expansion. Instead, you advocate offering no signature of the
> expansion.
No.  If you want to securely transmit an RDF graph or dataset send a signed 
N-Triples or N-Quads document.  If you want to do something else you can send 
a different kind of document, signed if you prefer.
>
>>>> If you want to send an RDFa document, send it as a signed document.  If you
>>>> want to send a Turtle document, send it as a signed document.  If you want
>>>> to send an RDF graph or dataset send it as a signed N-Triples or N-Quads
>>>> document.  Don't send JSON-LD or RDFa or Turtle and some other stuff.  If
>>>> you want to canonicalize an RDF graph or dataset, canonicalize it.  If you
>>>> want to canonicalize and send, send a signed canonicalized N-Triples or
>>>> N-Quads document.   There is enough here for a WG.
>>> I suspect you're restricting the charter here but I'm not sure I
>>> understand the proposal. You'll have to be more explicit about what
>>> you intend to rule out.
>> I'm ruling out complex schemes where a document is sent unprotected along
>> with a signature of something that the document sometimes turns into, for
>> example sending JSON-LD along with a signature of the RDF graph or dataset
>> that it expands to when the signer expands it.   This, to me, ends up with a
>> complex verification system that has a large attack space.
> If I convince you that the algorithm does not dictate
> double-expansion, or if you convince me that it does and we get it
> fixed so it doesn't, would that change your position?
There are other problems with sending a document and a signature of what it 
might expand to.  Senders could disclaim that they sent the document by 
changing the remote context, for example.   The signature depends on a complex 
process.   My view is that it is much better to sign a document by signing the 
document's octet sequence, not something else.
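
As a rough sketch of signing the octet sequence itself (an assumption-laden
illustration, not a recommended scheme): the HMAC below stands in for a real
public-key signature, and the key, function names, and sample N-Quads line are
all hypothetical. The point is only that verification consults nothing but the
bytes that were sent.

```python
import hashlib
import hmac

KEY = b"demo-shared-key"  # assumption: HMAC stands in for a real signature scheme


def sign_octets(octets: bytes) -> str:
    """Sign exactly the bytes that will be transmitted."""
    return hmac.new(KEY, octets, hashlib.sha256).hexdigest()


def verify_octets(octets: bytes, signature: str) -> bool:
    """Verify the received bytes; no expansion or remote lookup is involved."""
    return hmac.compare_digest(sign_octets(octets), signature)


nquads = b'<http://example.org/alice> <http://schema.org/name> "Alice" .\n'
sig = sign_octets(nquads)

# Any change to the transmitted bytes invalidates the signature.
altered = nquads.replace(b"Alice", b"Mallory")
```

Here `verify_octets(nquads, sig)` succeeds and `verify_octets(altered, sig)`
fails, and neither outcome can be changed by editing a context somewhere else.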
>
> If your objection is that a cryptographic signature should not be
> appended to a readable document, you've ruled out most signature
> systems I can think of (XML DSig, JWS, PGP/MIME). That's just how they
> work.
>
> [following are broken out to be easy to reply to]
>
> If you're OK with appending a signature to a document, how about
> appending a signature to an N-Quads representation of a graph?
Oh, yes, this is fine.   N-Quads has a unique expansion to triples. (Modulo 
case normalization of language tags.)
>
> If you're OK with appending a sig to an N-Quads graph, how about other
> representations of the same graph?
As long as the expansion always results in an isomorphic RDF graph or dataset.
>
> What if those other representations include JSON-LD?


The expansion of JSON-LD to an RDF graph or dataset does not always result in 
an isomorphic graph or dataset.   Aside from remote contexts there are local 
file system contexts, relative IRIs, and issues having to do with JSON itself.
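
The relative-IRI case can be sketched in a few lines of Python. This is a toy
model, not a JSON-LD processor: to_ntriples() and the two base IRIs are
hypothetical, and only the standard library's urljoin is used to resolve the
relative @id against whatever base the processing party happens to have.

```python
from urllib.parse import urljoin


def to_ntriples(doc, base):
    """Toy expansion: resolve a relative @id against the given base IRI."""
    subject = urljoin(base, doc["@id"])
    return sorted(
        f'<{subject}> <{predicate}> "{value}" .'
        for predicate, value in doc.items()
        if not predicate.startswith("@")
    )


doc = {"@id": "alice", "http://schema.org/name": "Alice"}

# The same document, expanded where it was published and where it was saved.
at_sender = to_ntriples(doc, "https://sender.example/data/")
at_recipient = to_ntriples(doc, "file:///tmp/inbox/")
```

The two results name different subjects, so the graphs are not isomorphic even
though the document's bytes are identical.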

[...]


peter
Received on Sunday, 13 June 2021 20:27:20 UTC