- From: Manu Sporny <msporny@digitalbazaar.com>
- Date: Sun, 6 Jun 2021 16:52:07 -0400
- To: semantic-web@w3.org
On 6/4/21 2:19 PM, Peter Patel-Schneider wrote:
> There is an easy escalation-of-privileges attack on loading a context
> from a file. All the attacker needs is create and write access to any
> part of the filesystem.

If the attacker has create/write access to any part of your filesystem,
JSON-LD Context files are the least of your concerns. At that point, the
attacker can just switch out your shell and openssl binaries with
compromised ones, use your secrets to get direct access to your
databases, and wreak all sorts of havoc that make the network-based
attacks we've been talking about look quaint in comparison.

I'm happy to keep talking through attacks and mitigations, but the
attack surface keeps changing and we've now gone squarely into "assume a
fully compromised system" territory.

> So the implementation of the security primitives use some special sauce
> that is not specified in the algorithms? That's not implementing the
> algorithms, but something else, which might be more secure or less
> secure.

Loading a JSON-LD Context file is not a security primitive. SHA-256 and
EdDSA could be viewed as security primitives. One might even argue that
RDF Dataset canonicalization is a security primitive. Using those things
to construct and transmit a digital signature is typically referred to
as a security protocol. Some security protocols assume you have safe
inputs; others don't.

It seems like you would like the LDI security protocol to be extended to
define how one might protect the inputs to the algorithms. That's a fine
thing to desire, and I'd even go as far as saying that we should
probably say something about that in the LDI spec... in the Security
Considerations section, and then point to that from the algorithms.
Again, this is the sort of thing that's discussed in a WG... if you'd
like, I can add an issue marker in the input document to say that the
group should consider this? Would that address your concern?
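To make the primitive-versus-protocol distinction concrete, here is a
minimal sketch of that layering in Python. The canonicalization step is
the trivial sort-and-deduplicate scheme that only works for N-Quads
documents free of blank nodes and relative IRIs; HMAC-SHA256 stands in
for EdDSA so the sketch stays standard-library only. All function names
here are illustrative, not taken from any spec or implementation.

```python
import hashlib
import hmac

def canonicalize_nquads(document: str) -> bytes:
    """Toy canonicalization: valid only for N-Quads with no blank nodes
    or relative IRIs. Drop comment lines, deduplicate, and sort."""
    lines = [ln.strip() for ln in document.splitlines()]
    statements = sorted({ln for ln in lines if ln and not ln.startswith("#")})
    return "\n".join(statements).encode("utf-8")

def sign(document: str, key: bytes) -> bytes:
    """Protocol layer: hash the canonical form, then sign the digest.
    HMAC-SHA256 stands in for EdDSA to keep this stdlib-only."""
    digest = hashlib.sha256(canonicalize_nquads(document)).digest()
    return hmac.new(key, digest, hashlib.sha256).digest()

def verify(document: str, key: bytes, signature: bytes) -> bool:
    """Recompute the signature over the canonical form and compare."""
    return hmac.compare_digest(sign(document, key), signature)
```

With this layering, two documents that differ only in statement order,
duplicate lines, or comments canonicalize to the same bytes, so a
signature over one verifies against the other; changing any statement
breaks verification.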
> Yes, indeed, I am certainly frustrated that there is no reference
> implementation of the algorithms. I am still unable to determine just
> how linked data documents are to be signed and verified. I'm even
> unable to determine what a consumer is supposed to be able to determine
> when a signed linked data document is verified.

I've tried to explain those things in detail to you over the past two
weeks. How can I help further? What, specifically, are you confused
about?

> I'm also unclear as to whether HTTP contexts actually are bad practice.
> Example 6 in https://w3c-ccg.github.io/ld-proofs/ appears to use a
> remote context in a signed linked document.

I presume that you are referring to this "remote context":

https://w3id.org/security/suites/ed25519-2020/v1

When a JSON-LD Processor sees that URL, it will call out to its
"document loader" subsystem. That subsystem will then load the document
for that URL from a secure location (like a local read-only file
system), instead of going out over HTTP to fetch the context file.

> I'm coming to the conclusion that there is no way to reliably sign and
> verify JSON-LD documents as linked data.

Well, that's a very strange conclusion to come to... I can understand
that you're confused... but the conclusion should be "I don't know", not
"there is no way to reliably sign and verify".

> About all I'm willing to guess is that it is likely possible to
> reliably sign and verify NQUADS documents that do not contain relative
> IRIs or blank nodes because canonicalization reduces to canonicalizing
> IRIs and strings, removing comments, and then sorting the file and
> eliminating duplicate lines.

We've been able to do the above since 2003:

https://link.springer.com/chapter/10.1007/978-3-540-39718-2_24

Doing so doesn't address the problem that the LDS WG is being chartered
to solve, which is to:

1. Define a generalized canonicalization mechanism for abstract RDF
   Datasets.
2. Define a way of serializing and hashing the canonicalized form
   from #1.
3. Define a way of expressing digital signatures (proofs) using the
   hashed form of the RDF Dataset from #2.

#1 has multiple solutions with formal proofs. #2 utilizes known data
formats (NQuads) and known cryptographic hashing functions. #3 has
multiple implementations that do not depend on new cryptographic
primitives, with protocols that are easy to analyse (and have been).

Which of the items above are you still unconvinced of, and at what point
would you be convinced?

-- manu

--
Manu Sporny - https://www.linkedin.com/in/manusporny/
Founder/CEO - Digital Bazaar, Inc.
blog: Veres One Decentralized Identifier Blockchain Launches
https://tinyurl.com/veres-one-launches
Received on Sunday, 6 June 2021 20:52:37 UTC