Conditional Requests to resolve semaphore and confidentiality concerns from Kjetil Kjernsmo on 2020-01-16 (public-sparql-12@w3.org from January 2020)

From: Kjetil Kjernsmo <kjetil@kjernsmo.net>
Date: Thu, 16 Jan 2020 13:31:28 +0100
To: "SPARQL 1.2 Community Group" <public-sparql-12@w3.org>
Message-ID: <1979145.1LmWpcyDzi@owl>
Hi all,

I'm working on the Solid project[1], where we use Semantic Web technologies 
intensively. For now, SPARQL is only used on the server side to update 
documents, and without the SPARQL Protocol: a SPARQL 1.1 Update query is 
passed as the body of a PATCH request[2]. 

We have an open issue on the level of SPARQL 1.1 Update support Solid should 
require[3], and I have been working on two points where there are some 
tensions. I have a rather involved proposal to address them both in a 
backwards-compatible way, which I want to air with you. The TL;DR is: We 
should support issue 63 [4] and introduce a conditional request header into 
HTTP.

These are the issues:

1) A semaphore mechanism for updates.

Imagine a room crowded by thousands of people who co-edit a document in real 
time. Neither locking the document for every write nor using a simple ETag 
for the entire document will be sufficiently scalable. We are obviously looking 
into CRDTs, but let's not go there for now.

Concretely, say that client 1 goes:

DELETE DATA { <foo> <baz> "Dahut" } ;
INSERT DATA { <foo> <baz> "Bar" }

independently, client 2 goes 

DELETE DATA { <foo> <baz> "Dahut" } ;
INSERT DATA { <foo> <baz> "Foobar" }

before the first client has finished. In that case, the Solid implementation 
would return a 409 Conflict to the second client. This comes from a suggestion 
TimBL made in a Design Issue[5] that we introduce a semaphore mechanism. I 
took this issue to [6]. I have been conflicted myself over this solution 
because of the tension between protocol and query language levels, and I was 
also not able to answer the comments there.
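
To make the conflict concrete, here is a minimal sketch in Python of the behaviour described above. It is a toy model, not the Solid implementation: the `Document` class, its `patch` method and the `Conflict` exception are hypothetical names, and it only models the outcome once the server has put the two requests in some order.

```python
# Toy model of the semaphore behaviour (all names are hypothetical).
class Conflict(Exception):
    """Stands in for an HTTP 409 Conflict response."""


class Document:
    def __init__(self, triples):
        self.triples = set(triples)

    def patch(self, delete, insert):
        """Apply DELETE DATA then INSERT DATA, refusing if a triple to delete is missing."""
        missing = set(delete) - self.triples
        if missing:
            # Someone else changed the data first: nothing is applied.
            raise Conflict(missing)
        self.triples -= set(delete)
        self.triples |= set(insert)
        return 200


doc = Document({("foo", "baz", "Dahut")})

# Client 1 is processed first and succeeds.
status_1 = doc.patch([("foo", "baz", "Dahut")], [("foo", "baz", "Bar")])

# Client 2 sends the same DELETE, but "Dahut" is already gone.
try:
    status_2 = doc.patch([("foo", "baz", "Dahut")], [("foo", "baz", "Foobar")])
except Conflict:
    status_2 = 409
```

In plain SPARQL 1.1 Update, the second request would instead succeed and leave both "Bar" and "Foobar" in the document.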

I then came to realize that an ability to see if a DELETE fails or succeeds 
has other implications for Solid too, as we have a permission system with 
Read, Write, Append and Control.

Ideally, DELETE should only require Write permission, but if you can infer 
from the status code whether a triple existed, then arguably, it should 
require Read permission. So, the second issue is, broadly:

2) A mechanism to communicate status from write queries safely.

To put it into an example, imagine a malicious user "Mallory": Mallory is 
authorized to write, but not to read, and does not particularly care if he 
destroys things; he just wants to check if certain triples were there. In 
that case, he can send the query

DELETE DATA {
  <alice/profile#me> ex:age 14 . 
}

In SPARQL 1.1, Mallory cannot tell whether the triple was there, since the 
request will always succeed, so he can't tell whether Alice was in fact 14 
years old. So, DELETE with Write is OK. With the semaphore mechanism we 
currently implement, 
Mallory can tell that Alice is 14, so it would be a breach of confidentiality 
to only require Write. It is therefore important to be careful not to reveal 
information when making updates.
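
The leak can be illustrated with a small Python sketch that compares what Mallory observes under the two behaviours. The function names and the tuple encoding of the triple are assumptions made for illustration only. The point is simply that a status code which depends on the data acts as a one-bit read channel.

```python
def delete_data_sparql11(store, triple):
    """SPARQL 1.1 behaviour: deleting an absent triple is not an error."""
    store.discard(triple)
    return 200


def delete_data_semaphore(store, triple):
    """Semaphore behaviour as described above: an absent triple yields 409."""
    if triple not in store:
        return 409
    store.remove(triple)
    return 200


alice_is_14 = ("alice/profile#me", "ex:age", 14)


def probe(delete, store):
    """Return the only thing Mallory sees: the status code."""
    return delete(set(store), alice_is_14)


with_triple = {alice_is_14}
without_triple = set()

# True if the status code differs depending on whether the triple exists.
leaks_sparql11 = (probe(delete_data_sparql11, with_triple)
                  != probe(delete_data_sparql11, without_triple))
leaks_semaphore = (probe(delete_data_semaphore, with_triple)
                   != probe(delete_data_semaphore, without_triple))
```

Here `leaks_sparql11` is false and `leaks_semaphore` is true, which is why the semaphore variant cannot safely be offered with Write permission alone.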

Then, I found that Michael Rauch has a proposal around this in [4]. In 
particular, I liked Richard Cyganiak's take on this: Any such information 
should essentially be a projection. With that, we can ensure that Read 
permission is required to access any projected variable binding. To have that 
single point would be very useful. So, I strongly support that proposal.

Then, what should we do on the protocol level to support our semaphore? 

We should introduce another Conditional Request[7] header, nominally 
"If-Variable", into HTTP. This is orthogonal to SPARQL, but the idea is that 
it names a variable, and if the Effective Boolean Value of that variable is 
false, the request will fail atomically with a 412 Precondition Failed. 

Whenever the semaphore mechanism is needed, the query needs to be formulated 
with the REPORT mechanism as suggested in [4], and the client needs to have 
Read as well as Write permission and set the If-Variable header.
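
As a rough sketch of how a server might evaluate such a header, the Python below applies an update to a working copy, computes the Effective Boolean Value of the named variable, and commits only if it is true. Everything here is an assumption for illustration: If-Variable is only a proposal, the REPORT syntax from [4] is not reproduced, and a plain dict of variable bindings stands in for whatever the query would report. Treating an unbound variable or a value without an EBV as a failed precondition is also a guess, not something settled above.

```python
def effective_boolean_value(value):
    """EBV for a few literal kinds, following SPARQL 1.1 section 17.2.2."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        return len(value) > 0
    if isinstance(value, (int, float)):
        # Numeric zero and NaN (the only value unequal to itself) are false.
        return not (value == 0 or value != value)
    raise TypeError("no effective boolean value")


def handle_patch(store, update, if_variable):
    """Run `update` on a copy; commit only if the named variable is true."""
    working = set(store)
    bindings = update(working)  # hypothetical stand-in for REPORT output
    try:
        ok = effective_boolean_value(bindings[if_variable])
    except (KeyError, TypeError):
        ok = False  # assumption: unbound or untyped value fails the check
    if not ok:
        return 412  # Precondition Failed; the store is left untouched
    store.clear()
    store.update(working)
    return 200


def swap_label(new_label):
    """Build an update that replaces "Dahut" and reports whether it existed."""
    def update(triples):
        old = ("foo", "baz", "Dahut")
        deleted = old in triples
        triples.discard(old)
        triples.add(("foo", "baz", new_label))
        return {"deleted": deleted}
    return update


store = {("foo", "baz", "Dahut")}
first = handle_patch(store, swap_label("Bar"), "deleted")
second = handle_patch(store, swap_label("Foobar"), "deleted")
```

The first request commits and the second is rejected with 412 without modifying the store, which is the atomic failure described above. A real server would also have to check Read permission before exposing any binding.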

[1] https://solidproject.org/
[2] https://github.com/w3c/sparql-12/issues/104
[3] https://github.com/solid/specification/issues/125
[4] https://github.com/w3c/sparql-12/issues/63
[5] https://www.w3.org/DesignIssues/ReadWriteLinkedData.html
[6] https://github.com/w3c/sparql-12/issues/60
[7] https://tools.ietf.org/html/rfc7232

What do you all think?

Cheers,

Kjetil
Received on Thursday, 16 January 2020 12:32:04 UTC