Re: SPARQL Security - Best Practices? from Damian Steer on 2008-09-05 (semantic-web@w3.org from September 2008)

From: Damian Steer <pldms@mac.com>
Date: Fri, 05 Sep 2008 12:33:10 +0100
To: Marco Brandizi <brandizi@ebi.ac.uk>, Semantic Web <semantic-web@w3.org>
Message-ID: <48C118F6.9080804@mac.com>

Marco Brandizi wrote:
> Damian Steer wrote:
> 
>>
>> So, as you suggest, we use graphs as the basis. We then mix in a 
>> function P(A,G) => boolean, which tells us whether user A has 
>> permission to query G. (or, indeed, to write or delete)
>>
> [...]
>>
>> SELECT ?privateinfo WHERE { :damian :knows ?privateinfo }
>>
>> becomes
>>
>> SELECT ?privateinfo WHERE { GRAPH ?g { :damian :knows ?privateinfo } 
>> FILTER (?g = <allowed> || ?g = <alsoallowed>) } # please forgive my 
>> syntax here
>>
> 
> Hi,
> 
> do you have some strategy to manage a use case like "N results exist, 
> but you are authorized to see only k of them"?

As I mentioned, we don't do the filtering in the query (for other 
reasons), so handling that case would be possible. However, you're right 
that it would be difficult to work that out otherwise. My particular 
(related) concern is detecting deliberate attempts to access private 
data. Many of those would be indistinguishable from queries with no 
results.
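
To make the "N exist, k visible" case concrete, here is a minimal Python 
sketch of filtering outside the query. It assumes each result row carries 
the graph it was matched in (as a GRAPH ?g pattern would give you). The 
READABLE table, the Row type, filter_rows and the example.org graph names 
are all made up for illustration; this is not a description of any real 
system.

```python
from typing import NamedTuple

# Hypothetical permission table: user -> graphs that user may read.
READABLE = {
    "alice": {"http://example.org/allowed", "http://example.org/alsoallowed"},
}

class Row(NamedTuple):
    graph: str   # graph the binding was matched in
    value: str   # the bound ?privateinfo value

def P(user: str, graph: str) -> bool:
    """Toy stand-in for P(A,G): may `user` query `graph`?"""
    return graph in READABLE.get(user, set())

def filter_rows(user, rows):
    """Return (visible rows, number of rows withheld)."""
    visible = [r for r in rows if P(user, r.graph)]
    return visible, len(rows) - len(visible)

rows = [
    Row("http://example.org/allowed", ":bob"),
    Row("http://example.org/private", ":carol"),
    Row("http://example.org/alsoallowed", ":dave"),
]
visible, withheld = filter_rows("alice", rows)
```

Here N is 3 and k is 2, and the withheld count is something you could log 
or report. It does not help with the detection problem: a probe that 
matches nothing in a private graph still looks like an empty result.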

> Moreover, I wonder if someone has ideas about mixing access to 
> explicitly declared triples and inferred statements. For instance, if a 
> triple is entailed by other triples the user hasn't access to, one 
> should decide if the inferred triple is accessible (e.g.: is at the same 
> level of details of the premise) or not (e.g.: the consequence 
> represents an aggregate information).

If you do the inferencing work each time a query comes in it doesn't 
seem like much of a problem, but that could get expensive :-)

I remember Steve Harris had a system for truth maintenance where the 
dependencies between groups of statements were tracked. So graph 1: 
damian is a mammal depends on graph 2 which mentions damian is human and 
graph 3: humans subclassof mammal. If 2 or 3 change we see that 1 needs 
to be recalculated.
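
A rough Python sketch of that kind of dependency tracking, purely to 
illustrate the idea. The DEPENDS_ON mapping and the stale_after function 
are hypothetical names, and I'm not claiming this is how Steve's system 
stored or processed its dependencies.

```python
from collections import defaultdict

# Hypothetical dependency records: derived graph -> graphs it was derived from.
DEPENDS_ON = {
    "graph1": {"graph2", "graph3"},  # damian is a mammal
}

def stale_after(changed, depends_on):
    """Return every derived graph that must be recalculated when `changed` changes."""
    # Invert the mapping: premise graph -> graphs derived from it.
    dependents = defaultdict(set)
    for derived, premises in depends_on.items():
        for premise in premises:
            dependents[premise].add(derived)
    stale, todo = set(), [changed]
    while todo:
        for derived in dependents[todo.pop()]:
            if derived not in stale:
                stale.add(derived)
                todo.append(derived)  # follow chains of inference
    return stale
```

Calling stale_after("graph2", DEPENDS_ON) or stale_after("graph3", 
DEPENDS_ON) reports graph 1 as needing recalculation, while a change to 
graph 1 itself invalidates nothing else.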

Something similar could work for your case: look at the provenance of 
the graphs. I attended a symposium [1][2] around this area, and would 
recommend poking around what they've been doing. Steve's system is 
rather like what they called 'colour provenance', if I understood correctly.
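
As a sketch of how provenance could drive access to inferred statements, 
here is one possible policy in Python: an inferred graph is readable only 
if every graph it was derived from is readable. The PROVENANCE and 
READABLE tables and the can_read function are invented for illustration, 
and the strict "all premises" rule is an assumption, not the only 
reasonable choice.

```python
# Hypothetical provenance: inferred graph -> the graphs it was derived from.
PROVENANCE = {"graph1": {"graph2", "graph3"}}
# Hypothetical permissions: user -> asserted graphs that user may read.
READABLE = {"alice": {"graph2", "graph3"}, "bob": {"graph3"}}

def can_read(user, graph):
    """Asserted graphs use the table; inferred graphs need every premise readable."""
    premises = PROVENANCE.get(graph)
    if premises is None:
        # Not an inferred graph: fall back to the plain permission table.
        return graph in READABLE.get(user, set())
    # Assumes provenance is acyclic, so the recursion terminates.
    return all(can_read(user, p) for p in premises)
```

With these tables alice can read graph 1 and bob cannot, since bob lacks 
graph 2. The aggregate case you mention, where the consequence is less 
sensitive than its premises, would need an explicit override for that 
inferred graph instead of this rule.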

Damian

[1] <http://www.nesc.ac.uk/esi/events/894/>
[2] <http://wiki.esi.ac.uk/ProvenanceInDatabases>

Received on Friday, 5 September 2008 11:34:31 UTC