Re: SPARQL Security - Best Practices? from Jacek Kopecky on 2008-09-02 (semantic-web@w3.org from September 2008)

From: Jacek Kopecky <jacek.kopecky@sti2.at>
Date: Tue, 02 Sep 2008 23:44:07 +0200
To: Richard Newman <rnewman@twinql.com>
Cc: Damian Steer <pldms@mac.com>, Brian Manley <brian.manley@gmail.com>, Semantic Web at W3C <semantic-web@w3.org>
Message-Id: <1220391847.5016.33.camel@Kalb>

Hi Richard, 
if I understand it correctly, a data store is allowed to provide any
named graphs it wishes to. Could your problem be solved with a special
named graph that holds the merge of all the data a given user is
allowed to see? I mean something like this:

SELECT * 
FROM <http://localhost/special/all>
WHERE { 
     ?s foo:bar ?baz ;
        zob:zab ?bing .
}

This would be a data-store-specific extension, but it would work with
the standard SPARQL query language. Actually, if the query engine accesses
named graphs from a Web server, the "union of all allowed graphs" could
be just another resource on that server, and the query engine would need
no extensions.
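
To make the difference concrete, here is a minimal Python sketch of per-graph
matching versus matching against a merged "all allowed" graph. It is only an
illustration under stated assumptions: the in-memory GRAPHS store, the graph
names, the triples and the helper functions are all hypothetical and do not
correspond to any real store's API.

```python
# Hypothetical in-memory store: graph name -> set of (subject, predicate, object).
GRAPHS = {
    "private": {("alice", "foo:bar", "1")},
    "public": {("alice", "zob:zab", "2")},
    "secret": {("bob", "foo:bar", "3"), ("bob", "zob:zab", "4")},
}


def match_both(triples):
    """Match the pattern  ?s foo:bar ?baz ; zob:zab ?bing .  in one triple set."""
    return sorted(
        (s1, baz, bing)
        for (s1, p1, baz) in triples if p1 == "foo:bar"
        for (s2, p2, bing) in triples if p2 == "zob:zab" and s2 == s1
    )


def per_graph(allowed):
    """GRAPH ?g behaviour: both patterns must match inside a single allowed graph."""
    rows = []
    for name in allowed:
        rows.extend(match_both(GRAPHS[name]))
    return sorted(rows)


def union_graph(allowed):
    """The 'special' graph: the merge of every graph this user may see."""
    merged = set()
    for name in allowed:
        merged |= GRAPHS[name]
    return merged


allowed = ["private", "public"]  # assumed result of the access-control check
print(per_graph(allowed))                # []
print(match_both(union_graph(allowed)))  # [('alice', '1', '2')]
```

The per-graph version finds nothing because the two triples about "alice" sit
in different graphs, while the merged graph yields the row the user expects and
still excludes everything in the graph they may not see.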

Of course this does not preclude a better syntax for the same
functionality in SPARQL 2.0. 8-)

What do you think?
Best regards,
Jacek Kopecky


On Tue, 2008-09-02 at 14:15 -0700, Richard Newman wrote:
> One issue I have encountered in the past is that a query like
> 
>    SELECT * {
>      GRAPH ?g {
>        ?s foo:bar ?baz ;
>           zob:zab ?bing .
>      }
>      FILTER (allowed(?g))
>    }
> 
> will only return answers where *both* triple patterns match in the  
> same permitted graph. The user's intent is "match these two triple  
> patterns in the union of triples from allowed graphs", but the query  
> actually means "for each allowed graph, try to match these two triple  
> patterns". Fewer results are returned than they expect.
> 
> If your data is spread across multiple graphs that the user can see --  
> e.g., some of their triples are private and some public -- then you  
> hit this problem.
> 
> This limitation results in ugly workarounds such as
> 
> 
>    SELECT * {
>      GRAPH ?g1 {
>        ?s foo:bar ?baz
>      }
>      GRAPH ?g2 {
>        ?s zob:zab ?bing .
>      }
>      FILTER (allowed(?g1) && allowed(?g2))
>    }
> 
> GRAPH is the wrong construct to use for this sort of query. Probably  
> the right solution is to include the access control information in the  
> dataset construction ("FROM = every graph the user can see"):
> 
>    SELECT *
>    FROM <allowed-1>
>    FROM <allowed-2>
>    ...
>    WHERE {
>      ?s foo:bar ?baz ;
>         zob:zab ?bing .
>    }
> 
> but that means the query is specific to the user (or you have to use  
> out-of-band dataset selection).
> 
> Perhaps SPARQL 2.0 will have some construct that allows filtering the  
> dataset within the query, or otherwise address this issue. Individual  
> implementations, of course, can provide access control through other  
> means.
> 
> A couple of years ago I was working on a system that very heavily used  
> very complex access control. My ultimate conclusion was that standard  
> SPARQL was not very well suited to this kind of thing. That's an  
> interesting conclusion for a SPARQL implementor to draw, but there you  
> are :)
> 
> -R
>
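
The per-user FROM list from the quoted message could also be generated on the
server side before the query reaches the engine. Below is a rough Python sketch
of that idea, assuming the list of allowed graph IRIs comes from some separate
access-control lookup. The function name query_for_user and the example IRIs
are hypothetical, and plain string assembly like this is only a sketch, not a
substitute for a real SPARQL parser.

```python
def query_for_user(allowed_graphs, where_body):
    """Build a SELECT query whose dataset is exactly the user's allowed graphs."""
    if not allowed_graphs:
        # With no FROM clause the engine would fall back to its own default
        # dataset, which might expose graphs the user is not allowed to see.
        raise ValueError("user has no allowed graphs")
    for iri in allowed_graphs:
        # Reject characters that could break out of the <...> IRI delimiters.
        if any(ch in iri for ch in '<>"{}|^`\\') or any(ch.isspace() for ch in iri):
            raise ValueError(f"unsafe graph IRI: {iri!r}")
    from_lines = "\n".join(f"FROM <{iri}>" for iri in allowed_graphs)
    return f"SELECT *\n{from_lines}\nWHERE {{\n{where_body}\n}}"


# Hypothetical graph IRIs standing in for <allowed-1> and <allowed-2>.
print(query_for_user(
    ["http://example.org/allowed-1", "http://example.org/allowed-2"],
    "  ?s foo:bar ?baz ;\n     zob:zab ?bing .",
))
```

This keeps the WHERE clause identical for every user. Only the dataset
description changes, which is the "out-of-band dataset selection" trade-off
mentioned in the quoted message.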

Received on Tuesday, 2 September 2008 21:44:51 UTC