Re: SPARQL Security - Best Practices? from Paul Gearon on 2008-09-03 (semantic-web@w3.org from September 2008)

From: Paul Gearon <gearon@ieee.org>
Date: Wed, 3 Sep 2008 17:04:21 -0500
To: "Semantic Web at W3C" <semantic-web@w3.org>
Message-ID: <a25ac1f0809031504w4949ec33u6df2d91325e8636e@mail.gmail.com>

On Tue, Sep 2, 2008 at 9:41 PM, Richard Newman <rnewman@twinql.com> wrote:
> If you think about it, these two things are the same: modifying the data
> directly to include access control information. (Not quite the same as
> adding metadata to the graph, which is an annotation.)
>
> The results are also the same: you have to alter your query correspondingly.
>
> The difference comes in the complexity of the alteration. SQL is happy to
> oblige when you add another column; just one more WHERE clause.
>
> If you store your addition explicitly in RDF, your SPARQL queries get much
> more complex, because there is no provision for introducing new abstractions
> (and you have choices to make: when you have a triple between two resources,
> which resource do you look at for access control -- s, o, or both?).
>
> If you store your new info "behind the scenes" -- extending your
> implementation -- your queries remain simple, but SPARQL doesn't know the
> new info exists... so you need another channel for communicating your access
> control credentials, and you can't write SPARQL queries that interrogate
> those new statements.

In 2002 I implemented access control for Tucana using something like
what is being discussed here. However, I was using TQL, which provided
some extra functionality over SPARQL, since the DAWG hadn't been
formed at that point.

In our case, we had readAccess and writeAccess predicates (among
others) which were used to assign graph access permissions for user
IDs (which were represented as URIs). Select queries then became
intersections between graphs that the user had read access to and the
graphs they had asked for. These access control statements were stored
in the "security graph", which was controlled in a similar way to the
other graphs (thereby allowing an admin to have access to it). This
required the database to be brought up in an "admin" mode in order to
write data to the original security graph. Once that was done, the
database could be brought up normally.
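
As a rough illustration of the scheme described above, here is a minimal Python sketch, not the Tucana code (which was never released). The predicate URIs, user URIs, graph names, and function names are all hypothetical. It also treats the intersection at the level of graph names, which is a simplifying assumption; the original may well have intersected graph contents instead.

```python
# Hypothetical names throughout: the predicate URIs, graph names, and
# function names are illustrative, not the Tucana implementation.
READ_ACCESS = "urn:example:readAccess"
WRITE_ACCESS = "urn:example:writeAccess"

# The "security graph": (user URI, permission predicate, graph URI) statements.
SECURITY_GRAPH = {
    ("urn:user:alice", READ_ACCESS, "urn:graph:sales"),
    ("urn:user:alice", READ_ACCESS, "urn:graph:hr"),
    ("urn:user:alice", WRITE_ACCESS, "urn:graph:sales"),
    ("urn:user:bob", READ_ACCESS, "urn:graph:sales"),
}


def graphs_with(user, permission, security_graph=SECURITY_GRAPH):
    """Return the set of graphs for which `user` holds `permission`."""
    return {g for (u, p, g) in security_graph if u == user and p == permission}


def effective_graphs(user, requested, security_graph=SECURITY_GRAPH):
    """Intersect the graphs a query asked for with those the user may read."""
    return set(requested) & graphs_with(user, READ_ACCESS, security_graph)
```

With this data, a select by bob over both the sales and hr graphs would only ever be resolved against the sales graph, and a user with no statements in the security graph would see nothing.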

Graph intersections are part of the TKS/Kowari/Mulgara algebra, but
they are implemented by distributing the graph expression out to each
BGP in the WHERE clause, so it's feasible to do something similar in
SPARQL. The main trick is to make sure that explicit GRAPH expressions
are not used to bypass any intersections you try to create in this
way.
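
To make the distribution idea concrete, here is a small Python sketch under stated assumptions. The `Graph` class, the `distribute` function, and the representation of a WHERE clause as a list of triple patterns are all hypothetical; this is neither Mulgara's algebra nor a SPARQL parser. It pairs each triple pattern with the set of graphs it may be matched in, and it intersects explicit GRAPH clauses with the readable set as well, so they cannot be used as a bypass.

```python
from dataclasses import dataclass
from typing import Tuple

Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class Graph:
    """An explicit GRAPH <name> { ... } clause (hypothetical representation)."""
    name: str
    patterns: Tuple[Triple, ...]


def distribute(where, default_graphs, readable):
    """Pair every triple pattern with the set of graphs it may be matched in."""
    readable = frozenset(readable)
    default_allowed = frozenset(default_graphs) & readable
    scoped = []
    for element in where:
        if isinstance(element, Graph):
            # An explicit GRAPH clause overrides the default graphs, so it is
            # intersected with the readable set on its own.
            allowed = frozenset({element.name}) & readable
            scoped.extend((allowed, pattern) for pattern in element.patterns)
        else:
            scoped.append((default_allowed, element))
    return scoped
```

A pattern paired with an empty set simply matches nothing. This sketch only handles GRAPH clauses that name a graph by URI; a GRAPH clause with a variable would need its bindings filtered against the readable set, which is not shown here.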

I'd love to point people at this code (well, except for the
embarrassment factor of people pointing out my mistakes), but it was
kept proprietary, and didn't make it into the Open Source projects of
Kowari or Mulgara. I've talked about re-implementing this for Mulgara,
but have been advised against it, in case the current owners of my
original code get upset. However, I should note that it's *possible*
this code will be released to the wild in the foreseeable future, so
maybe I can point to it if that happens.

Regards,
Paul Gearon

Received on Wednesday, 3 September 2008 22:04:58 UTC