butlers and robots and access control from Henry Story on 2011-04-09 (public-xg-webid@w3.org from April 2011)

From: Henry Story <henry.story@bblfish.net>
Date: Sat, 9 Apr 2011 18:42:31 +0200
To: WebID XG <public-xg-webid@w3.org>
Message-Id: <9A92EC3B-AA03-47F6-86A5-11079691922E@bblfish.net>
I have been working on Clerezza (zz), which can now host very basic and uninteresting profiles and allow users to create WebIDs. I started to work on the ability for the server to fetch remote resources *for* its users, i.e. possibly protected ones. Perhaps I started this a bit too early. But here is an outline of what I discovered while working on this.

The most obvious way to proceed is for a zz server to create an additional public/private key pair for each of its users and build SSL client connections with that when it fetches information that may be under access control. Any server zz connects to can then decide which representation to return just by considering the WebID in the request and whether that WebID should or should not have access.

But it is more complex for Clerezza (and so for any such agent) because it may get completely different representations back depending on which agent it is serving. To make this more vivid, let me develop an example. Let us assume she.net is hosting Alice and Anna's content and he.org is hosting Bob's content. she.net and he.org are new types of agents, call them robots, as opposed to clients or servers, because they can be both.

Perhaps we can define that new class of Agents as

:Robot rdfs:subClassOf foaf:Agent .

Robots have admins, and they are usually instances of something related to a doap:Project . I won't bother defining those relations here right now.

Let's just name our two robots in n3

@prefix he: <https://he.org/> .
@prefix she: <https://she.net/> .

he:r1 a :Robot;
   :admin he:bob .

she:d1 a :Robot;
   :admin :jane .

We can also name a few humans

she:alice a foaf:Person.
she:anna a foaf:Person.

he:bob a foaf:Person.

So when she:d1 connects to he:r1 as she:alice requesting resource he:agda, he:r1 will return graph g1;
                                    she:anna  requesting resource he:agda, he:r1 will return graph g2 .

(somehow I can see this being very useful in Italy)

If she:d1 places those graphs at the same named location he:agda, she will overwrite the previous graph, and this could lead to quite a lot of problems for Bob (the human). So ZZ will need to keep the graphs formed from the representations seen by each of its users separate.
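
As a minimal sketch of what keeping the graphs separate could look like, the fetching robot can key its cache on the pair (user, resource) rather than on the resource alone. The class name PerUserGraphStore and the string placeholders "g1" and "g2" are assumptions made for illustration; this is not Clerezza's API.

```python
class PerUserGraphStore:
    """Hypothetical cache: one copy of each remote resource per local user."""

    def __init__(self):
        # Keyed by (user WebID, resource URI) so views never overwrite each other.
        self._graphs = {}

    def put(self, user, resource, graph):
        self._graphs[(user, resource)] = graph

    def get(self, user, resource):
        # Returns None if this user has never fetched the resource.
        return self._graphs.get((user, resource))


store = PerUserGraphStore()
# The same remote resource, fetched once as alice and once as anna.
store.put("https://she.net/alice", "https://he.org/agda", "g1")
store.put("https://she.net/anna", "https://he.org/agda", "g2")
```

Keying on the resource alone would leave only whichever graph was stored last, which is the overwrite problem described above.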

This issue does not arise, of course, if the server only hosts the content of one agent, which is why building a FreedomBox with one user per box would be so much simpler. (Note: the same goes for cell phones.)

If one did not want to immediately deal with this issue of keeping different views separate, one could have the server she:d1 connect as herself. She would then just need to add a foaf:knows relation to each of the accounts she was serving. Well, foaf:knows would not work because it is defined as relating two people, and neither she:d1 nor he:r1 are people. So perhaps an :employs relation could be coined

she:alice :employs she:d1 .
she:anna :employs she:d1 .
he:bob :employs he:r1 .

Now when she:d1 fetches information for :anna she will at least be granted more access than an anonymous agent (because she:d1 will be known by he:r1 as being related to anna and alice through the :employs relation). But he:r1 won't know, of course, whether at that point she is serving alice or anna. So he:r1 will have to send only information that both may see. He won't be able to treat she:d1 as equivalent to either anna or alice, which is good, but this would be problematic for very large servers with millions of people on them, since sharing information with everyone in such a large entity is equivalent to making it public. Such an entity, if it identified itself as one of the robots, would in effect always be treated as anonymous by everybody else. The larger the number of people a robot serves, the more it will be treated as anonymous.
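
To make the "only what everyone may see" rule concrete, here is a small sketch under assumed data: the access control list, the resource he:diary and the function name visible_to_robot are all hypothetical. A resource is released to the robot only if every one of its employers is allowed to read it.

```python
def visible_to_robot(acl, employers):
    """Return the resources that every employer of the robot may read.

    acl maps a resource name to the set of WebIDs allowed to read it.
    A robot with no known employers gets nothing beyond public access.
    """
    return {res for res, readers in acl.items()
            if employers and employers <= readers}


# Hypothetical access rules on he:r1.
acl = {
    "he:agda": {"she:alice", "she:anna"},
    "he:diary": {"she:alice"},
}

# she:d1 is employed by both alice and anna.
employers = {"she:alice", "she:anna"}
shared = visible_to_robot(acl, employers)
```

Here only he:agda survives. As the set of employers grows, fewer resources satisfy the subset test, which is the sense in which a very large robot ends up being treated as anonymous.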

So there is really not much of a way around this. Any service hosting multiple users will, when fetching information for a user, need to authenticate as that user and keep the graphs destined for each user separate.

At the same time it is clear that these large services are themselves able to see into each of the graphs, even if they keep these separate. This has come up a lot on this list: it matters who owns the server on which one places one's information. Is it a large company? Then individuals speaking there are speaking as employees. Is it a university? Then you may have trust in some of what is being said. Is it a government id? Don't make jokes about blowing up airports. Is it a Freedom box? Then the owner of the box is the only one to see what appears there - and he could say anything at all.

It would seem like a good idea if one could tie the serving agent into the loop, so that agents on which requests were made could take these facts into account. Luckily X509 comes with an Issuer Alternative Name field in which one can also place a WebID. And this WebID can describe the owner of the service. Of course DNS also gives some information as to who the owner of the service is, but a WebID allows for a lot more complete information to be served.

If we are to put an IAN in the cert, then we cannot have the certificates served to end users by the site be self-signed, or there will be an immediate owl:sameAs relation between all agents on the site. They will have to be signed by a certificate representing the service itself.

(Here there is an interesting option that comes to mind, mentioned by Peter Williams a year or so ago: one could use the public key published by that TLS service to create a new signing certificate. Of course these won't be verifiable using usual public key crypto from the root CA on down, since those CAs put restrictions in the EE certs as to how far their responsibility for certification goes: they will certify the service, not what the service decides to certify. We should try to work out the details of the pros and cons of this.)

But with such a service certificate available, perhaps the relation of authority becomes clearer too. The service profile could publish a POWDER document specifying which resources on the server it certifies directly, and perhaps which ones it merely guarantees to serve bit for bit, as the owner of that subset of resources asked it to. In any case it can also describe the ownership of the service: is it a university, a government, a bank, a church, a freedombox....

There may be some interesting inferences one can draw here too: namely that if the person connecting to a service uses the exact same public key as the one in the certificate served by that service, then he is the owner of that service. A way to distinguish Freedom Box owners from others cryptographically, perhaps.
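
A rough sketch of that inference, assuming both public keys have already been extracted from the certificates as DER-encoded bytes (the extraction itself is not shown, and the byte strings below are placeholders, not real keys). The function names are hypothetical.

```python
import hashlib
import hmac


def key_fingerprint(spki_der: bytes) -> bytes:
    """SHA-256 digest of an encoded public key."""
    return hashlib.sha256(spki_der).digest()


def is_service_owner(client_spki_der: bytes, service_spki_der: bytes) -> bool:
    """True if the connecting client presents the service's own public key."""
    return hmac.compare_digest(key_fingerprint(client_spki_der),
                               key_fingerprint(service_spki_der))


# Placeholder bytes standing in for real DER-encoded public keys.
box_key = b"placeholder for the box's public key"
visitor_key = b"placeholder for some other public key"

owner_connects = is_service_owner(box_key, box_key)
visitor_connects = is_service_owner(box_key, visitor_key)
```

The comparison only means something if the client has proven possession of the matching private key, which is what the TLS client authentication step already does.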

	Henry

PS. Butlers are agents that know how to keep complex information between the members of the household separate. But anyone talking to a butler should know who he works for.

Social Web Architect
http://bblfish.net/
Received on Saturday, 9 April 2011 16:43:06 UTC