RDFAuth_Sketch: what are we trying to solve

Dan Brickley asked the following question:
 > I'd like to see a 1 page "here's the problem we're trying to solve"
 > that doesn't itself specify any protocol design.
 > Borrowing from Stefan Decker's page at
 > http://www.stefandecker.org/the-heilmeier-catechism.html
 > quoting George Heilmeier:

I had partly answered some of these in the blog post on (the
controversially named) RDFAuth [1], but let me organize them as
suggested by Dan:

  1. What is the problem, why is it hard?
  ---------------------------------------

The problem can be stated either in general terms as a result of  
requirements on Linked Data or by giving a specific example of a  
Linked Data application. Let me start with a concrete example: Social  
Networks.

In order to break out of the current Social Networking (SN) data
silos, where each SN is a world unto itself and users cannot link to
users in other SNs, we require Linked Data. This can be built in an
open manner using vocabularies such as FOAF, and one can use these to
build Social Network browsers such as Beatnik [2] and servers such as
Knowee [3]. Social Network users, though, demand some level of
privacy. They want some information to be protected and available only
to subgroups of people that they themselves determine. For example, I
may wish to allow only friends of my friends to have access to my
network of friends, while strangers get only my basic business
information. Or you may wish only members of your (extended) family to
have access to your family tree.

   In order to make privacy controls possible in a Distributed
Network, Resource Servers need a way to identify the User Agent
Owner. This itself has to be done in a distributed (non-centralized)
way if we are not to create another bottleneck or control point.

   A Distributed Social Network such as the one described here will be
very decentralized. Every one of Tim Berners-Lee's acquaintances, as
described in his foaf file, has a URL on a different domain. If, when
browsing the network of Tim's acquaintances, a User Agent had to log
into each service with a new password, the software would be of no
interest to anyone. So the protocol cannot assume that a large initial
authentication cost is acceptable because it will be recovered over
a long session. In fact, with protected resources the authentication
cost has to be very low, because it may be difficult for a User Agent
to know in advance whether it has access to a resource or not.

   The Resource Server may not know the User Agent Owner by name, but  
may wish to determine whether to allow the User Agent access by  
understanding the owner's relation to other people in a Social  
Network. It must be possible for the protocol to find a flexible  
description of the User Agent Owner.
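
   To make the "friends of my friends" kind of rule concrete, here is a
minimal sketch of how a Resource Server might decide access from social
distance alone. It is an illustration under assumptions, not part of any
protocol: the KNOWS table, the example URLs and the within_hops function
are all hypothetical, and a real server would build the graph by
dereferencing FOAF files rather than from an in-memory table.

```python
# Toy stand-in for foaf:knows links. A real Resource Server would gather
# these by dereferencing FOAF files on the web; all URLs are made up.
ALICE = "http://alice.example/foaf#me"
BOB = "http://bob.example/foaf#me"
CAROL = "http://carol.example/foaf#me"
DAVE = "http://dave.example/foaf#me"

KNOWS = {
    ALICE: {BOB},
    BOB: {CAROL},
    CAROL: {DAVE},
}


def within_hops(owner, requester, max_hops, knows=KNOWS):
    """Return True if requester is reachable from owner in at most
    max_hops foaf:knows links (breadth-first search)."""
    if requester == owner:
        return True
    seen = {owner}
    frontier = {owner}
    for _ in range(max_hops):
        # Everyone known by the current frontier and not visited yet.
        frontier = {
            friend
            for person in frontier
            for friend in knows.get(person, ())
            if friend not in seen
        }
        if requester in frontier:
            return True
        seen |= frontier
    return False


# Policy from the example: friends of friends may see the network.
print(within_hops(ALICE, CAROL, max_hops=2))  # Carol is a friend of a friend
print(within_hops(ALICE, DAVE, max_hops=2))   # Dave is three links away
```

   The point of the sketch is only that the server needs a global
identifier for the requester (here a URL) to look up in the graph. It
never needs to know the requester by name.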

   If privacy controls are to be important, then one needs to think a
little more widely. It is not just that the Resource Server wishes to
protect information about the Resource Server Owner. The User Agent
Owner may also not wish everyone to know what software he is using, or
about whom he is requesting protected information.

  2. How is it solved today?
  --------------------------

  I don't think this has been solved today.

  The closest protocol I know of that attempts to create a single
sign-on is OpenID. It makes it possible for a person to have one single
global identifier and log onto any service on the web. The problems
with this protocol are (as described in [1]):
  - The cost of authentication is very high. It was designed with the
limitations of current web browsers in mind. As a result an
authentication request requires the browser owner to log in with his
OpenID.
  - The information about the User Agent available to the Web Service
is quite limited. It can only be property value pairs, and it is not
easily extensible.
  - It does not work well with Semantic Web standards.
  - It does not fit well into web architecture (REST).
  - The authentication server and attribute server are points of
control. The owner of these will know what services the User is
logging into. Though one can deploy one's own attribute server, this is
not easy at all. (Many services seem to accept only ids from specific
authentication servers.)

   Another protocol that has some relevance, OAuth, requires services
to agree beforehand on how authorization will work on a case-by-case
basis. This is not realistic in an open distributed social network.
Things have to be much more flexible than this.


  3. What is the new technical idea; why can we succeed now?
  ----------------------------------------------------------

   As with OpenID we use a URL to identify a Person (or more generally
any Agent) globally. But instead of requiring an Identity server we
use PGP asymmetric key cryptography to identify the User Agent Owner.
This is similar to the mutual authentication using client certificates
of https, except that we link the client certificates into a Web of
Trust tied together with Linked Data and we publish the public key at
a URL accessible via the User Id. The solution is RESTful and can make
use of the Network Effect of Linked Data. We build on well established
standards: HTTP for the protocol, REST for the architecture, URI for
the naming, RDF for the semantics, PGP for encryption.
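
   As a rough illustration of the "public key published at a URL" idea,
the sketch below shows only the lookup step a Resource Server could
perform: given the User Id URL a client claims and the public key it
presented, check that the document behind the URL lists that key. This
is a sketch under stated assumptions, not a protocol design. It assumes
the client has already proven possession of the matching private key by
some other means (for example a signed challenge), and the names
key_is_published, fetch_keys, FAKE_WEB and the example URL are all
hypothetical.

```python
def key_is_published(claimed_id, presented_key, fetch_keys):
    """Return True if presented_key is among the public keys published
    at the URL claimed_id.

    fetch_keys is a hypothetical callable: URL -> iterable of keys found
    in the document at that URL (in practice an HTTP GET plus RDF parsing).
    """
    try:
        published = set(fetch_keys(claimed_id))
    except LookupError:
        # Nothing could be retrieved for this URL: treat as unverified.
        return False
    return presented_key in published


# Toy "web": maps a User Id URL to the keys its document publishes.
# The URL and the key string are placeholders, not real data.
FAKE_WEB = {
    "http://alice.example/foaf#me": {"ALICE-PUBLIC-KEY"},
}


def fetch_from_fake_web(url):
    # Raises KeyError (a LookupError) when the URL is unknown.
    return FAKE_WEB[url]


print(key_is_published("http://alice.example/foaf#me",
                       "ALICE-PUBLIC-KEY", fetch_from_fake_web))
print(key_is_published("http://alice.example/foaf#me",
                       "SOMEONE-ELSES-KEY", fetch_from_fake_web))
```

   Because the check is a plain fetch of a public document, no identity
server needs to be contacted or to learn which Resource Server the User
Agent is talking to.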

Why can it succeed now? Semantic Web tools have now grown to be of  
good enough quality to develop this in pretty much every language and  
on any platform. The Social Networking Data Silo problem is very real,  
and will soon be felt by millions of people [4]. These new  
applications don't need to work around web browser limitations either.  
These applications can be built from scratch, and so they can develop  
the protocols that are needed to solve this problem.


  4. What is the impact if successful?
  ------------------------------------

   We will have the first hyperdata applications for the masses: open
distributed social network browsers and servers.

  5. How will the program be organized?
  -------------------------------------

   It has to be open, patent free and open sourceable. The details  
have to be determined.

  6. How will intermediate results be generated?
  ----------------------------------------------

   Beatnik, Knowee, Tabulator, openqabal can be used to test the  
protocol.

  7. How will you measure progress?
  ---------------------------------

    By hundreds of thousands of users joining the network.

  8. What will it cost?
  ---------------------

    To be determined by people with some experience in this.


	Henry Story


NOTES
=====

[1] http://blogs.sun.com/bblfish/entry/rdfauth_sketch_of_a_buzzword
[2] https://sommer.dev.java.net/ (search the page for Beatnik)
[3] http://knowee.org/
[4] http://blogs.sun.com/bblfish/entry/2008_the_rise_of_linked
     but also the Economist article
     http://www.economist.com/business/displaystory.cfm?story_id=10880936

Received on Tuesday, 1 April 2008 11:17:48 UTC