a thought on p2p rww

(cc'd a few people who may not be on this list)

apologies as this is a bit abstract and isn't in any way well defined.

I've often thought about how the web can be very like a p2p system.

While discussing PeerPoint on the next-net list I outlined:

''I will say that it's nice to consider a P2P web where DNS is replaced 
by open DHTs, and where each node on the network is both client and 
server (peer) serving and responding to HTTP requests. If you think 
about it, HTTP is a nigh on perfect async communication method for peers 
to communicate with each other, POSTing back replies when they are ready.''
then later:
''If you take a bit of paper and draw a tiny fraction of the web on it, 
a  bunch of interconnected dots, nodes and arcs, then have a good ol' 
stare at it, you'll quickly recognise every architecture you can 
conceive in there: centralization, decentralization, neighbourhoods, 
client-server relations, hubs, p2p networks - clients can only make 
requests, servers can receive requests, peers can do both.''

So, for quite a while I've been thinking of machines that are http 
client+servers as peers.
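
To make that concrete, here's a minimal sketch in Python of what I
mean by a peer - one process playing both roles. The port, the paths
and the JSON fields "id" and "callback" are all invented for
illustration; none of this is a defined protocol:

    # Minimal sketch of an HTTP "peer": one process acting as both
    # server and client, POSTing the reply back when it's ready.
    import json
    import threading
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class PeerHandler(BaseHTTPRequestHandler):
        def do_POST(self):
            length = int(self.headers["Content-Length"])
            request = json.loads(self.rfile.read(length))
            self.send_response(202)   # accepted; the answer comes later
            self.end_headers()
            threading.Thread(target=reply_later, args=(request,)).start()

    def reply_later(request):
        # Now the same peer acts as an HTTP client, POSTing the reply
        # back to whatever callback URI the requester gave us.
        answer = json.dumps({"re": request["id"], "result": "..."}).encode()
        urllib.request.urlopen(urllib.request.Request(
            request["callback"], data=answer,
            headers={"Content-Type": "application/json"}))

    HTTPServer(("", 8080), PeerHandler).serve_forever()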

On another note, I had been thinking about the 'like' and '+1' system, 
and how each resource (say a blog post) seems to naturally want to be 
in control of that information; pretty much every blog owner's concern 
is to try and get those figures for +1s and likes back, then aggregate 
and display them on the page. Similarly, each person who creates a 
like/+1 also wants to be in charge of it: to give it anonymously, to 
give it for kudos, to store it to remember the liked thing, or to 
share the knowledge of its existence. Sometimes they (the agent in 
charge of the blog/blog-post, or the agent doing the liking) want to 
handle this functionality and store this data themselves, and 
sometimes they want to delegate it to a third party (or several).
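
As a rough sketch of those choices (field names and URIs invented,
just to pin the idea down):

    # A "like" as described above: who gave it (or None when
    # anonymous), how visible it is, and which store handles it -
    # the liker's own, or a delegated third party's.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Like:
        target: str              # URI of the liked thing, e.g. a blog post
        agent: Optional[str]     # the liker's URI, or None if anonymous
        visibility: str          # "public" | "private" | "anonymous"
        store: str               # URI of whichever store handles this like

    # Kept and given openly by the liker:
    mine = Like("http://example.org/post/42", "http://me.example/#me",
                "public", "http://me.example/likes/")
    # Delegated to a third party, and given anonymously:
    anon = Like("http://example.org/post/42", None,
                "anonymous", "http://third-party.example/likes/")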

From a conversation with melvin on this topic:
''
for each "like", the person may want to store it in their own (the 
agent's/resource's) storage, and they may want to notify the "liked 
thing" of the like publicly, privately or anonymously
each resource (the agent, and the liked thing) may wish to delegate 
handling/storage of likes over to a third party, or keep it private
and 3rd-party centralized aggregators, for example an IMDB or 
MediaServiceFoo, may want to ask for access to a set of likes in an 
anonymous, private or public way, very much in an OAuth fashion
to me it seems that:
a) there's a general pattern there which enables "like" management to 
be decentralized (for storage and handling) and centralized where 
needed (for suggestions and aggregation)
b) that pattern can be made uniform for any kind of data
c) personal data stores may need to be agent data stores, and an agent 
can be anything which handles information related to a specific thing
d) so every "thing" on the web may be nice to model as both the thing 
and the agent that manages it
e) if done through Links then resource management can be delegated out 
/ specified for each resource
''
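
Points (b) and (e) suggest something like the following sketch, with
invented relation names and URIs: each resource, acting as its own
agent, keeps (or publishes) a map from kind-of-data to the URI of
whoever handles that kind for it, so the same delegation pattern works
for likes, comments, subscriptions or anything else:

    # Sketch of (b)/(e): one uniform delegation pattern for any kind
    # of data. Relation names and URIs are invented for illustration.
    delegations = {
        "http://example.org/post/42": {
            "likes":    "http://aggregator.example/likes/",     # delegated out
            "comments": "http://example.org/post/42/comments",  # kept local
        },
    }

    def handler_for(resource: str, relation: str) -> str:
        """Where should data of this kind, about this resource, go?"""
        return delegations[resource][relation]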

And then I read the fantastic paper by Oshani Seneviratne and Lalana 
Kagal: "Who?, When?, How? and What?: Framework for Usage Tracking and 
Provenance of Web Resources"
http://dig.csail.mit.edu/2012/Papers/ISWC_In_Use/paper.pdf
To whet the appetite:
''Provenance Trackers are deployed as an overlay network on Planetlab 
and implemented as a Distributed Hash Table (DHT). This overlay network 
of provenance trackers is trusted. The trustworthiness of a provenance 
tracker node is guaranteed since entry to the PTN as a PTN node is 
recognized by a root provenance tracker node or the Gateway. Before a 
node is added as a PTN node the Gateway asks for the new node's 
credentials and verifies that with an existing PTN node. Any Web server
can participate in the PTN as long as they know the identity of any 
other participating node, and that existing node can vouch for the new 
node's credentials.'' .. and .. ''WebID Protocol as used in the PTN'' .. 
and .. ''HTTPA Smart Client This is a browser based tool (implemented as 
a Firefox extension) that supports all HTTPA functions'' .. and .. 
''ProvenanceJS is a javascript library that extracts the provenance 
information available in metadata of resources on any given Web page''
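
The join step reads roughly like this - and I stress this is just my
loose reading of the excerpt sketched in code, not the paper's actual
protocol or implementation (the real thing uses the WebID Protocol):

    # Loose sketch of the PTN join step as I read the excerpt above.
    # The Gateway admits a new node only if an existing PTN node can
    # vouch for its credentials. Registries and checks are invented.
    ptn_nodes = {"https://tracker-a.example/#node"}

    def existing_node_vouches(webid: str) -> bool:
        # Stand-in for an existing PTN node verifying the newcomer's
        # credentials (WebID in the paper).
        vouched_for = {"https://tracker-b.example/#node"}
        return webid in vouched_for

    def gateway_admit(webid: str) -> bool:
        if existing_node_vouches(webid):
            ptn_nodes.add(webid)
            return True
        return False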

I'd really recommend reading this paper, IMHO a must read for everybody 
in this group, regardless of their current focus.

Last night I was thinking about these three things (and many more 
related of course), and I think I realised something.

HTTP is based on the REST architectural style: an information- and 
implementation-hiding protocol which provides a uniform interface. 
That is, it hides whether there are 1 or 100,000 servers behind the 
interface, and it hides all implementation and hardware details, 
exposing via the protocol only what network and web agents need in 
order to communicate.

Thus, when I'd been thinking about *machines* which are both http 
server and client as peers or nodes on a webized p2p network, I was 
incorrect.

It would have to be *resources* which are capable of being in both the 
server and client role: those things which are web-accessible and 
identified by a dereferenceable URI.

This potentially models each resource as being an agent which is in 
charge of the information related to and about it (as well as itself) - 
and also models each resource as a peer/node on a webized p2p network.

Further, it means that each "capability" would be exposed via, say, 
Link headers and OPTIONS, and since they would point to resources on 
the web, each service or bit of functionality can be delegated to 
third parties, or retained and centralized around the resource. It can 
also of course be expressed easily in linked data: one can say that my 
list of friends is over <here>, that likes should be dumped <there>, 
that you can subscribe to updates <over-here>, that here's the 
<provenance-data>, and that usage of this resource should be reported 
<to-this-ptn>.
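
For instance (relation names and target URIs invented here; in
practice they'd be registered link relations or linked-data vocabulary
terms), a resource could answer OPTIONS with Link headers pointing
each capability at whoever handles it:

    # Sketch: a resource advertising delegated capabilities via Link
    # headers on OPTIONS. Relations and URIs are invented.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    LINKS = [
        ('<http://me.example/friends>',        'rel="friends"'),
        ('<http://aggregator.example/likes/>', 'rel="likes"'),
        ('<http://hub.example/subscribe>',     'rel="subscribe"'),
        ('<http://ptn.example/provenance/42>', 'rel="provenance"'),
    ]

    class ResourceHandler(BaseHTTPRequestHandler):
        def do_OPTIONS(self):
            self.send_response(204)
            for target, rel in LINKS:
                self.send_header("Link", "%s; %s" % (target, rel))
            self.end_headers()

    HTTPServer(("", 8081), ResourceHandler).serve_forever()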

So it appears to me that, by modelling each resource as an agent (even 
a dumb one that delegates with Link headers or via statements in 
linked data), we can view the web as a decentralized p2p rww-enabled 
network of interconnected nodes. One which is better than traditional 
P2P because it caters not only for peers, but for clients and servers 
too. And one which encompasses every network style known to man, as 
all of them are just nodes and arcs. I guess all we need to do is 
ensure each "thing" we create is webized in a uniform way and exposed 
through the uniform interface.

No wonder the RDF/EAV model suits RWW so well; they're almost identical.
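
Side by side (URIs illustrative):

    # An RDF triple is an entity-attribute-value row by another name.
    eav_row = ("post/42", "likedBy", "me")   # entity, attribute, value
    triple  = ("http://example.org/post/42",
               "http://example.org/vocab#likedBy",
               "http://me.example/#me")      # subject, predicate, object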

Anyway, just some thoughts.

Best,

Nathan
