Re: Is a Perfect Storm Forming For Distributed Social Networking? from Dave Raggett on 2009-08-12 (public-xg-socialweb@w3.org from August 2009)

From: Dave Raggett <dsr@w3.org>
Date: Wed, 12 Aug 2009 13:30:19 +0100 (BST)
To: Tim Anglade <tim.anglade@af83.com>
cc: Story Henry <henry.story@bblfish.net>, Melvin Carvalho <melvincarvalho@gmail.com>, public-xg-socialweb@w3.org
Message-ID: <alpine.DEB.2.00.0908121248590.5823@ivy>

On Wed, 12 Aug 2009, Tim Anglade wrote:

>> On Wed, 12 Aug 2009, Story Henry wrote:
>> 
>>> On 12 Aug 2009, at 10:46, Dave Raggett wrote:
>> 
>>>> One of the challenges for distributed social networking is 
>>>> dealing with sudden hotspots where a huge spike of interest in 
>>>> a single person causes the server that hosts that person's 
>>>> profile to falter under the load.
>
> Wait, what? Actually that scenario is almost a non-issue since a 
> single profile or asset (video, etc.) can be cached pretty 
> efficiently. There are algorithms to detect what is becoming 
> trendy very fast and replicate many copies where it's famous 
> before it becomes an issue (Akamai bought its first two private 
> islands from the IP of this kind of algorithm). On the open-source 
> side, one can very easily implement his own, simple algorithm to 
> push popular pages into a few memcached thrown here and there, 
> dynamically.

That assumes there is a caching proxy somewhere. Akamai has a 
business model based around contracts with big corporate websites. 
What about small personal ones? If the SNS is accessed via an app on 
your personal server, then yes, I agree that a server-side module 
could use a cloud-based proxy instead of requesting a resource 
directly from someone else's personal server. But what is the 
business model for that proxy? Who pays for it?
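
To make the caching idea quoted above concrete, here is a minimal 
sketch of "push popular pages into a cache", assuming some proxy 
does exist to run it. The class name, the fetch callable, the 
threshold and the window are all hypothetical, and a plain dict 
stands in for the memcached nodes.

```python
import time
from collections import defaultdict, deque


class HotPageCache:
    """Promote a page into a cache once it is requested often
    enough within a short time window (illustrative only)."""

    def __init__(self, fetch, threshold=3, window=10.0,
                 clock=time.monotonic):
        self.fetch = fetch          # hypothetical: url -> content from origin
        self.threshold = threshold  # requests needed before promotion
        self.window = window        # seconds over which requests are counted
        self.clock = clock
        self.hits = defaultdict(deque)  # url -> recent request times
        self.cache = {}             # stand-in for memcached nodes

    def get(self, url):
        # Serve from the cache once the page has been promoted.
        if url in self.cache:
            return self.cache[url]

        # Record this request and drop ones older than the window.
        now = self.clock()
        recent = self.hits[url]
        recent.append(now)
        while recent and now - recent[0] > self.window:
            recent.popleft()

        # Still has to hit the origin server for this request.
        content = self.fetch(url)
        if len(recent) >= self.threshold:
            self.cache[url] = content
            del self.hits[url]
        return content
```

Once a profile crosses the threshold, the origin server stops 
seeing requests for it. The sketch says nothing about who operates 
the cache, which is the open question.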

> The real technical burdens today in Social Networking come from 
> assets or information that can't be cached too easily without 
> missing users' expectations. (Aggregated) Activity Feeds (such as 
> the one presented on your homepage on Facebook not your profile's) 
> for example, are very very hard to handle properly.
>
> (They are even harder to handle in a distributed or peer-to-peer 
> context, if you were about to ask.)

That can be addressed by distributing the computation of such 
assets, through the definition of distributed services.
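
As a rough illustration only: if each friend's server exposes its 
own feed sorted newest first, an aggregated feed can be computed by 
merging those feeds on whichever node serves the reader. The 
function name and the (timestamp, author, text) entry layout are 
assumptions made for this sketch.

```python
import heapq
from itertools import islice


def aggregate_feed(friend_feeds, limit=10):
    """Merge per-server feeds into one feed, newest first.

    Each feed is an iterable of (timestamp, author, text) tuples
    that is already sorted newest first (an assumption here).
    """
    merged = heapq.merge(*friend_feeds, key=lambda e: e[0], reverse=True)
    # Only the first `limit` entries are pulled from the sources.
    return list(islice(merged, limit))


alice = [(30, "alice", "posted a photo"), (10, "alice", "joined")]
bob = [(20, "bob", "wrote a note")]
print(aggregate_feed([alice, bob], limit=2))
```

The merge is lazy, so each server only has to supply as many 
entries as the reader actually views.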

>>>> This suggests the value for applying peer to peer techniques to 
>>>> dynamically distributing the load across many machines. Peer to 
>>>> peer techniques can also help to sustain performance for search 
>>>> by distributing the processing across many machines.
>
> Well, I think you really mean distributed here instead of 
> peer-to-peer. There's no distinct advantage peer-to-peer has over 
> plain distributed for this scenario. Actually, being able to 
> enforce a minimal amount of control (availability of nodes, forced 
> instantiation of new nodes in extreme contexts) gives 
> “corporate-distributed” a small advantage here over peer-to-peer.

I am thinking about P2P in the sense of collaborating personal web 
servers and distributed services, rather than traditional P2P in the 
purest sense. If most people can have dynamically provisioned 
personal servers, then so much the better!

Thanks for your exposition of QoS issues for some existing P2P 
systems. I agree that this remains an issue for real-time 
performance of streaming media, due, for example, to the limited 
redundancy of Internet routing and to flash points where traffic 
exceeds the capacity. As a case in point, Internet access from my 
home is good for the most part, but every now and then suffers from 
brownouts and DSL disconnections.

Content wouldn't disappear as long as it is on the original server. 
Of more interest is when a user wants to delete some resource she 
owns. A well-defined distributed service would make this easier to 
fulfill than conventional HTTP caches, which hold on to the resource 
until it reaches its expiry time. I wouldn't get too hung up on existing 
P2P. We should focus on identifying and addressing the needs for an 
open social web, and adapting the technologies rather than living 
with their limitations.
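
The deletion point can be sketched as follows. This is a toy model 
under stated assumptions, not a proposal for a protocol: Replica 
and Origin are hypothetical classes, and the notify flag simply 
contrasts a replica that waits for its TTL with one that is told 
about the deletion.

```python
class Replica:
    """A copy holder that expires entries after a fixed TTL."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.store = {}  # url -> (content, expires_at)

    def put(self, url, content, now):
        self.store[url] = (content, now + self.ttl)

    def get(self, url, now):
        entry = self.store.get(url)
        if entry is None or now >= entry[1]:
            self.store.pop(url, None)  # expired or never held
            return None
        return entry[0]

    def purge(self, url):
        # Explicit deletion requested by the owner's server.
        self.store.pop(url, None)


class Origin:
    """The owner's server, which knows where its copies live."""

    def __init__(self):
        self.resources = {}
        self.replicas = []

    def publish(self, url, content, now):
        self.resources[url] = content
        for replica in self.replicas:
            replica.put(url, content, now)

    def delete(self, url, notify=True):
        self.resources.pop(url, None)
        if notify:
            # Distributed-service behaviour: copies go away at once.
            for replica in self.replicas:
                replica.purge(url)
        # With notify=False the copies linger until their TTL
        # runs out, as with a conventional HTTP cache.
```

With notify=False a deleted photo stays readable from the replica 
until its TTL passes; with the default it is gone on the next 
request.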

I take your point about the difficulty of general problems like 
multi-point, heterogeneous data synchronization, but the problems are 
often quite practical to solve if you narrow them down to what is 
actually needed for a specific context. By defining the open social 
web as a set of specific services, we can then find ways to make 
them scale through distribution. It's all doable in an open 
collaboration.

  Dave Raggett <dsr@w3.org> http://www.w3.org/People/Raggett

Received on Wednesday, 12 August 2009 12:30:29 UTC