Re: wikileaks - Web Architecture and Robustness from Robin Berjon on 2010-12-10 (www-tag@w3.org from December 2010)

From: Robin Berjon <robin@berjon.com>
Date: Fri, 10 Dec 2010 13:07:44 +0100
To: Tim Berners-Lee <timbl@w3.org>
Cc: Melvin Carvalho <melvincarvalho@gmail.com>, Karl Dubost <karld@opera.com>, "www-tag@w3.org WG" <www-tag@w3.org>
Message-Id: <F8840B4E-7F48-4F73-AEC2-00D59E9CF0E8@berjon.com>

On Dec 6, 2010, at 17:05 , Tim Berners-Lee wrote:
> On 2010-12-06, at 09:21, Robin Berjon wrote:
>>> There are 200 'mirrors' now listed, and counting.
>> 
>> Only because this is a high profile case with a large sympathetic community. If similar censorship methods had been levelled at a smaller, less popular cause that isn't a press and Twitter darling, it would likely be offline by now (or at the very least see its operation much more seriously affected).
> 
> So if (say) those who point to a page have a random tendency to cache it just in case,
> they should coordinate so that those who point to the less popular sites
> should increase their chance of being a mirror in order to make sure that everything 
> will end up being mirrored somewhere -- and you can find it automatically by following the backlink?
> "Mutual Aid"

Essentially, an automatic version of this process is what I would see HTTP over P2P do. If it's possible to stream full-screen video in quasi real time using this infrastructure, we should be able to do it for generic web content.

>> WikiLeaks is also simpler because it's static content — you can mirror it with a single wget command. With a more elaborate service requiring complex setup, or the synching of a DB, it would be far more problematic. In other words, we shouldn't take WikiLeaks' resilience as a general indication.
> 
> Of course standards help.  Linked data can be mirrored of course just like HTML.
> 
> A SPARQL service is well-defined, a mirror can get a copy of the data in
> a standard transfer format, stick it in their favorite triple store, and turn on SPARQL.
> But it isn't automatic.

Not only is it not automatic, but it only works for static data. Suppose, for instance, you had a site tracking up-to-date reports of atrocities being committed during a country-wide genocide, one that is proving very helpful in directing aid, journalists, international observers, etc. If that service is shut down by people who don't want such nosiness, mirroring the old data won't help much, since the value is largely in its freshness.

This seems to point to further bricks that might be needed. We need P2P DNS so that addressing can be resilient, and P2P caching/distribution to be able to recover data and face DDoS. For dynamic services, something more is needed. There's a lot of talk about the "cloud" but at this stage we have no such thing. Some companies offer "cloud services" but that's just renting someone's really big computer. It scales nicely, but there's nothing cloudish about it. A genuine cloud architecture would be fully distributed across the network (perhaps in a fashion similar to what Tahoe-LAFS http://tahoe-lafs.org/trac/tahoe-lafs does). I'm guessing that a better integration of P2P into the Web alongside standard micropayments might get us there.
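
One ingredient of the P2P caching/distribution brick can be sketched in a few lines of Python: if content is addressed by a hash of its bytes, a copy recovered from any peer can be checked without trusting that peer. This is only an illustration under my own assumptions, not how Tahoe-LAFS works (it does considerably more, including encryption and erasure coding); the function names are made up, and peers are modelled as plain callables rather than network nodes.

```python
import hashlib
from typing import Callable, Iterable, Optional

# A "peer" here is just a callable that returns bytes for an address,
# or None if it has no copy. Real peers would be network nodes.
Peer = Callable[[str], Optional[bytes]]


def content_address(data: bytes) -> str:
    """Derive the address from the content itself (SHA-256 hex digest)."""
    return hashlib.sha256(data).hexdigest()


def fetch_verified(address: str, peers: Iterable[Peer]) -> Optional[bytes]:
    """Ask peers in turn and accept the first copy whose hash matches."""
    for peer in peers:
        data = peer(address)
        if data is not None and content_address(data) == address:
            return data
    return None  # no peer had an intact copy


# Usage: one peer is offline, one serves a tampered copy, one is honest.
original = b"<html>static page</html>"
address = content_address(original)
peers = [lambda a: None, lambda a: b"tampered", lambda a: original]
recovered = fetch_verified(address, peers)
```

Note that this only covers static content; it does nothing for the dynamic services discussed above.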

-- 
Robin Berjon - http://berjon.com/

Received on Friday, 10 December 2010 12:08:14 UTC