- From: Andrew de Andrade <andrew@deandrade.com.br>
- Date: Fri, 22 Jan 2010 00:33:10 -0200
Comments inline.

On Thu, Jan 21, 2010 at 8:24 PM, Mike Hearn <mike at plan99.net> wrote:

> WebSockets doesn't let you open arbitrary ports and listen on them,
> so, I don't think it can be used for what you want.

That's my understanding too. My question is whether it would be possible to open arbitrary ports and listen on them if the spec were changed. If so, what would be the chances of making a change like that this late in the game? And what would be the advantages and disadvantages of handling this at the HTML5 spec level instead of the browser level?

At some level I feel this needs some sort of treatment at the spec level for security reasons. You wouldn't want the browser to simply share all the static content from every site it has visited. Instead you'd want only content that is static to be "in the DMZ", so to speak.

Also, how would you go about making this work, since ultimately you still need some sort of implementation at the originating web server / web application level? It appears to me that implementing this at the browser level alone would not be enough. You'd need appropriate functionality on the originating web server side to provide instructions to the browsers, i.e. "here's the torrent file for this web application; in it you (the browser agent) will find a list of files available in distributed form from your peers and the hash table to verify authenticity."

It seems to me that this idea would also complement the manifest for offline browsing in HTML5. The torrent file would simply define a manifest of files that are to be fetched from peers instead of from the originating server. (I've put a rough sketch of what such a manifest might look like below.)

The browser would also need a central location to register itself, so that peers can discover other peers.

> P2P in general is a lot more complicated than it sounds. It sort of
> works for things like large movies and programs because they aren't
> latency sensitive and chunk ordering doesn't matter (you can wait till
> the end then reassemble).

On the other hand, when the swarm is large enough this isn't that big a problem, and this is an application where everyone is a seeder while they are using the application. Ideally you wouldn't be able to opt out of seeding, there would be no such thing as a ratio, and the browser would throttle throughput automatically.

In addition, the application could also specify a static location for the static content in case the client times out trying to get a file from a peer, i.e. "Hey browser, download these files from one of these peers. However, if you don't succeed within 1500 ms, feel free to contact me again and I'll send them to you."

> But it has problems:
>
> - A naive P2P implementation won't provide good throughput or latency
> because you might end up downloading files from a mobile phone on the
> other side of the world rather than a high performance CDN node inside
> your local ISP. That sucks for users and also sucks for your ISP who
> will probably find their transit links suddenly saturated and their
> nice cheap peering links with content providers sitting idle.

Any ideas on how this could be resolved? I figure that if the application is popular enough, peers could be geographically tagged using the "GPS" (geolocation) functionality of HTML5. Clients would automatically get better connections, throughput, and preference if they chose to make their location available to the "torrent" server, so that peers could look up peers nearby. (There's a second sketch of that registration step below as well.)
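To make the manifest idea a bit more concrete, here's a rough sketch of the kind of thing I imagine the originating server handing to the browser. To be clear, none of this comes from any spec; the field names, URLs, and values are all invented purely for illustration.

  // Hypothetical "peer manifest" -- purely illustrative, not from any spec.
  var peerManifest = {
    // where the browser registers itself and discovers other peers
    tracker: "https://tracker.myapplication.com/announce",

    // origin server and timeout for the fallback ("if you don't succeed
    // within 1500 ms, come back to me and I'll send it to you")
    fallbackOrigin: "https://static.myapplication.com",
    fallbackTimeoutMs: 1500,

    // the static files that may be fetched from peers, with hashes so the
    // browser can verify the authenticity of whatever a peer sends it
    files: [
      { path: "/js/app.js",    sha1: "<sha1-of-app.js>",   size: 184320 },
      { path: "/css/site.css", sha1: "<sha1-of-site.css>", size: 40960 }
    ]
  };

In spirit this is very close to the HTML5 offline manifest, except that each entry carries a hash and the browser is told where it may fetch the file from besides the origin.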
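And here's an equally rough sketch of the registration step: the browser announces the file hashes it can seed to the tracker, optionally attaching a coarse position from the W3C Geolocation API so the tracker can prefer nearby peers. The tracker endpoint and payload shape are, again, just made up.

  // Hypothetical: announce ourselves to the "torrent" server, with an
  // optional position so it can match us up with nearby peers.
  function registerWithTracker(trackerUrl: string, fileHashes: string[]) {
    function announce(coords?: { lat: number; lon: number }) {
      var xhr = new XMLHttpRequest();
      xhr.open("POST", trackerUrl);
      xhr.setRequestHeader("Content-Type", "application/json");
      xhr.send(JSON.stringify({ seeds: fileHashes, coords: coords }));
    }

    if (navigator.geolocation) {
      navigator.geolocation.getCurrentPosition(
        function (pos) {
          announce({ lat: pos.coords.latitude, lon: pos.coords.longitude });
        },
        function () { announce(); }  // user declined: register without a location
      );
    } else {
      announce();
    }
  }

The tracker's reply (a list of peer addresses, ideally sorted by proximity) is where the topology awareness Mike mentions below would have to live.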
> - That means unless you want to have your system throttled (or in
> companies/universities, possibly banned) you need to respect network
> topology and have an intimate understanding of how it works. For
> example the YouTube/Akamai serving systems have this intelligence but
> whatever implementation you come up with won't.

The truth is that network connectivity is getting better and better. As aggregate and average network speeds increase around the world, this problem will be overcome. As Eric Schmidt said back in '93 when he was at Sun Microsystems: "When the network becomes as fast as the processor, the computer hollows out and spreads across the network."

I wouldn't be surprised if principles from Erlang could be borrowed and applied to this idea, because from what I understand, Erlang has an extremely robust agent-based message-passing model with lots of error checking and recovery features.

> - P2P is far more complicated than an HTTP download. I never use
> BitTorrent because it basically never worked well for me, compared to
> a regular file download. You don't see it used much outside the pirate
> scene and distributing linux ISOs these days for that reason, I think.

As a product manager myself, I don't think you see it outside those two scenes because it doesn't hide enough of the complexity for mainstream use. Compare the following two:

A) Step 1: Point your browser at www.myapplication.com

B) Step 1: Point your browser at www.torrentclient.com
   Step 2: Download and install the torrent client
   Step 3: Configure the torrent client to work well with your network if you are behind a NAT
   Step 4: Find a torrent file repository
   Step 5: Find and download the torrent file you want and add it to your torrent client, if it does not already do so automatically

Other than step 3, all of those steps can be hidden from the user with the idea I suggested. Using a P2P-distributed application suddenly becomes as easy as typing in a URL and hitting enter. And as far as I can tell, step 3 can be resolved by IPv6.

> Your friends problem has other possible solutions:
>
> 1) Harvesting low hanging fruit:
> 1a) Making sure every static asset is indefinitely cacheable (use
> those ISP proxy caches!)
> 1b) Ensuring content is being compressed as effectively as possible
> 1c) Consider serving off a CDN like Akamai or Limelight. There is
> apparently a price war going on right now.

I'm going to check whether we've done 1a. AFAIK we've done 1b, and we already serve the content off a CDN (1c). We've also made sure to set the ETag and Last-Modified HTTP headers. (I've tacked a rough sketch of what I mean by 1a onto the end of this message.)

> and of course the ultimate long term solution
>
> 2) Scaling revenues in line with traffic

The reason I suggested the idea was to get away from cost entirely. The more popular something is, the cheaper it becomes to distribute: bandwidth costs effectively drop with popularity, an inverse price relationship and negative marginal cost. Imagine the impact on the economics of information dissemination. This would turn things on their head.

One dangerous side to this is that you would promote natural monopolies (like a stock market, where liquidity attracts more liquidity). It would suddenly become costly, relatively speaking, to get an aggregation site off the ground, because the massive players would have hyper-economies of scale. For example, if their video-sharing site is 25x as popular, their bandwidth cost would not be 25x but might be only 4x, because all the peers on their network would be providing the necessary bandwidth in swarm form.

On the other hand, there is suddenly an incredible amount of value in trying to produce quality content (you'd have a reason for the cult of the amateur to become the cult of the individual pro).

Okay, enough rambling.
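P.S. Here's the rough sketch of 1a I mentioned: serving one static asset with a far-future Cache-Control plus ETag/Last-Modified validators, so conditional requests can be answered with a 304. In practice this lives in the web server or CDN configuration rather than application code, and compression (1b) is usually handled by the proxy or CDN in front; the file path, ETag scheme, and port here are all made up for illustration.

  // Toy static-asset handler: long-lived caching plus conditional GETs.
  import * as http from "http";
  import * as fs from "fs";

  var assetPath = "./static/app.js";                    // hypothetical asset
  var body = fs.readFileSync(assetPath);
  var mtime = fs.statSync(assetPath).mtime;
  var etag = '"' + mtime.getTime().toString(16) + '"';  // simplistic validator

  http.createServer(function (req, res) {
    if (req.headers["if-none-match"] === etag) {
      res.writeHead(304);                               // cached copy still valid
      res.end();
      return;
    }
    res.writeHead(200, {
      "Content-Type": "text/javascript",
      "Cache-Control": "public, max-age=31536000",      // ~1 year: "indefinitely cacheable"
      "ETag": etag,
      "Last-Modified": mtime.toUTCString()
    });
    res.end(body);
  }).listen(8080);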
Received on Thursday, 21 January 2010 18:33:10 UTC