Re: HTTP Extensions for Simultaneous Download from Multiple Mirrors

On Mon, Jul 27, 2009 at 11:03 AM, Henrik
Nordstrom<henrik@henriknordstrom.net> wrote:
> This draft made a bit of surprise appearance in the transport area
> meeting today:
>
> http://tools.ietf.org/html/draft-ford-http-multi-server
>
> My initial reaction is lots of obvious overlap with other work and
> misunderstandings of basic HTTP functions like ETag.
>
> Basic motivation behind the work may be reasonable however.
>
> I will try to catch the author for a more in-depth discussion shortly.
>
> Other opinions?

Very interesting, thanks for writing about this, Henrik. I hadn't seen
or heard of it.

For those unfamiliar with Metalink, we offer solutions to the same
problems (and more) in an XML format, as opposed to HTTP extensions.

So I'm interested in what people think about it (criticism, ideas,
etc.), because it may help us improve what we are doing. We're also
seeking review of our Internet-Draft at
http://tools.ietf.org/html/draft-bryan-metalink

If anyone is interested in trying Metalink out, a good amount of
software is available: download managers (most of the popular ones),
Firefox extensions, command line clients, and browsers.

While many Metalink clients, especially download managers, download
simultaneously from multiple mirrors, Metalink is really about listing
alternate locations so that a download can complete if a server goes
down, and about repairing damaged downloads. Information about
mirrors, such as location and priority, can also be included.
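
To make the fallback idea concrete, here is a rough Python sketch of
what a client might do with a list of mirrors. It is only an
illustration, not code from any Metalink client: the function name,
the (priority, url) tuples, the "lower number is tried first"
ordering, and the choice of SHA-256 are all assumptions made for the
example. The fetch function is passed in by the caller so the logic
can be tried without a network.

```python
import hashlib


def download_with_fallback(mirrors, fetch, expected_sha256):
    """Try mirrors in priority order and return the first verified copy.

    mirrors: iterable of (priority, url); lower priority number is tried
             first (an assumption made for this sketch).
    fetch:   callable taking a URL and returning bytes, raising OSError
             on network failure (hypothetical, supplied by the caller).
    """
    errors = []
    for _priority, url in sorted(mirrors):
        try:
            data = fetch(url)
        except OSError as exc:
            # Server is down or unreachable: move on to the next mirror.
            errors.append((url, str(exc)))
            continue
        if hashlib.sha256(data).hexdigest() == expected_sha256:
            return url, data
        # Corrupt or outdated copy: treat it like a failed mirror.
        errors.append((url, "checksum mismatch"))
    raise RuntimeError(f"all mirrors failed: {errors}")
```

This only verifies the whole file. Repairing a partly damaged
download would need checksums for pieces of the file as well, which
is left out here to keep the sketch short.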

Projects like cURL, OpenOffice.org, and most Linux distributions use
Metalinks for downloads, especially for large files.

More info here:

http://www.metalinker.org/
http://en.wikipedia.org/wiki/Metalink

Here's the intro from draft-ford-http-multi-server-00:

"1. Introduction and Motivation


   Mirrored HTTP servers are regularly used for software downloads,
   whereby copies of data to be downloaded are duplicated on many
   servers distributed around the Internet.  Users are encouraged to
   manually choose a nearby mirror from which to download.  This is
   intended to increase both throughput and resilience, and reduce load
   on individual servers.  Manual mirror choice rarely works well; users
   do not wish to make a choice, but if they are not forced to, then the
   default server takes a disproportionate share of the load.  Even when
   they are forced to choose, they rarely have enough information to
   choose the server that will provide the best performance.

   Some popular sites automate this process using DNS load balancing,
   both to approximately balance load between servers, and to direct
   clients to nearby servers with the hope that this improves
   throughput.  Indeed, DNS load balancing can balance long-term server
   load fairly effectively, but it is less effective at delivering the
   best throughput to users when the bottleneck is not the server but
   the network.

   This document specifies an alternative mechanism by which the benefit
   of mirrors can be automatically and more efficiently realised.  These
   benefits are achieved using a number of extensions to HTTP which
   allow the discovery of mirrors, the verification of the integrity of
   files on each mirror, and the simultaneous downloading of chunks from
   multiple mirrors.  The use of this mechanism allows greater
   efficiency in resource utilisation in the Internet as a whole,
   balances server utilization, even on short timescales, and enhances
   user experience through faster downloads."
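
As a rough illustration of "simultaneous downloading of chunks from
multiple mirrors", the Python sketch below splits a file of known
size into byte ranges and assigns them to mirrors round-robin. This
is not the mechanism the draft specifies: the function name, the
fixed chunk size, and the round-robin assignment are assumptions for
the example. Only the "bytes=first-last" form of the HTTP Range
header is standard.

```python
def plan_ranges(total_size, mirrors, chunk_size):
    """Return a list of (mirror_url, range_header_value) pairs.

    Chunks are assigned round-robin; a real client would more likely
    hand the next chunk to whichever mirror finishes first.
    """
    plan = []
    for index, start in enumerate(range(0, total_size, chunk_size)):
        # HTTP byte ranges are inclusive, so the last byte is end - 1.
        last = min(start + chunk_size, total_size) - 1
        mirror = mirrors[index % len(mirrors)]
        plan.append((mirror, f"bytes={start}-{last}"))
    return plan
```

Each pair would become one GET request to that mirror with the given
Range header, and the client would reassemble the pieces in order.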


-- 
(( Anthony Bryan ... Metalink [ http://www.metalinker.org ]
  )) Easier, More Reliable, Self Healing Downloads

Received on Thursday, 30 July 2009 22:29:42 UTC