One comment springs to mind when I see wording about downloading files in parts: proxies that scan for malware.
In order to scan content, a proxy needs the entire entity. Therefore
range requests must be modified by the proxy.
Typically this is done by stripping the Range header in the upstream
request, and possibly stripping Accept-Ranges from server responses.
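As a sketch of that behaviour (the exact headers touched vary from product to product, and the hostnames below are made up):

```
Client -> Proxy:
  GET /big.iso HTTP/1.1
  Host: mirror.example.org
  Range: bytes=0-1048575

Proxy -> Origin (Range stripped so the whole entity can be scanned):
  GET /big.iso HTTP/1.1
  Host: mirror.example.org

Origin -> Proxy:
  HTTP/1.1 200 OK
  Accept-Ranges: bytes      <- may also be stripped before reaching the client
  Content-Length: 734003200
```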
So any proposed system that relies on Range requests is going to run
into problems with proxies that perform AV functions.
Even if the proxy downloads and scans the whole entity and then sends
just the requested parts to the client, that may satisfy the client in
terms of the response, but it will not satisfy it in terms of
responsiveness. Retrieving parts from multiple servers makes things
far worse, since the proxy will download the entire file from each
server that the client thinks it is fetching only a part of the file
from.
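To make the latency cost concrete, here is a minimal sketch (not any real proxy's code; the scan function is a placeholder) of why such a proxy is slow to first byte: it must buffer and scan the entire entity before it can serve even a small requested slice back to the client.

```python
def scan_for_malware(entity: bytes) -> bool:
    # Placeholder for a real AV engine; here we just flag a test signature.
    return b"EICAR" in entity

def serve_range(full_entity: bytes, range_header: str):
    """Parse a simple single-range header like 'bytes=0-1023' and return
    (status, body). The whole entity must already be in hand."""
    if scan_for_malware(full_entity):
        return 403, b""                      # block infected content
    spec = range_header.split("=", 1)[1]     # e.g. '0-1023'
    start_s, end_s = spec.split("-", 1)
    start = int(start_s)
    end = int(end_s) if end_s else len(full_entity) - 1
    return 206, full_entity[start:end + 1]   # 206 Partial Content

# The client asked for 1 KiB, but the proxy had to fetch all 10 MiB first.
entity = bytes(10 * 1024 * 1024)
status, body = serve_range(entity, "bytes=0-1023")
```

Multiply that buffering by the number of mirrors involved and the waste becomes obvious.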
This is therefore only going to become a bigger problem, since network
administrators aren't going to abandon antivirus any time soon.
I don't know what the best solution to this is. Unless the proxy knows
that the parts belong to the same file, and can piece the file together
itself to scan it, it can't avoid that resource use.
And there's no way around getting the whole file first if you want to
use heuristic scanning for malware at the gateway. This creates other
problems, which I attempted to address in an Internet Draft a while back.
Anthony Bryan wrote:
On Mon, Jul 27, 2009 at 11:03 AM, Henrik
This draft made a bit of a surprise appearance in the transport area.
My initial reaction is that there is a lot of obvious overlap with
other work, and some misunderstanding of basic HTTP functions like ETag.
The basic motivation behind the work may be reasonable, however.
I will try to catch the author for a more in-depth discussion shortly.
Very interesting; thanks for writing about this, Henrik. I hadn't seen
or heard of it.
For those unfamiliar with Metalink, we offer solutions to the same
problems (and more) in an XML format, as opposed to HTTP extensions.
So I'm interested in what people think about it (criticism, ideas,
etc.), because it may allow us to improve what we are doing. We're also
seeking review for our Internet Draft at
If anyone is interested in trying Metalink out, a good amount of
software is available, in the form of download managers (including the
most popular ones), Firefox extensions, command-line clients, and browsers.
While many Metalink clients, especially download managers, download
simultaneously from multiple mirrors, Metalink is really about giving
alternate locations so a download can complete (if a server goes down),
and about repairing downloads. Information about mirrors, such as
location and priority, can also be included.
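For readers who haven't seen one, here is roughly what a Metalink file looks like (element names follow the IETF Metalink draft format; the file name, hash value, and mirror URLs below are made-up examples):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="example.iso">
    <size>734003200</size>
    <!-- lets clients verify, and repair, a completed download -->
    <hash type="sha-256">3d0f...</hash>
    <!-- alternate locations, with country code and priority -->
    <url location="de" priority="1">http://mirror-de.example.org/example.iso</url>
    <url location="us" priority="2">http://mirror-us.example.org/example.iso</url>
  </file>
</metalink>
```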
Projects like cURL, OpenOffice.org, and most Linux distributions use
Metalinks for downloads, especially for large files.
More info here:
Here's the intro from draft-ford-http-multi-server-00:
"1. Introduction and Motivation
Mirrored HTTP servers are regularly used for software downloads,
whereby copies of data to be downloaded are duplicated on many
servers distributed around the Internet. Users are encouraged to
manually choose a nearby mirror from which to download. This is
intended to increase both throughput and resilience, and reduce load
on individual servers. Manual mirror choice rarely works well; users
do not wish to make a choice, but if they are not forced to, then the
default server takes a disproportionate share of the load. Even when
they are forced to choose, they rarely have enough information to
choose the server that will provide the best performance.
Some popular sites automate this process using DNS load balancing,
both to approximately balance load between servers, and to direct
clients to nearby servers with the hope that this improves
throughput. Indeed, DNS load balancing can balance long-term server
load fairly effectively, but it is less effective at delivering the
best throughput to users when the bottleneck is not the server but
This document specifies an alternative mechanism by which the benefit
of mirrors can be automatically and more efficiently realised. These
benefits are achieved using a number of extensions to HTTP which
allow the discovery of mirrors, the verification of the integrity of
files on each mirror, and the simultaneous downloading of chunks from
multiple mirrors. The use of this mechanism allows greater
efficiency in resource utilisation in the Internet as a whole,
balances server utilization, even on short timescales, and enhances
user experience through faster downloads."
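The client-side idea in that last paragraph can be sketched as follows (this is my own illustration, not the draft's actual mechanism; the mirror URLs are hypothetical): split a file of known length into chunk ranges, assign them round-robin across mirrors, and verify the reassembled whole against a known hash.

```python
import hashlib

def plan_chunks(length: int, chunk_size: int, mirrors: list):
    """Return a list of (mirror, start, end) inclusive byte ranges
    covering a file of the given length, spread across the mirrors."""
    plan = []
    for i, start in enumerate(range(0, length, chunk_size)):
        end = min(start + chunk_size, length) - 1
        plan.append((mirrors[i % len(mirrors)], start, end))
    return plan

def verify(data: bytes, expected_sha256_hex: str) -> bool:
    # Integrity check of the reassembled download against a known hash.
    return hashlib.sha256(data).hexdigest() == expected_sha256_hex

mirrors = ["http://m1.example/f", "http://m2.example/f"]  # hypothetical
plan = plan_chunks(10, 4, mirrors)
# -> [('http://m1.example/f', 0, 3), ('http://m2.example/f', 4, 7),
#     ('http://m1.example/f', 8, 9)]
```

Each planned range would then be fetched with a `Range: bytes=start-end` request to its mirror, which is exactly the pattern that runs afoul of the scanning proxies discussed earlier in this thread.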
Adrien de Croy - WinGate Proxy Server - http://www.wingate.com