Re: HTTP should be able to transfer part of a document from Owen Rees on 1995-03-09 (ietf-http-wg@w3.org from January to March 1995)

From: Owen Rees <rtor@ansa.co.uk>
Date: Thu, 09 Mar 1995 14:20:49 +0000
To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9503091420.AA23277@plato.ansa.co.uk>

"Adam Dingle" <DINGLE@ksvi.mff.cuni.cz> writes:
> Partial document retrieval is important for at least two reasons:
> 
> 1) If a long transfer is interrupted, it's possible to resume the transfer
> without beginning all over again.  (This was a major motivation for the FSP
> protocol, which allows partial transfers; FTP does not.)

According to RFC 959, FTP allows a transfer to restart from a checkpoint in
order to cope with long transfers that are interrupted. According to the man
page for the HP-UX ftp client, Unix systems typically use byte offsets as the
checkpoint markers. I have used this facility successfully several times; I
have even used it to retrieve the remainder of a file whose first part was
retrieved by HTTP through a proxy that timed out.
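
As a rough illustration of the byte-offset restart, here is a minimal sketch
in Python. It assumes the server accepts REST with a byte offset, and the
helper names resume_offset and resume_retrieve are my own, not part of any
standard. The only library call relied on is ftplib's retrbinary(), whose
rest argument causes a REST command to be sent before the RETR.

```python
import os


def resume_offset(local_path):
    """Return how many bytes are already held locally (0 if no file yet)."""
    return os.path.getsize(local_path) if os.path.exists(local_path) else 0


def resume_retrieve(ftp, remote_name, local_path):
    """Fetch the remainder of remote_name, appending it to local_path.

    ftp is an ftplib.FTP connection, or anything with the same
    retrbinary() method. Returns the offset the transfer restarted from.
    """
    offset = resume_offset(local_path)
    with open(local_path, "ab") as out:
        # A non-None rest makes ftplib send "REST <offset>" before RETR.
        # Assumption: the server treats the marker as a byte offset.
        ftp.retrbinary("RETR " + remote_name, out.write, rest=offset or None)
    return offset


# Usage (hypothetical host and file name):
#   from ftplib import FTP
#   ftp = FTP("ftp.example.org")
#   ftp.login()
#   resume_retrieve(ftp, "big-file.tar", "big-file.tar")
```

Note that this trusts the local partial file to be a true prefix of the remote
file, which is exactly the assumption that becomes questionable for HTTP.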

A problem with adding this sort of feature to HTTP is that dynamically created
resources, and other forms of variable resource, are much more common for HTTP
than for FTP. More care is therefore needed to ensure that the continuation
comes from the same representation of the same version of the resource,
although it could come from a different instance (i.e. an identical copy at a
different location). Although the HTTP protocol could be extended to have a
"restart from here" feature, the implicit assumption that we are always
dealing with moderately static data files needs to be exposed and questioned.
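
To make the "same representation of the same version" condition concrete,
here is a small sketch of the check a client might apply before continuing.
It is only an assumption on my part that Last-Modified, Content-Length and
Content-Type together are an adequate test; the Snapshot type and the
safe_to_continue function are hypothetical names. The location is
deliberately left out of the comparison, so an identical copy elsewhere
would still pass.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    """Metadata a client remembers about a resource (hypothetical fields)."""
    last_modified: Optional[str]
    length: Optional[int]
    content_type: Optional[str]


def safe_to_continue(before, now, received):
    """Decide whether `received` bytes fetched earlier can be continued.

    before: Snapshot taken when the interrupted transfer started
    now:    Snapshot taken just before restarting
    """
    # Missing metadata (typical of generated output) leaves nothing to compare.
    if None in (before.last_modified, before.length, before.content_type):
        return False
    # Any difference suggests a new version or a different representation.
    if before != now:
        return False
    # The bytes already held must fit inside the resource.
    return 0 <= received <= before.length
```

A dynamically generated resource will usually fail the first test, which is
the point: for such resources restarting from the beginning is the only safe
choice.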

The retrieval of parts of resources is more a question of the naming of the
parts of a structured resource, and is not really an HTTP protocol issue. The
question of what constitutes a resource, and how you name resources that are
parts of collections, is the sort of issue that crops up on the URI list, and
one that does not seem to have any completely general solution.

On the whole, I would rather encourage the development of smarter servers that 
can pick components by name out of structured resources like tar or zip 
archives in response to an appropriate URL. This could include delivering 
arbitrary ranges out of those resources for which this makes sense. Putting 
the component description into what the relative URL draft calls the 'param' 
part - i.e. after a ';' the way WN does - is a reasonable approach, but not 
the only one.
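
As a sketch of what such a server might do, the following Python picks a
named member out of a zip archive using the text after the ';'. The syntax is
purely my own assumption for illustration ("member" or
"member,start-end" with an inclusive byte range); it is not what WN or the
relative URL draft specifies, and split_param and fetch_component are
hypothetical names.

```python
import io
import zipfile


def split_param(url_path):
    """Split 'path;param' into (path, param); param is '' when absent."""
    path, _, param = url_path.partition(";")
    return path, param


def fetch_component(archive_bytes, url_path):
    """Return the archive member named in the ';' part of url_path.

    Hypothetical forms:  /pkg.zip;README   or   /pkg.zip;README,0-4
    Raises KeyError if the archive has no member of that name.
    """
    _, param = split_param(url_path)
    name, _, byte_range = param.partition(",")
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        data = zf.read(name)
    if byte_range:
        start, _, end = byte_range.partition("-")
        data = data[int(start):int(end) + 1]  # end offset is inclusive
    return data
```

The client sees only a URL; whether the bytes come from a file, an archive
member or a range within one is entirely the server's business.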

I believe that the current position is that servers can interpret the 
'url-path' in their URLs however they like within the constraints set by the 
various Accept...: headers. If the idea is to specify an interpretation for 
part of an http URL, then it would be a good idea to involve the URI group in 
the discussion since it seems to involve issues important to both groups.

Regards,
  Owen Rees <rtor@ansa.co.uk>
Information about ANSA is at <URL:http://www.ansa.co.uk/>.

Received on Thursday, 9 March 1995 06:29:47 UTC