HTTP should be able to transfer part of a document

------- Forwarded Message Follows -------
Date:          Wed, 8 Mar 1995 10:54:10 +0100
From:          MAILER-DAEMON@ms.mff.cuni.cz (Mail Delivery Subsystem)
Subject:       Returned mail:  Host unknown (Name server: cuckoo.hp1.hp.com: host not found)
To:            <DINGLE@ksvi.mff.cuni.cz>

In HTTP/1.0 there seems to be no way to retrieve a given part of a document,
e.g. bytes 1500000 through 1600000 of a long binary file.  This seems to be an
important limitation, but I haven't seen any discussion of adding partial
document retrieval to the next version of HTTP, so I thought I'd bring it up
here.

Partial document retrieval is important for at least two reasons:

1) If a long transfer is interrupted, the client can resume it from the point
of interruption instead of beginning all over again.  (This was a major
motivation for the FSP protocol, which allows partial transfers; FTP does not.)

2) The client may be interested in only part of the contents of a document.
For example, consider using HTTP to retrieve a .tar or .zip file.  With partial
retrieval, the client program could retrieve only the archive's directory
information (the zip central directory, or the tar member headers), and then
only the archive files which the user was interested in.  Today, there are
thousands of useful archive files available on the Web; often I just want to
read a README.TXT file within such an archive to see if it is of interest, but
must download the entire archive.  Without partial retrieval, HTTP prevents me
from writing a smart Web browser which can look at files inside tar and zip
archives.
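
Reason (1) can be sketched in a few lines of Python.  This is only an
illustration under stated assumptions: byte offsets are taken to be zero-based
and inclusive, and serve_range() is a hypothetical stand-in for a server that
honors partial requests; none of these names come from an existing protocol.

```python
def remaining_range(bytes_received, total_length):
    """Return the inclusive (first, last) range still missing, or None."""
    if bytes_received >= total_length:
        return None
    return (bytes_received, total_length - 1)


def serve_range(document, first, last):
    """Hypothetical server side: return bytes first..last inclusive."""
    return document[first:last + 1]


def resume_transfer(partial, document):
    """Finish an interrupted transfer by fetching only the missing tail."""
    missing = remaining_range(len(partial), len(document))
    if missing is None:
        return partial  # nothing left to fetch
    first, last = missing
    return partial + serve_range(document, first, last)
```

A client that had received the first 300 bytes of a 1024-byte document would
ask only for bytes 300 through 1023 rather than for the whole document.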

One possibility would be to address partial HTTP documents using URI 
fragment identifiers, so that, for example, a hypertext page could refer to an individual 
file within a tar archive, or to a group of several paragraphs within the text of a 
book.  For example, we might use a syntax such as

http://www.cuni.cz/a/b/xxx#(1500000,1600000)

to mean bytes 1500000 through 1600000 of the given document.
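
As a rough sketch of how a client might interpret such a URL, the Python below
splits off the fragment and parses the "(first,last)" form.  The function name
parse_byte_range_url is hypothetical, and the syntax is only the one suggested
above, not part of any URI specification.

```python
import re
from urllib.parse import urldefrag

# Matches the proposed "(first,last)" fragment, e.g. "(1500000,1600000)".
_RANGE_FRAGMENT = re.compile(r"\((\d+),(\d+)\)")


def parse_byte_range_url(url):
    """Split url into (base_url, (first, last)); the range is None if absent."""
    base, fragment = urldefrag(url)
    match = _RANGE_FRAGMENT.fullmatch(fragment)
    if match is None:
        return base, None
    first, last = int(match.group(1)), int(match.group(2))
    if first > last:
        raise ValueError("range start exceeds range end")
    return base, (first, last)


base, byte_range = parse_byte_range_url(
    "http://www.cuni.cz/a/b/xxx#(1500000,1600000)"
)
```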

Of course, such number-based addressing is somewhat dangerous to use in a
static URL, because if the addressed document changes (e.g. a new file is
inserted into the tar archive) then the range could reference meaningless or
garbage data.
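
The hazard is easy to reproduce with a toy example.  In the Python sketch
below, the byte strings are made-up stand-ins for an archive, and offsets are
assumed to be zero-based and inclusive.

```python
def extract(document, first, last):
    """Return bytes first..last inclusive of document."""
    return document[first:last + 1]


# A toy "archive" with two members, and a fixed range recorded for the second.
original = b"README: hello\n" + b"DATA: 0123456789\n"
first = original.index(b"DATA")
last = len(original) - 1
wanted = extract(original, first, last)

# A new member is inserted at the front; the recorded range is now stale.
updated = b"NEWFILE: inserted\n" + original
stale = extract(updated, first, last)
```

After the insertion, the same offsets no longer select the second member, so
a static URL carrying them would silently return the wrong bytes.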

As an alternative, we might carry the byte range not in the HTTP URL, but
rather in a request header or elsewhere in the HTTP GET request.  This would
imply that such addressing is not really appropriate for a URL, but is a
sort of meta-capability which a smart client might use to implement features
such as (1) and (2) above.
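
To make the header variant concrete, here is a minimal Python sketch that
builds such a GET request.  The header name "Range" and the "bytes=first-last"
value syntax are assumptions not taken from this message, and
build_range_request is a hypothetical helper.

```python
def build_range_request(path, first, last, header_name="Range"):
    """Build a GET request that carries the byte range in a request header."""
    if first < 0 or last < first:
        raise ValueError("invalid byte range")
    lines = [
        f"GET {path} HTTP/1.0",
        f"{header_name}: bytes={first}-{last}",  # assumed header syntax
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


request = build_range_request("/a/b/xxx", 1500000, 1600000)
```

Keeping the range out of the URL means the same URL always names the whole
document, and only clients that understand the header need to act on it.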

I would be very interested to hear any comments about this.

Adam Dingle

Received on Wednesday, 8 March 1995 02:23:10 UTC