Re: Comments on Byte range draft from Benjamin Franz on 1995-11-12 (ietf-http-wg@w3.org from October to December 1995)

From: Benjamin Franz <snowhare@netimages.com>
Date: Sun, 12 Nov 1995 09:59:40 -0800 (PST)
To: Chuck Shotton <cshotton@biap.com>
Cc: Gavin Nicol <gtn@ebt.com>, montulli@mozilla.com, fielding@avron.ICS.UCI.EDU, masinter@parc.xerox.com, ari@netscape.com, john@math.nwu.edu, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <Pine.LNX.3.91.951112093549.10603A-100000@ns.viet.net>

On Sun, 12 Nov 1995, Chuck Shotton wrote:

> >On Sun, 12 Nov 1995, Gavin Nicol wrote:
> 
> >> Byte ranges are a lazy replacement for a general naming mechanism.
> >
> >You still have those blinders on. The whole universe of documents is
> >not SGML/HTML/PDF/(favorite text markup language with naming mechanism).
> >The ability to restart an interrupted transfer is an item that naming
> >mechanisms are insufficiently powerful to handle in the general case.
> >Byte ranges are not a 'lazy replacement' - they are the only general
> >mechanism for restarting interrupted transfers of documents containing
> >arbitrary content.
> 
> We've had this discussion before but here we go again. Wanna see how
> screwed up byte ranges can really be? OK, here's the prime example. Suppose
> you're a HTTP client and you're half way through downloading a huge HTML
> document, the transfer terminates, and you decide to resume the transfer at
> the "appropriate" byte offset. Fine concept on the surface, but an
> implementation and performance nightmare for the following reason.
> 
> Only in the case of binary files does the byte stream transmitted by the
> server stand a chance of being identical to what it has stored locally
> on disk. In the case of multi-fork or multi-part files, this won't be the
> case. In the case of HTML files, for example, end of line termination ruins
> the whole byte range theory. Many servers politely convert their machine
> specific EOL sequence into a normalized version (LF or CR/LF) for the
> transmission of text-only files.

This is a severely broken behavior by a server. And IRRELEVANT. Since 
byte range requests are not recognized by today's servers - you obviously 
cannot break one by insisting the damn server keep its hands off the 
content if it supports byte range requests. IOW: Find another red herring 
to drag.

> This means that there is a potential creep
> of at least 1 byte per EOL in the file. A client asking to resume a
> transfer at byte 900,000 only has the data stream it was receiving to go
> by. This means that the server has to completely re-read the file,
> translating line ends and looking for the "virtual" 900,000th byte as
> rendered by the server. It's not simply a matter of jumping forward in the
> file 900,000 bytes and resuming reading.

That is the server author's problem. Really. If the server 
author is stupid enough to try and *PARSE* a document when not 
explicitly requested to do so - they deserve all the headaches they bring on 
themselves. They are clearly violating the intent of '8 bit clean' by 
saying '8 bit clean, except if we think you messed up your end of lines 
we are going to re-write them, so you can't reliably use \x0a and \x0d 
because we might change their number or order without warning you. Sorry 
about that.'
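
The offset creep being argued over can be sketched in a few lines of 
Python. This is only an illustration, assuming a server that stores 
LF-terminated text and rewrites each LF to CR/LF on the way out; the 
names translate_eol and disk_offset_for are hypothetical and do not 
come from any real server.

```python
def translate_eol(raw: bytes) -> bytes:
    """Hypothetical server-side rewrite: every bare LF becomes CR/LF.

    Assumes the on-disk copy uses LF only.
    """
    return raw.replace(b"\n", b"\r\n")


def disk_offset_for(raw: bytes, wire_offset: int) -> int:
    """Map an offset in the translated stream back to an on-disk offset.

    There is no shortcut: the file is scanned from the start, counting
    two bytes sent for every LF and one for everything else. Assumes
    wire_offset does not fall between the CR and LF of one pair.
    """
    sent = 0
    for i, byte in enumerate(raw):
        if sent >= wire_offset:
            return i
        sent += 2 if byte == 0x0A else 1
    return len(raw)


raw = b"ab\ncd\nef"              # 8 bytes on disk
wire = translate_eol(raw)        # 10 bytes on the wire
start = disk_offset_for(raw, 8)  # where to resume reading on disk
```

A server that sends the stored bytes untouched never needs 
disk_offset_for at all: the wire offset and the disk offset are the 
same number.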

[deleted special purpose solution for PDF documents]

Pull the blinders back off. IGNORE PDF. There is a general problem with 
restarting partially transmitted documents, of which PDF is just a special 
case. We need a method of saying *for any document what-so-ever*: 
"Send me bytes 10000 through 20000".
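
As a rough sketch of what that costs a server that keeps its hands off 
the content: open the file in binary mode, seek, read. The helper name 
read_range is hypothetical, and the range is assumed to be inclusive at 
both ends, so "10000 through 20000" is 10001 bytes.

```python
def read_range(path: str, first: int, last: int) -> bytes:
    """Return bytes first..last (inclusive) of the file, untranslated."""
    if first < 0 or last < first:
        raise ValueError("invalid byte range")
    with open(path, "rb") as f:        # binary mode: no EOL rewriting
        f.seek(first)                  # jump straight to the offset
        return f.read(last - first + 1)
```

Nothing here depends on the content type; the same call works for 
HTML, PDF, or an arbitrary binary. A range running past the end of the 
file simply comes back short.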

-- 
Benjamin Franz

Received on Sunday, 12 November 1995 08:53:27 UTC