Re: Comments on Byte range draft from Chuck Shotton on 1995-11-13 (ietf-http-wg@w3.org from October to December 1995)

From: Chuck Shotton <cshotton@biap.com>
Date: Mon, 13 Nov 1995 11:43:29 -0600
To: Benjamin Franz <snowhare@netimages.com>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <v02130512accd2c69fa57@[198.64.246.22]>

At 8:53 AM 11/13/95, Benjamin Franz wrote:
>On Sun, 12 Nov 1995, Gavin Nicol wrote:

>On consideration - the EOL conversion issue is just another red herring.

Look, can we cut out the "red herring" red herrings? I want to see some
constructive discussion about why a CGI-based implementation of byte ranges
is unacceptable. Nobody has presented a coherent argument against it. I
would like to see some rationale behind why the URL standard needs to be
changed to support an application-specific form of query for a specific
subset of data types. I'd like to see a well-reasoned discussion on your
part about the valid concerns that server implementors have regarding the
need to re-render entire documents to serve you a byte range for low-value
reasons like interrupted file transfers, etc., and why you think it is OK
to ignore these concerns.

>It simply doesn't matter to the question of byte ranges. Byte ranges
>*clearly* should apply to the transmitted byte stream - not the server side
>representation of said byte stream. And one way or another the byte
>stream is *always* generated. And unless your conversion is
>non-deterministic the byte-stream will always be the same from the same
>source document for a given GET request. The byte is inherently the
>atomic unit of information in HTTP.

This is a trivial academic point. Please stop fixating on it. Of course all
data flows across an HTTP connection as a linear stream of bytes. That's the
whole legacy of a Von Neumann architecture. Now that we are past Data Comm
101, let's talk about whether or not it makes sense to force byte stream
manipulations on applications that deal with pages, images, database
records, complicated relational objects, real-time generated displays, etc.
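
To make the implementation concern concrete, here is a minimal Python
sketch of what a byte range over generated content implies. The names
render_document and serve_byte_range are hypothetical and belong to no
real server; the sketch assumes the server keeps no cached copy of its
output, so it has to produce the whole stream before it can cut out the
requested slice.

```python
def render_document(records):
    """Hypothetical renderer: turns stored records into the byte stream
    a client would receive for a plain GET."""
    lines = ["%d: %s" % (i, r) for i, r in enumerate(records, start=1)]
    return ("\r\n".join(lines) + "\r\n").encode("ascii")


def serve_byte_range(records, first, last):
    """Return bytes first..last (inclusive) of the rendered stream.

    Assumption: nothing is cached, so the full document is re-rendered
    even when only a few bytes are wanted.
    """
    body = render_document(records)  # full render, regardless of range size
    return body[first:last + 1]


records = ["alpha", "beta", "gamma"]
full = render_document(records)
part = serve_byte_range(records, 4, 11)
print(len(full), part)
```

Note that the requested slice starts in the middle of one record and
ends in the middle of the next: a byte offset knows nothing about the
units the application actually deals in.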


You seem to want to trivialize the function of the Web and associated
applications to the lowest common denominator of moving the contents of a
file from point A to point B, a byte at a time. This is an absurdly
low level of abstraction. It's like talking about building a house by
aligning protein and sugar molecules so that they form 2x4s, lining up iron
atoms into nails, etc.

There is a LOT more to the problem space than simply bytes, and a
one-size-fits-all byte range proposal is unsuitable from an implementation
perspective as well as a philosophical one. Clients have no business
knowing about the internal representation of items stored at a given URL,
and allowing them to ask for specific chunks out of a file with something
as coarse as a byte range violates the entire principle of a URL.
Specifically, a URL is supposed to be a client-opaque method for requesting
information from a server. URLs are given to clients by servers or by
humans. The path information contained in a URL is supposed to be the
private domain of the server. Allowing a client to generate or manipulate
this portion of a URL outside the bounds of the current hierarchy
manipulations allowed in the URL standard is WRONG.

As desirable as it may seem, there are serious integrity and consistency
problems that this opens up for the Web at large if this becomes an
integral part of the URL standard. On the other hand, if it is a convention
that is adopted through the simple addition of a conforming CGI, the server
retains the ability to tell the client how to access information it
controls, the client gains the ability to access portions of documents, and
the overall model that governs how URLs are manipulated is maintained.

This is all jerk-off verbiage and I apologize for contributing my own dreck
to the torrent in this thread. However, there are some serious conceptual
problems and some difficult implementation problems raised by naively
shoving byte ranges into the URL syntax, and I am concerned that these
issues are being glossed over or ignored in a frenzy to get something
pasted into a standards document.

> What connects a server to a client
>*is* a byte stream. End point representations of that byte stream
>should be completely irrelevant to the issue of byte ranges.

Well, technically it's a bit stream. But even so, why are we continuing to
twaddle about in the low-level bits and bytes of this problem space when we
could be building easier-to-use application-level standards? If you are
intent on forcing server implementors to pre-render all content so you can
be served a "byte stream" of particular offset and length, why not ask for
something that makes a little more sense in the context of the WWW, like
pages, images, records, etc.? And what's more, why not allow servers to
tell the client FIRST what representation it has chosen for partial
document transfers (byte, page, record, etc.) and tell the client HOW to
make the request for these document portions? This last point is, I think,
the most important one to consider.
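
One way the "server tells the client first" idea could look is sketched
below in Python. Everything here is an assumption made for illustration:
the function names, the dictionary fields, the /cgi-bin/report path and
the ?unit=first-last query form are invented and come from no draft or
existing server. The point is only that the client fills in a template
the server handed it and never constructs a range syntax on its own.

```python
def describe_partial_access(url, unit, count):
    """Hypothetical server reply: which unit this resource is divided
    into, how many units exist, and the URL template for asking for some."""
    return {
        "unit": unit,    # e.g. "byte", "page", "record"
        "count": count,  # number of units the server is willing to serve
        "template": url + "?{unit}={{first}}-{{last}}".format(unit=unit),
    }


def build_part_request(description, first, last):
    """Client side: fill in the server-supplied template, nothing more."""
    if not (1 <= first <= last <= description["count"]):
        raise ValueError("range outside what the server advertised")
    return description["template"].format(first=first, last=last)


desc = describe_partial_access("/cgi-bin/report", "page", 12)
print(build_part_request(desc, 2, 3))
```

Because the template comes from the server, one document can be offered
by page and another by record or byte without the client needing to
know anything about how either is stored.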

Aw heck, let's just replace URLs with SQL queries and be done with it.
That's where this whole discussion is heading anyway.

--_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
Chuck Shotton                               StarNine Technologies, Inc.
chuck@starnine.com                             http://www.starnine.com/
cshotton@biap.com                                  http://www.biap.com/
                 "Shut up and eat your vegetables!"
Received on Monday, 13 November 1995 09:48:18 UTC