
Re: Comments on Byte range draft

From: Chuck Shotton <cshotton@biap.com>
Date: Sun, 12 Nov 1995 11:23:59 -0600
Message-Id: <v02130506accbdd4938c7@[]>
To: Benjamin Franz <snowhare@netimages.com>
Cc: Gavin Nicol <gtn@ebt.com>, montulli@mozilla.com, fielding@avron.ICS.UCI.EDU, masinter@parc.xerox.com, ari@netscape.com, john@math.nwu.edu, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
At 9:59 AM 11/12/95, Benjamin Franz wrote:
>On Sun, 12 Nov 1995, Chuck Shotton wrote:
>> >On Sun, 12 Nov 1995, Gavin Nicol wrote:
>> Only in the case of binary files does the byte stream transmitted by the
>> server stand a chance of being identical to what it has stored locally
>> on disk. In the case of multi-fork or multi-part files, this won't be the
>> case. In the case of HTML files, for example, end of line termination ruins
>> the whole byte range theory. Many servers politely convert their machine
>> specific EOL sequence into a normalized version (LF or CR/LF) for the
>> transmission of text-only files.
>This is a severely broken behavior by a server. And IRRELEVANT. Since
>byte range requests are not recognized by today's servers - you obviously
>cannot break one by insisting the damn server keep its hands off the
>content if it supports byte range requests. IOW: Find another red herring
>to drag.

Oh, servers that do ISO translations, Shift-JIS translations, or line end
translations that conform to the MIME standard are "severely broken?"
Methinks not. Perhaps you should study the available range of servers out
there before dismissing this particular issue as a "red herring." Server
side document translation is a HUGE issue, and many servers support it. The
point that the local storage representation of a document is not the same
as the content transmitted to a client is a very real and valid one.
Overlooking this issue or dismissing it shows a very narrow understanding
of server implementation issues and the state of server development as it
exists today. Please take a little more time to understand what is going on
here before you unilaterally decide that your position is correct.
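
As a concrete illustration (a sketch of mine, not part of the original exchange; the server behavior and document contents are hypothetical), here is how normalizing Macintosh-style CR line endings to the CRLF canonical form for transmitted text changes the byte count:

```python
# Hypothetical sketch: a server stores an HTML file with Mac-style CR line
# endings but transmits it with normalized CRLF sequences. The stored and
# transmitted representations differ in length, so byte offsets into one
# do not line up with byte offsets into the other.

def normalize_eol(stored: bytes) -> bytes:
    """Convert bare CR or bare LF line endings to CRLF for transmission."""
    # Collapse any existing CRLF first so it is not doubled, then expand.
    unified = stored.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    return unified.replace(b"\n", b"\r\n")

stored = b"<HTML>\r<BODY>\rHello\r</BODY>\r</HTML>\r"   # as kept on disk
sent = normalize_eol(stored)

print(len(stored))   # 36 bytes stored
print(len(sent))     # 41 bytes transmitted -- one extra byte per line
```

Every line end adds one byte of creep, which is exactly why a byte offset into the transmitted stream says nothing reliable about the stored file.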

>> This means that there is a potential creep
>> of at least 1 byte per EOL in the file. A client asking to resume a
>> transfer at byte 900,000 only has the data stream it was receiving to go
>> by. This means that the server has to completely re-read the file,
>> translating line ends and looking for the "virtual" 900,000th byte as
>> rendered by the server. It's not simply a matter of jumping forward in the
>> file 900,000 bytes and resuming reading.
>That is the server author's problem. Really. If the server
>author is stupid enough to try and *PARSE* a document when not
>explicitly requested to do so - they deserve all the headaches they bring on
>themselves. They are clearly violating the intent of '8 bit clean' by
>saying '8 bit clean, except if we think you messed up your end of lines
>we are going to re-write them, so you can't reliably use \x0a and \x0d
>because we might change their number or order without warning you. Sorry
>about that.'

There is NOTHING in any standard or convention that says a server cannot
convert content prior to sending it to a client. Especially when that
client is served out of something other than the canonical Unix file
system. One day everyone will realize that life on the Internet is NOT a
Unix server, NOT a Unix file system, and certainly NOT a Unix client. Your
assumption that the server just blindly tosses data with no interpretation
to the client is naive at best and points out exactly why I am raising this
as an issue.
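
The re-reading cost described above (finding the "virtual" 900,000th byte as rendered by the server) can be sketched as follows; this is my own illustration, under the assumption of a server that stores bare-LF text but transmits CRLF:

```python
# Hypothetical sketch: to honor a resume request at transmitted byte N, a
# translating server cannot simply seek to stored byte N. It must re-scan
# the file from the beginning, counting *transmitted* bytes (each stored
# LF goes out as the two bytes CR LF) until it reaches the virtual offset.

def stored_offset_for(stored: bytes, virtual: int) -> int:
    """Map an offset in the CRLF-translated output stream back to an
    offset in the stored, LF-terminated file."""
    sent = 0
    for i, ch in enumerate(stored):
        width = 2 if ch == 0x0A else 1   # LF expands to CR LF on the wire
        if sent + width > virtual:
            return i
        sent += width
    return len(stored)

# A 1,000,000-byte stored file made of 10-byte lines (9 chars plus LF):
data = (b"x" * 9 + b"\n") * 100_000
# Resuming at transmitted byte 900,000 lands well short of stored byte
# 900,000 -- and the server had to scan everything before it to know that.
print(stored_offset_for(data, 900_000))
```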

Too many people wrongly assume that the data served by an HTTP server is
stored on the server's local storage medium in exactly the same
byte-for-byte format as it is transmitted to the client. This is an
incredibly wrong assumption, and it will become more so as the complexity of
server-side objects increases and the amount of translated or generated
content grows. The entire byte range concept is rooted (mired) in the
concept that WWW servers sit on top of (Unix) file systems. The simple fact
is that they don't. And more and more the trend is away from file system
document trees because of the semantically poor representation structure
they provide. Servers residing on databases will likely be the norm within
a year or so, and then what does a byte range URL get you?

>[deleted special purpose solution for PDF documents]
>Pull the blinders back off. IGNORE PDF. There is a general problem with
>restarting partially transmitted documents, of which that is just a special
>case. We need a method of saying *for any document what-so-ever*:
>"Send me bytes 10000 through 20000".

Perhaps you should take YOUR blinders off and realize that HTTP transmitted
content is not a simple byte stream as far as the server is concerned. The
data objects are more complex than a simple binary file on disk, the
operations and translations performed on them are more involved than
blindly reading content off the disk and squirting it onto the net, and
server implementors have a lot more to be concerned with than
implementing an inefficient, partial solution to the problem you describe.

I would turn the problem around and ask why clients end up with partial
documents in the first place. You seem to want to implement Zmodem file
transfers on top of HTTP, with the ability to resume an interrupted
transfer. The net is not a modem connection. TCP/IP is ostensibly a
reliable delivery mechanism. You either get all the data or you don't. So,
why are you getting partial files? Is your client broken? If so, that
doesn't seem to warrant a change to the URL standard to fix it. Is the
server broken? Ditto. Is your net connection flaky? Again, not an HTTP or
URL problem.

If the issue is to deliver portions of an entire document because that
portion is a recognizably distinct object that the browser can deal with, I
say let the server specify how those parts are to be requested and
delivered. This is a much more rational, useful reason for byte range
extensions to exist. Trying to justify them with some specious argument
about resumed file transfers is perhaps the biggest red herring of all.

And trying to anticipate every server-side representation scheme by
generalizing everything to a byte stream takes us back to 1975 when Xmodem
showed up on the scene. LIFE IS NOT A BYTE STREAM. The Web already deals
with more complex objects than this. The semantics of these objects are
arguably outside the scope of the HTTP and URL standards and should be
negotiated between the applications that produce and consume these objects,
not the underlying transport protocol or the static addressing scheme for
these objects.

Let's talk about matched sets of viewers/CGIs instead of warping every
standard under the sun and getting into endless academic standards
discussions. The existing standards are already more than sufficient to
support any type of client/server application you can think up. But you
must be willing to accept a paradigm shift away from trying to
legislate/mandate everything within the context of a narrow standard
towards a model where the application-level protocols that ride on top of
these standards cooperate to provide a more complicated exchange of info
than the underlying protocols can hope to represent.

It's fun to wave your hands and tell everyone what to do because some
overengineered standard says so. It's a lot more fun to cooperate with
others on the net to build something bigger than a few paper documents by
making APPLICATIONS cooperate. And it's a heck of a lot faster and easier
to do, too. You, I, and everyone else on this list are not wise enough or
experienced enough to anticipate the possible future directions that
information technologies will take even in the next year. Rather than
overengineer a nice, simple standard like the HTTP or URL standard, why not
accomplish the same thing within the context of the existing standard,
simply by building applications that cooperate within that context?

What is your argument against a CGI/viewer solution? What is the argument
FOR a generalized URL based solution? These are the questions that need to
be answered.

I've presented a valid set of problems with the current byte range
proposal. While you may have dismissed them, it doesn't diminish their
magnitude, nor do you provide a viable solution for those problems. Please
provide some constructive solutions instead of simply throwing rocks at a
workable alternative.

Chuck Shotton                               StarNine Technologies, Inc.
chuck@starnine.com                             http://www.starnine.com/
cshotton@biap.com                                  http://www.biap.com/
                 "Shut up and eat your vegetables!"
Received on Sunday, 12 November 1995 09:30:34 UTC
