Re: Comments on the HTTP/1.0 draft. from Chuck Shotton on 1994-12-05 (ietf-http-wg@w3.org from October to December 1994)

From: Chuck Shotton <cshotton@oac.hsc.uth.tmc.edu>
Date: Sun, 4 Dec 1994 20:22:48 -0600
To: Marc VanHeyningen <mvanheyn@cs.indiana.edu>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <ab08241304021004f7ad@[129.106.201.2]>

>>Not at all. Show me a single client that doesn't already do this. As Roy
>>says, it is also an issue of scale. A client can more effectively do this
>>translation once for a single user than a server that must do it thousands
>>of times an hour for all users.
>
>This is assuming servers do this on the fly.  While this is one way
>they might do this, it certainly is not the only way.  A clever server
>might store the document in canonical form after converting it once,
>or cache frequently-requested documents after the conversion, or
>whatever.  If it's important to do, it can be implemented efficiently;
>the question is whether it's important.

A clever server might do a lot of things that require a huge amount of
extra programming for a diminishing return. I think requiring a server to
do something as dorky as translate line ends is ABSURD when a single line
of code added to any client that doesn't already have it accomplishes the
same thing.
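
To make the "single line of code" concrete, here is a minimal sketch in
Python of what a tolerant client could do with text it receives. The
names to_local_eol and _EOL are made up for illustration, and this is
not taken from any existing client.

```python
import re

# Hypothetical pattern: try CRLF first so it counts as one line end,
# then fall back to a bare CR or a bare LF.
_EOL = re.compile(r"\r\n|\r|\n")

def to_local_eol(text, local_eol="\n"):
    """Rewrite every CR, CRLF, or LF line end to the client's convention."""
    # A function is used as the replacement so local_eol is inserted
    # literally, with no escape processing.
    return _EOL.sub(lambda match: local_eol, text)
```

A client on a Mac would pass local_eol="\r", one on DOS "\r\n". The
substitution itself is the one line; everything else is packaging.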

I am really having a difficult time understanding the side of this argument
that says "life is a linefeed." I don't think I am being unreasonable in
the points I am making, which advocate tolerance and flexibility. There
seems to be a contingent opposed to flexibility in HTTP. This is inherently
a Bad Thing (tm). There are a LOT of ways to send a request to an HTTP
server besides Mosaic or Netscape, and there are a LOT of applications that
might respond to an HTTP request besides NCSA and CERN httpd. Accommodating
a wide range of potential input only makes the HTTP protocol stronger and
of wider appeal.

Anal retentive standards tweaking that legislates absolute, strict
behaviors guarantees that a lot of software will break or otherwise be
incompatible. Defining a standard which accommodates standard practice and
encourages support for the wide variety of clients, servers, and other
types of applications that may participate in the WWW is going to mean a
standard that lasts a lot longer.

>In any case, if you are sending text with something other than CRLFs
>as line breaks, you are not sending text/plain; it's something else.
>The definition of text/plain is very clear on this point.  I merely
>believe that, whatever it is, it should be clearly labeled.

I am not even talking about text/plain. I am discussing the majority of
text returned by HTTP servers, namely text/html. And, I think you will find
that text/plain has flexibility in its interpretation as well and is not as
absolute as you would like it.

>>Wouldn't be a very efficient standard when implemented. Every compiler I'm
>>aware of supports multiple representations for EOL. Why shouldn't the
>>parsers associated with HTTP and HTML be equally tolerant? HTML files are
>>"source code" for the HTTP "compiler."
>
>HTML and HTTP are orthogonal.  And the issue of transmitting objects
>in canonical form is not just about text, though that's the most
>prominent example.

What do you mean, "orthogonal"? HTML is a specific syntax which is a subset
of the basic SGML standard. HTTP is a transport protocol for a myriad of
different data types used for requests and responses between clients and
servers implementing the standard. You can easily argue that the two are
completely unrelated. I'm sure that companies like Adobe and Frame couldn't
care less about HTML but are avid supporters of HTTP. I suspect you'll find
HTTP clients and servers long after HTML has fallen by the wayside.
However, this entire discussion has nothing to do with reviewing the draft
HTTP 1.0 standards document.

>>>It would be nice if HTTP and HTML standards agreed on the treatment
>>>of line breaks in text/html....
>
>Indeed it would be.  However, MIME-Version 1.0 requires that all
>textual subtypes have line breaks represented as CRLFs, so the
>decision is pretty easy unless we want to register it as
>application/html.

So, if we adopt the most restrictive interpretation of the MIME standard
for text (which no HTTP clients or servers currently do) only the Windows
servers and clients will have the remotest chance of reasonable
performance. Is this what you want? If so, WHY? WHY does it matter whether
multiple EOL tokens are supported? If it makes clients and servers more
robust to support all possible variations, and it's easier to implement and
provides higher performance, how can you argue that a standard that would
require universal modification to all clients and servers and result in a
corresponding performance decrease is actually better?

>>I agree... as long as it accommodates all the representations for EOL in
>>current practice. The current attitudes towards this seem to be very
>>Unix-centric and this is very wrong.
>
>And requiring all clients in existence, regardless of what platform
>they run on, to understand the UNIX conventions for line breaks in
>text is not UNIX-centric?  Huh?

That's not what I'm advocating at all. Perhaps you need to read over this
thread again. I'm saying that any EOL token, CR, CRLF, or LF, should be
allowed. Only one of these accommodates Unix. One accommodates Macs, and one
accommodates Windows/DOS. (Who can say what a standard text file is on a Vax
or IBM mainframe?)
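
As a rough illustration of accepting all three tokens, here is a sketch
of a line splitter over raw bytes. The function name split_lines is
hypothetical, and the code only shows the idea: a CR followed by an LF
is consumed as one line end, while a lone CR or a lone LF each end a
line on their own.

```python
def split_lines(data: bytes):
    """Split data into lines, accepting CR, CRLF, or LF as the line end."""
    lines = []
    start = 0          # index where the current line begins
    i = 0
    n = len(data)
    while i < n:
        b = data[i]
        if b == 0x0D or b == 0x0A:          # CR or LF ends the line
            lines.append(data[start:i])
            # A CR immediately followed by LF is a single CRLF token.
            if b == 0x0D and i + 1 < n and data[i + 1] == 0x0A:
                i += 1
            i += 1
            start = i
        else:
            i += 1
    if start < n:                            # trailing text with no EOL
        lines.append(data[start:])
    return lines
```

The same loop handles a Unix, Mac, or DOS file without knowing in
advance which one it has been given.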

>>It won't be long before we see HTTP
>>servers that have NOTHING to do with a local file system and reside on top
>>of a DBMS or some other non-traditional object store. I'm not aware of ANY
>>commercial DBMS implementations that use LF as EOL. This diverges from the
>>topic a bit, but I'm trying to make a point that it is NOT sufficient to
>>accommodate only a portion of the platforms in use (e.g., Unix) in the
>>standard as they will represent a decreasing proportion of Web platforms as
>>the Web grows.
>
>I absolutely agree; over time, more and more different platforms will
>be used in widely varying ways.  But I don't think this supports your
>position; quite the opposite.  This is why I shy away from codifying
>into the standard (is this intended to be an Internet
>standards-track protocol?) UNIX-centrisms or requirements that all
>implementations understand the conventions used by every different
>system there is.

Marc, I don't know your background, and I am truly NOT throwing rocks. But
have you ever implemented a Web client or server? Are you aware of the
performance degradation that will be incurred by forcing all clients and
all servers to parse every byte of text that passes through their buffers?
"Pre-compiling" documents to a standard format is an unacceptable and
unnecessary burden to place on users and will cause unending errors
every time a user forgets. Forcing servers to perform these text
transformations on the fly is an unnecessary burden on CPU resources and,
contrary to what you may think, it has a non-trivial impact. Forcing clients
to change is equally inconvenient. Things work fine now.
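
For a sense of what on-the-fly conversion involves, here is a sketch of
a server-side filter that rewrites every line end to CRLF as buffers
pass through. The name canonicalize_chunks is hypothetical and this is
not code from any real server. Note that it has to look at every byte
and carry state across buffer boundaries, because a CRLF pair can be
split between two reads.

```python
def canonicalize_chunks(chunks):
    """Yield each chunk with CR, CRLF, or LF line ends rewritten as CRLF."""
    pending_cr = False  # True when the last byte seen was a CR
    for chunk in chunks:
        out = bytearray()
        for b in chunk:
            if pending_cr:
                pending_cr = False
                if b == 0x0A:
                    # This LF completes a CRLF whose output was already
                    # written when the CR was seen, so emit nothing.
                    continue
            if b == 0x0D:
                out += b"\r\n"
                pending_cr = True
            elif b == 0x0A:
                out += b"\r\n"
            else:
                out.append(b)
        yield bytes(out)
```

Every response body has to go through a loop like this, and the output
can be longer than the input, so any length computed from the stored
file no longer matches what is sent.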

All I am advocating is that the status quo be documented as part of the
standard. I am under the impression that this is the goal of the HTTP 1.0
standard anyway. You are advocating something that is NOT current practice,
would require rework of ALL HTTP code in all clients and all servers, and
would result in decreased performance, all in the name of "standards
purity." I ask you to consider your position again in light of this.

-----------------------------------------------------------------------
Chuck Shotton
cshotton@oac.hsc.uth.tmc.edu                           "I am NOT here."
Received on Sunday, 4 December 1994 18:21:59 UTC