Re: Comments on the HTTP/1.0 draft.

Chuck Shotton said:
>At 11:35 PM 12/1/94, Albert Lunde wrote:
>>However, I think this is too much of a burden to add to client
>>writers.

It might or might not be.  However, we do need to decide whether this
behavior should be described in an appendix on "how to be tolerant of
bad servers" or required as part of the spec proper.  If it stays
where it is now, a "non-bad server" needs to be defined more clearly.

(Note that this is also a burden on server authors, unless we require
that the "request" portion be in canonical form but allow the
"response" portion not to be; this would be sort of strange.)

>Not at all. Show me a single client that doesn't already do this. As Roy
>says, it is also an issue of scale. A client can more effectively do this
>translation once for a single user than a server that must do it thousands
>of times an hour for all users.

This assumes servers do the conversion on the fly.  While that is one
way they might do it, it is certainly not the only way.  A clever server
might store the document in canonical form after converting it once,
or cache frequently-requested documents after the conversion, or
whatever.  If it's important to do, it can be implemented efficiently;
the question is whether it's important.
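
For concreteness, here is a minimal sketch (in C; the function name
and the details are my own invention, nothing from the draft) of the
sort of one-time conversion I have in mind.  It assumes the stored
file uses bare LFs as line breaks and never contains a stray CR:

    #include <stdio.h>

    /* Copy a stored text file into canonical (CRLF) form, e.g. when
     * the document is first installed or added to a cache.  Assumes
     * bare LFs as line breaks; a real server would have to be more
     * careful about files that already contain CRs. */
    int canonicalize(FILE *in, FILE *out)
    {
        int c;
        while ((c = getc(in)) != EOF) {
            if (c == '\n')
                putc('\r', out);    /* emit CR before each LF */
            putc(c, out);
        }
        return (ferror(in) || ferror(out)) ? -1 : 0;
    }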

In any case, if you are sending text with something other than CRLFs
as line breaks, you are not sending text/plain; it's something else.
The definition of text/plain is very clear on this point.  I merely
believe that, whatever it is, it should be clearly labeled.

If we want to invent a way to label it, that's OK; Content-Encoding is
a good way, and since it's not part of MIME we can do whatever we like
with it.
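
To illustrate what I mean (the token here is purely hypothetical, not
a proposal; it just shows where the label would go):

    Content-Type: text/plain
    Content-Encoding: x-lf-lines

A client that doesn't recognize the encoding at least knows the body
is not in canonical form, rather than silently assuming CRLFs.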

That reminds me: the current draft officially codifies "x-compress"
and "x-gzip" as registered encodings.  Since the x-token convention is
normally used to indicate unregistered stuff, I think these should
change to just "compress" and "gzip" [with the old labels still
accepted for compatibility, since existing systems will continue to
use them.]

(Actually, I think the encodings should be "LZW" and "LZ77", referring
to the actual algorithms rather than the names commonly given to the
UNIX implementations of those algorithms, but that would cause more
confusion than it's worth.)
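
Either way, the backward compatibility is cheap to provide; an
implementation could simply treat the old x- labels as aliases for
the registered names.  A sketch (again, the function name is made up):

    #include <strings.h>    /* strcasecmp */

    /* Map the old x- tokens onto the proposed registered names so
     * existing clients and servers keep working.  Content-coding
     * tokens are compared case-insensitively here. */
    static const char *canonical_encoding(const char *token)
    {
        if (strcasecmp(token, "x-gzip") == 0)
            return "gzip";
        if (strcasecmp(token, "x-compress") == 0)
            return "compress";
        return token;
    }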

>Wouldn't be a very efficient standard when implemented. Every compiler I'm
>aware of supports multiple representations for EOL. Why shouldn't the
>parsers associated with HTTP and HTML be equally tolerant? HTML files are
>"source code" for the HTTP "compiler."

HTML and HTTP are orthogonal.  And the issue of transmitting objects
in canonical form is not just about text, though that's the most
prominent example.

>>It would be nice if HTTP and HTML standards agreed on the treatment
>>of line breaks in text/html....

Indeed it would be.  However, MIME-Version 1.0 requires that all
textual subtypes have line breaks represented as CRLFs, so the
decision is pretty easy unless we want to register it as
application/html.

>I agree... as long as it accommodates all the representations for EOL in
>current practice. The current attitudes towards this seem to be very
>Unix-centric and this is very wrong.

And requiring all clients in existence, regardless of what platform
they run on, to understand the UNIX conventions for line breaks in
text is not UNIX-centric?  Huh?

>It won't be long before we see HTTP
>servers that have NOTHING to do with a local file system and reside on top
>of a DBMS or some other non-traditional object store. I'm not aware of ANY
>commercial DBMS implementations that use LF as EOL. This diverges from the
>topic a bit, but I'm trying to make a point that it is NOT sufficient to
>accommodate only a portion of the platforms in use (e.g., Unix) in the
>standard as they will represent a decreasing proportion of Web platforms as
>the Web grows.

I absolutely agree; over time, more and more different platforms will
be used in widely varying ways.  But I don't think this supports your
position; quite the opposite.  This is why I shy away from codifying
into the standard (is this intended to be an Internet
standards-track protocol?) UNIX-centrisms or requirements that all
implementations understand the conventions used by every different
system there is.
--
Marc VanHeyningen  <URL:http://www.cs.indiana.edu/hyplan/mvanheyn.html>

Received on Sunday, 4 December 1994 09:35:38 UTC