Re: Comments on the HTTP/1.0 draft. from Albert Lunde on 1994-12-07 (ietf-http-wg@w3.org from October to December 1994)

From: Albert Lunde <Albert-Lunde@nwu.edu>
Date: Wed, 7 Dec 1994 15:22:54 -0600 (CST)
To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <199412072122.AA028875374@casbah.acns.nwu.edu>
>   It's not clear that you're discussing the same thing as everyone else. No
> one disputes the need for the request and response headers to use standard
> EOL encoding. That's a given and everyone understands that this is the
> case. This is a no-brainer because all the headers are generated by clients
> and servers and can always be generated correctly.
> 
> This discussion is about object-bodies ONLY. Frankly, your continued
> arguing for cannonicalization in this area is contrary to A) current
> practice, B) common sense, and C) any perceived need on anyone elses' part.
> I have yet to hear a factual, supported reason why this must be done or
> else HTTP will fail. It isn't done now, everything works great, and nobody
> is forced to waste a bunch of CPU cycles to munge text files to keep a few
> standards junkies happy.
> 
> I apologize if my tone here is unprofessional, but continuing to throw up
> obstacles with respect to the subject of tolerant interpretation of line
> ends without citing any rationale other than "cannonicalization is a Good
> Thing" is wearing thin. Either cite some evidence as to why it's a good
> thing or leave it be. (and references to other protocols are of limited
> value, because most are batch oriented vs. interactive and none except
> gopher experience anywhere near the load of HTTP in terms of transactions
> per hour. Performance is THE major issue here, not squeaky clean, overly
> restrictive standards definitions.)

I think there are two issues:

What we should advocate, and how we should explain it.

The old spec, i.e.
<http://info.cern.ch/hypertext/WWW/Protocols/HTTP/HTTP2.html>

references the RFC822 and MIME specs rather briefly, saying
(in part):

Under The request:
"The request is sent with a first line containing the method to be 
applied to the object requested, the identifier of the object, 
and the protocol version in use, followed by further information 
encoded in the RFC822 header style."

Under Response Data it says:
"Additional information may follow, in the format of a MIME message body. 
The significance of the data depends on the status code."

Under Object Contents it says:
"The data (if any) sent with an HTTP request or reply is in a format 
and encoding defined by the object header fields, the default 
being "plain/text" type with "8bit" encoding. Note that while all
the other information in the request (just as in the reply) 
is in ISO Latin1 with lines delimited by Carriage Return/Line 
Feed pairs, the data may contain 8-bit binary data."

Under Client tolerance of bad HTTP servers it says:

"Clients should be tolerant in parsing response status lines, in particular
they should accept any sequence of white space (SP and TAB) characters
between fields.

Lines should be regarded as terminated by the Line Feed, and the
preceeding Carriage Return character ignored."
 
My reading of this is that the old spec seemed to say that
headers and text object bodies should be sent with CR/LF
as end of line (just like MIME/RFC 822), but that clients
should tolerate other stuff from "bad" servers.
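
To make the tolerance rule quoted above concrete, here is a rough
sketch of how a client might apply it: split on Line Feed, ignore a
preceding Carriage Return, and accept any run of SP and TAB between
the fields of the status line. The function names are my own
invention for illustration, not anything taken from the spec or from
an existing client.

```python
import re


def split_lines_tolerantly(data: bytes) -> list[bytes]:
    """Hypothetical helper: treat LF as the line terminator and
    ignore a single CR immediately before it."""
    lines = data.split(b"\n")
    # A trailing LF leaves an empty final element; drop it.
    if lines and lines[-1] == b"":
        lines.pop()
    return [line[:-1] if line.endswith(b"\r") else line for line in lines]


def parse_status_line(line: bytes) -> tuple[bytes, int, bytes]:
    """Hypothetical helper: split a status line into
    (version, code, reason), accepting any run of SP/TAB between fields."""
    parts = re.split(rb"[ \t]+", line.strip(b" \t"), maxsplit=2)
    version, code = parts[0], int(parts[1])
    # The reason phrase is optional in this sketch.
    reason = parts[2] if len(parts) > 2 else b""
    return version, code, reason
```

With this, a response that uses bare LF on some lines and CR LF on
others yields the same list of lines, and a status line padded with
tabs parses the same as one with single spaces.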

Now, it seems that what we are saying is that current practice
(not just "bad" servers) is to treat EOL differently in
the object body for performance reasons.

In this, and in other ways, we are not just quoting the MIME
spec, we are sort of rewriting it.

We need to be clear which parts of MIME we are reusing and
which parts we are rewriting. (Are we redefining EOL
treatment for object bodies of type text/*? 

We might do as well to redefine EOL for the whole message,
remapping CR, LF, or CRLF to EOL before doing any MIME header/body
processing. It might make a spec easier to write.)
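
As a rough sketch of that whole-message remapping, assuming a message
whose body is text: every CR LF, bare CR, or bare LF is mapped to one
EOL first, and only then is the message split into headers and body.
The names below are hypothetical, and applying this to a body holding
8-bit binary data would corrupt it, so a real rule would have to be
limited to text types.

```python
import re

# CR LF must be tried first so that it is replaced as a single EOL.
_ANY_EOL = re.compile(rb"\r\n|\r|\n")


def normalize_eol(message: bytes, eol: bytes = b"\n") -> bytes:
    """Hypothetical helper: remap CR LF, bare CR, and bare LF to one EOL."""
    return _ANY_EOL.sub(eol, message)


def split_message(message: bytes) -> tuple[list[bytes], bytes]:
    """Hypothetical helper: normalize EOLs across the whole message,
    then split headers from body at the first empty line."""
    normalized = normalize_eol(message)
    head, _sep, body = normalized.partition(b"\n\n")
    return head.split(b"\n"), body
```

After this step the header and body parsing only ever sees one kind
of line end, whichever convention the sender used.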

There is also the need to make this agree with the HTML spec,
which in the current draft is focused on defining HTML as
an SGML application, but talks as though it were referring to
the MIME network representation.

-- 
    Albert Lunde                      Albert-Lunde@nwu.edu
Received on Wednesday, 7 December 1994 13:24:32 UTC