Re: Comments on the HTTP/1.0 draft. from Chuck Shotton on 1994-11-30 (ietf-http-wg@w3.org from October to December 1994)

From: Chuck Shotton <cshotton@oac.hsc.uth.tmc.edu>
Date: Wed, 30 Nov 1994 09:51:13 -0600
To: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <ab024dff010210047e1e@[129.106.30.2]>

>Thus wrote: "Roy T. Fielding"
>>Marc VanHeyningen writes:
>>> Rather egregiously missing is a reference to transmitting network
>>> objects in canonical form.  Section 3.2 should mention this; a
>>> reference to the canonical encoding model in Appendix G of RFC 1521
>>> (specifically step 2) probably should suffice.  The only place this is
>>> hinted at is in the tolerance section of the appendices on tolerance
>>> of broken implementations, but the spec should explicitly say what the
>>> proper behavior is, just in case any servers ever actually do that. :-)
>>
>>The specified behavior will be "no canonical encoding of the object-body
>>is required before network transfer via HTTP, though gateways may need
>>to perform such canonical encoding before forwarding a message via a
>>different protocol.  However, servers may wish to perform such encoding
>>(i.e. to compensate for unusual document structures), and
>>may do so at their discretion."
>
>I must not be understanding what you're saying correctly.  Why is
>canonical encoding unnecessary?  Do you really mean that any server,
>on any architecture, can (for example) transmit text files using
>whatever its local system convention for line breaks might happen to
>be (CR, LF, CRLF, whatever) without standardizing it?  How can we be
>passing local forms around between different machines and expect it to
>work reliably?

Issues such as end of line interpretation have been a sore point between
HTTP clients and servers for a long time, because stdio on Unix only
accommodates LF line ends, Macs store text files with CR for EOL, and
Windows uses CRLF. It has taken over a year for the general community of
clients and servers to become tolerant of all the line end variations. An
explicit statement in the standard about tolerant EOL representations would
be good.

IMHO, it should state that CR, LF, and CRLF should all be treated as
equivalent line ends. This avoids any problems with machine-dependent EOL
symbols, and fairly represents current practice.
(It also avoids forcing clients and especially servers to do line-by-line
translations of EOL for all outgoing response information, which is a BIG
performance hit.)
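
As a rough illustration of the tolerant reading suggested above, a receiver
could accept all three line ends with something like the following. This is
only a sketch in Python, not taken from the draft or from any existing client
or server; the name split_lines is hypothetical.

```python
import re

# CRLF is listed first so that it counts as one line end, not two.
_EOL = re.compile(rb"\r\n|\r|\n")

def split_lines(data: bytes) -> list[bytes]:
    """Split data into lines, treating CR, LF, and CRLF equally as EOL."""
    lines = _EOL.split(data)
    # A trailing EOL terminates the last line; it does not start an empty one.
    if lines and lines[-1] == b"":
        lines.pop()
    return lines
```

The point of doing this on the receiving side is that the sender can ship the
file in its local form without a line-by-line translation pass.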

>Yes, I know that pretty much all existing servers run under UNIX...

This is a fallacy and should not in any way, shape, or form be allowed to
color the HTTP standard. There are FAR more CPUs hooked to the Internet
running Windows and Mac O/S than Unix, and there's a huge number of servers
on these platforms as well. (I know this wasn't the intent of your entire
statement, but I couldn't resist hopping on a stump for a minute.) As you
say, the standard needs clarification on the point of line ends and the de
facto Unix implementation is not sufficient.

--_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
Chuck Shotton                             \
Assistant Director, Academic Computing     \   "Shut up and eat your
U. of Texas Health Science Center Houston   \    vegetables!!!"
cshotton@oac.hsc.uth.tmc.edu  (713) 794-5650 \
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-\-_-_-_-_-_-_-_-_-_-_-_-_-

Received on Wednesday, 30 November 1994 07:53:28 UTC