- From: Chuck Shotton <cshotton@oac.hsc.uth.tmc.edu>
- Date: Wed, 30 Nov 1994 09:51:13 -0600
- To: Marc VanHeyningen <mvanheyn@cs.indiana.edu>
- Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
>Thus wrote: "Roy T. Fielding"
>>Marc VanHeyningen writes:
>>> Rather egregiously missing is a reference to transmitting network
>>> objects in canonical form. Section 3.2 should mention this; a
>>> reference to the canonical encoding model in Appendix G of RFC 1521
>>> (specifically step 2) probably should suffice. The only place this is
>>> hinted at is in the tolerance section of the appendices on tolerance
>>> of broken implementations, but the spec should explicitly say what the
>>> proper behavior is, just in case any servers every actually do that. :-)
>>
>>The specified behavior will be "no canonical encoding of the object-body
>>is required before network transfer via HTTP, though gateways may need
>>to perform such canonical encoding before forwarding a message via a
>>different protocol. However, servers may wish to perform such encoding
>>(i.e. to compensate for unusual document structures), and
>>may do so at their discretion."
>
>I must not be understanding what you're saying correctly. Why is
>canonical encoding unnecessary? Do you really mean that any server,
>on any architecture, can (for example) transmit text files using
>whatever its local system convention for line breaks might happen to
>be (CR, LF, CRLF, whatever) without standardizing it? How can we be
>passing local forms around between different machines and expect it to
>work reliably?

End-of-line interpretation has been a sore point between HTTP clients
and servers for a long time, because stdio on Unix only accommodates LF
line ends, Macs store text files with CR for EOL, and Windows uses
CR/LF. It has taken over a year for the general community of clients
and servers to become tolerant of all the line end variations.

An explicit statement in the standard about tolerant EOL
representations would be good. IMHO, it should state that CR, LF, and
CRLF should all be interpreted equally as EOL when used as line ends.
This avoids any problems with machine-dependent EOL symbols, and fairly
represents the current practice. (It also avoids forcing clients and
especially servers to do line-by-line translations of EOL for all
outgoing response information, which is a BIG performance hit.)

>Yes, I know that pretty much all existing servers run under UNIX...

This is a fallacy and should not in any way, shape, or form be allowed
to color the HTTP standard. There are FAR more CPUs hooked to the
Internet running Windows and Mac O/S than Unix, and there's a huge
number of servers on these platforms as well. (I know this wasn't the
intent of your entire statement, but I couldn't resist hopping on a
stump for a minute.) As you say, the standard needs clarification on
the point of line ends, and the de facto Unix implementation is not
sufficient.

--_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_\_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-
Chuck Shotton                             \
Assistant Director, Academic Computing     \  "Shut up and eat your
U. of Texas Health Science Center Houston   \  vegetables!!!"
cshotton@oac.hsc.uth.tmc.edu (713) 794-5650  \
_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-_-\-_-_-_-_-_-_-_-_-_-_-_-_-
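
For illustration only, here is a minimal sketch of the tolerant reading proposed in the message above, where a receiver treats CR, LF, and CRLF as the same line end. It is not taken from any HTTP draft or existing implementation; the function name split_lines_tolerant is hypothetical, and the sketch assumes the text has already been decoded into a string.

```python
import re

# Hypothetical helper, not part of any HTTP specification.
# CRLF is listed first so it is consumed as one line end, not two.
_EOL = re.compile(r"\r\n|\r|\n")


def split_lines_tolerant(text: str) -> list[str]:
    """Split text into lines, treating CR, LF, and CRLF equally as EOL."""
    lines = _EOL.split(text)
    # A final line terminator leaves one empty trailing piece; drop it.
    if lines and lines[-1] == "":
        lines.pop()
    return lines


if __name__ == "__main__":
    mixed = "unix line\nmac line\rwindows line\r\nlast line"
    for line in split_lines_tolerant(mixed):
        print(repr(line))
```

Because the receiver does the normalizing, the sender can transmit its local form unchanged, which is the performance point made above about avoiding line-by-line translation of outgoing data.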
Received on Wednesday, 30 November 1994 07:53:28 UTC