Message-Id: <9212111047.AA29250.guido@voorn.cwi.nl>
To: Dan Connolly <connolly@pixel.convex.com>
Cc: www-talk@nxoc01.cern.ch
Subject: Re: Gopher+ Considered Harmful
In-Reply-To: Your message of "Thu, 10 Dec 1992 12:05:02 MET." <9212101805.AA05022@pixel.convex.com>
From: Guido.van.Rossum@cwi.nl
Date: Fri, 11 Dec 1992 11:47:53 +0100

I wrote:

>>As I see it, there are two possible ways of using MIME in HTTP+. We
>>can either support MIME as the *only* data format (implementing any
>>extensions we need as new MIME content types or subtypes or additional
>>headers), or we support MIME as one of the possible data formats.

Dan's reply:

>A terminology note here: there is no one "MIME data format." There's
>the ubiquitous message/rfc822 format that you can stick anything
>inside using MIME techniques. But the basic unit of information
>in the MIME spec is an _entity_ -- just an arbitrary stream of bytes.

OK, when I said MIME data format I meant MIME message format, and was
referring to the outer level only (and note that MIME *implies*
RFC822). I certainly did not refer to a particular content-type, not
even to message/rfc822. The only thing that isn't well-specified when
one talks about "a file in MIME format" is whether line breaks are
given as CRLF or as LF (or as something else).

>The question is, when you're sending an entity from one
>place to another, how do you know where the end is?

This is a matter for the transport agent, not for MIME -- by the time
you call in the MIME agent to handle the data you must *already* know
where the end is. For entities contained in other entities (e.g. the
content-type family multipart/*) there is a way defined in MIME to
find the end of the inner entities, but this is not true for the
outermost entity.

>From the MIME point of view, an NNTP client and server have
>an implicit agreement that the entity going across the
>wire has a content-transfer-encoding of 7bit.
>
>This allows them to use the dot-on-a-line-by-itself technique to
>terminate the entity.

MIME and NNTP should never need to talk to each other. MIME is a
UA-level format, NNTP is a message transfer agent protocol. NNTP can
use the dot-on-a-line-by-itself convention not because it is a 7-bit
protocol (which it isn't -- although other message transfer protocols
like SMTP are) but because it is a line-based protocol. MIME is also
mostly a line-based format, even if the content-transfer-encoding is
8bit -- it is only in binary mode that we get into trouble (since
conversion from one kind of line terminator to another is dangerous
for binary data).

>They also share assumptions about the content-type as
>a separate issue. The client assumes the response to an
>ARTICLE command is a message/rfc822 entity, while the
>response to a BODY command is text/plain.

That's a nice way of putting it.

>[Long description of why you want to put the byte count in the MIME
>headers omitted]
>
>It is somewhat intertwingled, but I still kinda like it.

And I still don't. I have the feeling that it would be much easier to
adapt HTTP to other (non-TCP) transport protocols if the size of an
entity is given separately rather than computed from the entity itself
(after all, this nonsense is only necessary because TCP doesn't have a
way to distinguish EOF from a broken connection).

As I understand it, your main objection is that under my proposal you
will have to construct the necessary headers in a buffer first. I
don't believe that this is that much of a hassle on today's computers
-- it shouldn't be more than a couple of kilobytes even in extreme
cases, which is peanuts even for a standard PC.

An issue on which I don't have a strong opinion is whether we should
represent line separators as CRLF in the header -- anyone else?

Cheers,

--Guido van Rossum, CWI, Amsterdam <guido@cwi.nl>
"The lawnmower. Surely such a gadget could not have been generated
independently in two separate areas."
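
The two ways of finding the end of an entity discussed above (a dot on
a line by itself, versus a size given separately from the entity) can
be sketched roughly as follows. This is only an illustration under
stated assumptions: the function names are hypothetical, io.BytesIO
stands in for a network connection, and the "decimal size on a line of
its own" header is an invented framing, not something HTTP or MIME
specifies.

```python
import io


def write_dot_terminated(out, lines):
    """Line-based framing: one line at a time, then a lone '.' line."""
    for line in lines:
        if line.startswith(b"."):
            line = b"." + line  # dot-stuffing, so data never looks like the end
        out.write(line + b"\r\n")
    out.write(b".\r\n")


def read_dot_terminated(inp):
    """Read lines until the lone '.' terminator; undo dot-stuffing."""
    lines = []
    while True:
        raw = inp.readline()
        if not raw:
            raise EOFError("connection closed before terminator")
        line = raw.rstrip(b"\r\n")
        if line == b".":
            return lines
        if line.startswith(b"."):
            line = line[1:]  # remove the dot added by the sender
        lines.append(line)


def write_counted(out, payload):
    """Size given separately (assumed format: decimal size, CRLF, raw bytes)."""
    out.write(b"%d\r\n" % len(payload))
    out.write(payload)


def read_counted(inp):
    """Read exactly the announced number of bytes; safe for binary data."""
    header = inp.readline()
    if not header:
        raise EOFError("connection closed before size header")
    size = int(header.strip())
    data = inp.read(size)
    if len(data) != size:
        # A short read is how a broken connection shows up here.
        raise EOFError("connection closed before end of entity")
    return data
```

With the counted form the receiver never looks inside the payload, so
line terminators in binary data are left alone, and a truncated
transfer is detected rather than mistaken for the end of the entity.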