Re: HTTP: T-T-T-Talking about MIME Generation from Roy T. Fielding on 1994-12-19 (ietf-http-wg@w3.org from October to December 1994)

From: Roy T. Fielding <fielding@avron.ICS.UCI.EDU>
Date: Mon, 19 Dec 1994 13:57:21 -0800
To: Marc Salomon <marc@library.ucsf.edu>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9412191357.aa11268@paris.ics.uci.edu>
Marc Salomon writes:

> Indeed, the restrictions on the use of Message-ID: in HTTP/1.0 defeat caching
> WRT MIME.

There are no restrictions on Message-ID in HTTP/1.0 which do not already
exist in the MIME RFC 1521.  I quote [page 19]:

    "Like the Message-ID values, Content-ID values must be generated
     to be world-unique."

The MIME BNF references "msg-id" of RFC 822, which has the EXACT same
syntax as that defined for HTTP/1.0.  A longer explanation of Message-ID
can be found in the USENET spec (RFC 1036).

> [...]
> |
> |It's much easier to just keep the connection open (with possibly a very
> |short timeout as defined by the server) and respond with multiple responses
> |(each one of which may be a multipart entity).
> 
> But TCP is much less efficient if you are sending bunch of small objects--you
> never get out of first gear as it were.  Wasn't it argued that TCP doesn't
> get efficient till it gets up to speed on a data transfer?

No, the slow-start effect only applies to connection setup and immediately
thereafter -- once the window has been determined, it applies for the rest
of the connection.

> Would having to satisfy 
> a set of time-spaced serial requests allow the connection to ever get up to 
> speed?

Yes, though we intend to test that just to be sure.

> |> Message-ID: <http://host.domain/http_mime.html>
> |
> |This is an invalid use of Message-ID -- it must be unique.
> |
> |> Content-ID: <http://host.domain/path/http_mime.html>
> |
> |This is an invalid use of Content-ID -- it must be unique.
> 
> Stupid me.  I had assumed that the draft HTTP spec-00 talked about current 
> practice, but it deals with protocol-as-documented but unimplemented as well.

It documents preferred practice, as in "what the programmers would have
implemented had they checked the existing Internet specifications before
hacking away with abandon."  More importantly, it is intended to reflect
the combined wisdom of "what should be done to implement a good HTTP
application" as defined by the history of www-talk and the participants
in this list, while at the same time not breaking any implementation that
is not already known to be broken.

> The specifications in HTTP/1.0 for both the format and use of the Content-ID: 
> are more restrictive than that of RFC 1521.  The MIME spec says that the
> primary purpose of this field is to assist caching and offers no format
> template, while HTTP/1.0 dedicates this field to a transaction identifier and 
> requires a strict format template.

RTFM.  It requires <string@domain>, as does RFC 822, RFC 1521, and RFC 1036.

> I could make an argument that a URL uniquely identifies the content of a
> body-part in that a URL cannot identify but one object at one time.  The main
> reason for the MIME optional Content-ID field is to allow for caching, which is
> facilitated by the use of a URL (in conjunction with the Date: header) in this 
> field.  The URL would be valid and cachable from Date: till Expires: and is an
> excellent candidate for a RFC 1521 style Content-ID.  A contradictory use for 
> the Content-ID field is specified in HTTP/1.0-draft-00, although not currently 
> used in any implementation.

Bull puckies.  Again, RTFM, and this time look at the BNF (the actual
specification) and not just the verbage.

> The Content-Description header field as specified in RFC 1521 would probably
> be an appropriate place for this, although I do not see a Message-Description 
> field there.  One would be tempted to look to the URI header to indicate the 
> URI associated with a body part, but from what I can tell, it is only used to 
> redirect a request and its use here seems inappropriate.

Crikey! It explicitly states in the last sentence on Section 4.2.1 of
draft 00 (on Multipart Types):  "A URI-header field (Section 7.9) should
be included in the body-part for each enclosed object which can be identified
by a URI."  I can't get any more explicit than that!

> |And what do you do about Content-Encoding?  It's not valid MIME at all.
> 
> and later in the weekend Roy writes:
> 
> |Until the MIME people add Content-Encoding to the official RFCs, there is no 
> |point in even discussing the issue here.  All that we can do is show how they 
> |differ and possibly how an HTTP -> SMTP/NNTP gateway should behave.
> 
> HTTP currently uses:
> 
> Content-Type: application/postscript
> Content-Encoding: x-gzip
> 
> To express a two-level encoding.  Two level encodings are not included in
> MIME because of concerns about machines that are unable to easily perform
> the (de)compression.  Since only machines that are able to (de)compress (should)
> present this information during negotiation, it does not seem problem for the
> web.  In short, current practice of HTTP conflicts with the limitations of 
> MIME, so we could validate the out that the MH people took:
> 
> <comp.mail.mime-FAQ>
> Miscellaneous parameters:
> 
> x-conversions=compress  MH 6.8.x viamail: used with application/octet-stream;
>                           type=tar.  See also tar(1) and compress(1).
> </comp.mail.mime-FAQ>
> 
> The solution for a multipart MIME compliant expression of Content-Encoding 
> would be to use something like one the following:
> 
> Content-Type:  application/postscript
> Content-Transfer-Encoding: binary; x-conversions=gzip

Ummmm, I think that would have to be

  Content-Type:  application/postscript; x-conversions=gzip
  Content-Transfer-Encoding: binary

since CTE doesn't allow parameters. Anybody know if this appears in one
of the MIME-extension drafts currently up for review?  It would be hard
for me to reference a FAQ as the "official" workaround.

It's a pity we can't turn back the clock and redefine Content-Encoding.

> |> Two consecutive boundary strings indicate an EOF.
> |
> |What?  See RFC 1521.  Or, for that matter, read the HTTP/1.0 draft -- it
> |already defines EObody-part and EOentity for Multipart entities.
> 
> It had a hard time jumping out at me while preparing this document, but you are
> again correct.  EOF is indicated by --boundary--.
> 
> % grep -i EObody-part draft-fielding-http-spec-00.txt rfc1521.txt
> % grep -i EOentity draft-fielding-http-spec-00.txt rfc1521.txt
> 
> What are EObody-part and EOentity?

Sorry, that was my short-ref for End-of-body-part and End-of-entity,
where body-part and entity are defined in rfc 1521.

> Since the draft HTTP 1.0 spec-00 is supposed to document current practice in
> the same manner as the HTML 2.0 effort, why is multipart/mixed documented,
> which is most certainly *not* current practice in any implementation I know of
> (is it?) although it is described in BASIC HTTP?

The HTTP/1.0 spec IS NOT supposed to document current practice "in the
same manner as the HTML 2.0 effort."  HTTP has always been capable of
transporting multipart types, its just that the most popular servers and
clients have not done so.

......Roy Fielding   ICS Grad Student, University of California, Irvine  USA
                                     <fielding@ics.uci.edu>
                     <URL:http://www.ics.uci.edu/dir/grad/Software/fielding>
Received on Monday, 19 December 1994 14:04:14 UTC