- From: Jim Gettys <jg@pa.dec.com>
- Date: Thu, 15 Jan 1998 12:57:56 -0800
- To: Jacob Palme <jpalme@dsv.su.se>
- Cc: Nick Shelness <shelness@lotus.com>, Jim Gettys <jg@pa.dec.com>, IETF working group on HTML in e-mail <mhtml@segate.sunet.se>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
> From: Jacob Palme <jpalme@dsv.su.se>
> Date: Thu, 15 Jan 1998 20:55:42 +0100
> To: Nick Shelness <shelness@lotus.com>, jg@pa.dec.com (Jim Gettys)
> Cc: IETF working group on HTML in e-mail <mhtml@SEGATE.SUNET.SE>,
>     http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
> Subject: Re: Multiple Content-Location headers
>
> At 17.21 +0000 98-01-15, Nick_Shelness@motorcity2.lotus.com wrote:
> > Could I suggest that to break this impasse, that MHTML switches to a new
> > header field Content-Label to replace its use of Content-Location. This
> > would better capture the MHTML role of the header field, and would also
> > allow the simplifications I argued for last week on the MHTML list to
> > proceed. I.e., Content-Label could only specify an absolute URI, and would
> > not establish a base.
>
> I am not very happy with changing an existing and already implemented
> IETF proposed standard in such a radical way. But maybe it is necessary.
> Let us examine the differences between how MHTML and HTTP uses
> Content-Location to see if they really need to be split into two
> different header fields.
>
> HTTP 1.1 spec says                   MHTML spec says (I have removed
>                                      the controversial text allowing
>                                      multiple Content-Location headers,
>                                      since we all agree to remove
>                                      this.)
>
> In HTTP, multipart body-parts MAY    A Content-Location header
> contain header fields which are      specifies an URI that labels the
> significant to the meaning of that   content of a body part in whose
> part. A Content-Location header      heading it is placed. Its value
> field SHOULD be included in the      CAN be an absolute or a relative
> body-part of each enclosed entity    URI.
> that can be identified by a URL.
>                                      A Content-Location header field is
>                                      allowed in any message or content
>                                      heading, in addition to one
>                                      Content-ID header (as specified in
>                                      [MIME1]) and, in Message headings,
>                                      one Message-ID (as specified in
>                                      [RFC822])
>
> The Content-Location entity-header   An URI in a Content-Location
> field MAY be used to supply the      header need not refer to an
> resource location for the entity     resource which is globally
> enclosed in the message when that    available for retrieval using this
> entity is accessible from a          URI (after resolution of relative
> location separate from the           URIs). However, URI-s in
> requested resource's URI.            Content-Location headers (if
>                                      absolute, or resolvable to
>                                      absolute URIs) SHOULD still be
>                                      globally unique.
>
> A cache cannot assume that an        When processing (rendering) a
> entity with a Content-Location       text/html body part in an MHTML
> different from the URI used to       multipart/related structure, all
> retrieve it can be used to respond   URIs in that text/html body part
> to later requests on that            which reference subsidiary
> Content-Location URI. However, the   resources within the same
> Content-Location can be used to      multipart/related structure SHALL
> differentiate between multiple       be satisfied by those resources
> entities retrieved from a single     and not by resources from any
> requested resource, as described     another local or remote source.
> in section Caching Negotiated
> Responses.                           Therefore, If a sender wishes a
>                                      recipient to always retrieve an
> ...                                  URI referenced resource from its
>                                      source, an URI labeled copy of
> If a single server supports          that resource MUST NOT be included
> multiple organizations that do not   in the same multipart/related
> trust one another, then it must      structure.
> check the values of Location and
> Content-Location headers in          In addition, since the source of a
> responses that are generated under   resource received in
> control of said organizations to     multipart/related structure can be
> make sure that they do not attempt   misrepresented (see 12.1 above),
> to invalidate resources over which   if a resource received in
> they have no authority.              multipart/related structure is
>                                      stored in a cache, it MUST NOT be
>                                      retrieved from that cache other
>                                      than by a reference contained in a
>                                      body part of the same
>                                      multipart/related structure.
>                                      Failure to honor this directive
>                                      will allow a multipart/related
>                                      structure to be employed as a
>                                      Trojan Horse. For example, to
>                                      inject bogus resources (i.e. a
>                                      misrepresentation of a
>                                      competitor's Web site) into a
>                                      recipient's generally accessible
>                                      Web cache.
>
> My feeling is that the use of Content-Location as defined in the HTTP
> and MHTML spec is not so different as to require us to use different
> headers. But could the HTTP people please examine the quotes above
> and check what you feel about this.
>

The problem we have is syntax and implementation, not semantics. Let's
clear this hurdle before we get into the meat of what you are trying to
achieve, and whether your suggestion fits into the architecture of the
Web; my apologies for jumping into the meat in some of my early
messages on this topic.

Roy Fielding's point is that the syntax change required to allow the
header name Content-Location to have multiple fields (needed, as that
is what proxies typically do if they find multiple headers of the same
name) is a problem, and one that may (likely) break existing
implementations.

It is also possible/likely this would break existing applications of
HTTP, particularly clients and proxies. To include the URI in a comma
separated list would require quoting of the URIs, as Roy points out;
parsers may not be coded correctly to deal with this.
It is quite likely that existing implementations will get the wrong
answer, or even die, if one attempts to have multiple Content-Location
headers, or will not understand the quoting that this would require.
And then there are the proxy issues....

To quote from section 4.2 of the HTTP spec:

"Multiple message-header fields with the same field-name may be present
in a message if and only if the entire field-value for that header
field is defined as a comma-separated list [i.e., #(values)]. It MUST
be possible to combine the multiple header fields into one "field-name:
field-value" pair, without changing the semantics of the message, by
appending each subsequent field-value to the first, each separated by a
comma. The order in which header fields with the same field-name are
received is therefore significant to the interpretation of the combined
field value, and thus a proxy MUST NOT change the order of these field
values when a message is forwarded."

These are the cruxes of the problem.

So we're trying to follow the doctor's maxim "first, do no harm". We
aren't worrying (yet) about the semantic issues that may or may not
exist between how Content-Location is defined in the two different
specs, but pointing out that allowing multiple Content-Location headers
is an incompatible change which may break implementations, and we have
no data which shows this change is harmless. So until it is shown to be
harmless, we must presume harm. IETF process attempts to avoid
regression; we're worried that existing, deployed software would stop
working, possibly in significant ways.

So, please, as in my previous message, either present data that it
doesn't break implementations, or don't argue about the name. I think
that will let us all make faster progress; otherwise we're going to
continue to bog down.

I hope this clarifies where the difficulty lies.
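
A small illustration of the comma-separated-list concern, as a rough Python sketch rather than anything taken from a real proxy or parser: the function names and the example.com URIs are hypothetical. It folds two Content-Location values into one field the way section 4.2 permits, then splits the result on commas the way a simple list parser might. Because a comma is a legal character inside a URI, the unquoted values do not survive the round trip.

```python
def combine_fields(values):
    """Fold repeated header fields into one comma-separated field value."""
    return ", ".join(values)


def naive_split(combined):
    """Split a combined field value on commas, ignoring any quoting."""
    return [part.strip() for part in combined.split(",")]


# Hypothetical URIs; the second one contains a comma in its query part.
locations = [
    "http://example.com/a.html",
    "http://example.com/map?x=1,2",
]

combined = combine_fields(locations)
recovered = naive_split(combined)

print(combined)
print(recovered)
print(recovered == locations)  # False: the second URI was cut in two
```

Quoting each URI before combining would avoid the ambiguity, but only for parsers that already understand the quoting, which is exactly the compatibility question raised above.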
- Jim Gettys

--
Jim Gettys
Industry Standards and Consortia
Digital Equipment Corporation
Visiting Scientist, World Wide Web Consortium, M.I.T.
http://www.w3.org/People/Gettys/
jg@w3.org, jg@pa.dec.com
Received on Thursday, 15 January 1998 13:01:45 UTC