Re: Multiple Content-Location headers from Einar Stefferud on 1998-01-16 (ietf-http-wg@w3.org from January to March 1998)

From: Einar Stefferud <Stef@nma.com>
Date: Thu, 15 Jan 1998 16:51:09 -0800
To: Jim Gettys <jg@pa.dec.com>
Cc: Jacob Palme <jpalme@dsv.su.se>, Nick Shelness <shelness@lotus.com>, IETF working group on HTML in e-mail <mhtml@segate.sunet.se>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <24705.884911869@nma.com>
Hi Jim -- 

I hope the garbling of all that included text was an accident;-)...  I
was greatly relieved to discover that there were no comments inserted
there-in;-)...  (I trust such garbling is not considered to be a
useful feature of WEB Mail UAs;-)...

Next, I think we may be out of synch in the discussion.

MHTML folk almost immediately gave up on the ideas of allowing
multiple Content-location headers, or of giving them multiple
values...  

This is no longer any kind of an issue between HTTP and MHTML!!!!
We are now looking for two other new things:

1.  Are there any other gotchas lurking in the HTTP/MHTML wood pile
    that we have not noticed before, since all us woodpile residents
    would like to avoid all possible hidden gotchas???

2.  Can we use the new idea articulated by Nick Shelness to use a new
    Content-Label header, or allow a new Content-Alternate-Location?

I think MHTML is leaning toward Content-Alternate-Location, but let's
consider both in looking for gotchas.

Cheers...\Stef

From your message Thu, 15 Jan 1998 12:57:56 -0800:
}
}>  From: Jacob Palme <jpalme@dsv.su.se> >  Date: Thu, 15 Jan 1998 20:55:42
}+0100 >  To: Nick Shelness <shelness@lotus.com>, jg@pa.dec.com (Jim Gettys)
}>  Cc: IETF working group on HTML in e-mail <mhtml@SEGATE.SUNET.SE>, >
}http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com >  Subject: Re: Multiple
}Content-Location headers >  >  At 17.21 +0000 98-01-15,
}Nick_Shelness@motorcity2.lotus.com wrote: >  > Could I suggest that to break
}this impasse, that MHTML switches to a new >  > header field Content-Label
}to replace its use of Content-Location. This >  > would better capture the
}MHTML role of the header field, and would also >  > allow the simplifications
}I argued for last week on the MHTML list to >  > proceed. I.e., Content-Label
}could only specify an absolute URI, and would >  > not establish a base.
}>  >  I am not very happy with changing an existing and already implemented
}>  IETF proposed standard in such a radical way. But maybe it is necessary.
}>  Let us examine the differences between how MHTML and HTTP uses Content-
}>  Location to see if they really need to be split into two different >
}header fields. >  >  HTTP 1.1 spec says                  MHTML spec says
}(I have removed >                                      the controversial
}text allowing >                                      multiple Content-Location
}headers, >                                      since we all agree to remove
}>                                      this.) >  >  In HTTP, multipart
}body-parts MAY   A Content-Location header >  contain header fields which
}are     specifies an URI that labels the >  significant to the meaning of
}that  content of a body part in whose >  part. A Content-Location header
}heading it is placed. Its value >  field SHOULD be included in the     CAN
}be an absolute or a relative >  body-part of each enclosed entity   URI.
}>  that can be identified by a URL. >
}A Content-Location header field is >
}allowed in any message or content >
}heading, in addition to one >                                      Content-ID
}header (as specified in >                                      [MIME1])
}and, in Message headings, >                                      one Message-ID
}(as specified in >                                      [RFC822]) >  >
}The Content-Location entity-header  An URI in a Content-Location >  field
}MAY be used to supply the     header need not refer to an >  resource location
}for the entity    resource which is globally >  enclosed in the message
}when that  available for retrieval using this >  entity is accessible from
}a         URI (after resolution of relative >  location separate from
}the          URIs). However, URI-s in >  requested resource's URI.
}Content-Location headers (if >                                      absolute,
}or resolvable to >                                      absolute URIs) SHOULD
}still be >                                      globally unique. >  >  A
}cache cannot assume that an       When processing (rendering) a >  entity
}with a Content-Location      text/html body part in an MHTML >  different
}from the URI used to      multipart/related structure, all >  retrieve it
}can be used to respond  URIs in that text/html body part >  to later requests
}on that Content-  which reference subsidiary >  Location URI. However, the
}Content- resources within the same >  Location can be used to
}multipart/related structure SHALL >  differentiate between multiple
}be satisfied by those resources >  entities retrieved from a single    and
}not by resources from any >  requested resource, as described    another
}local or remote source. >  in section Caching Negotiated >
}Responses.                          Therefore, If a sender wishes a
}>                                      recipient to always retrieve an >
}...                                 URI referenced resource from its
}>                                      source, an URI labeled copy of >
}If a single server supports         that resource MUST NOT be included >
}multiple organizations that do not  in the same multipart/related >  trust
}one another, then it must     structure. >  check the values of Location
}and >  Content-Location headers in         In addition, since the source
}of a >  responses that are generated under  resource received in >  control
}of said organizations to    multipart/related structure can be >  make sure
}that they do not attempt  misrepresented (see 12.1 above), >  to invalidate
}resources over which  if a resource received in >  they have no
}authority.             multipart/related structure is
}>                                      stored in a cache, it MUST NOT be
}>                                      retrieved from that cache other
}>                                      than by a reference contained in
}a >                                      body part of the same
}>                                      multipart/related structure.
}>                                      Failure to honor this directive
}>                                      will allow a multipart/related
}>                                      structure to be employed as a
}>                                      Trojan Horse. For example, to
}>                                      inject bogus resources (i.e. a
}>                                      misrepresentation of a
}>                                      competitor's Web site) into a
}>                                      recipient's generally accessible
}>                                      Web cache.
}>
}>  My feeling is that the use of Content-Location as defined in the HTTP
}>  and MHTML spec is not so different as to require us to use different
}>  headers. But could the HTTP people please examine the quotes above
}>  and check what you feel about this.
}>
}
}The problem we have is syntax and implementation, not semantics.
}Let's clear this hurdle before we get into the meat of what you are trying
}to achieve, and whether your suggestion fits into the architecture of the
}Web, and my apologies for jumping into the meat in some of my early messages
}on this topic.
}
}Roy Fielding's point is that the syntax change required to allow the header
}name Content-Location to have multiple fields (needed as that is what proxies
}typically do if they find multiple headers of the same name), is a problem,
}and one that may (likely) break existing implementations.  It is also
}possible/likely this would break existing applications of HTTP, particularly
}clients and proxies.  To include the URI in a comma separated list would
}require quoting of the URIs, as Roy points out; parsers may not be coded
}correctly to deal with this.  It is quite likely that existing implementations
}will get the wrong answer, or even die, if one attempts to have multiple
}Content-Location headers, or will not understand the quoting that
}this would require.  And then there are the proxy issues....
}
}To quote from section 4.2 of the HTTP spec:
}
}"Multiple message-header fields with the same field-name may be present in
}a message if and only if the entire field-value for that header field is
}defined as a comma-separated list [i.e., #(values)]. It MUST be possible
}to combine the multiple header fields into one "field-name: field-value"
}pair, without changing the semantics of the message, by appending each
}subsequent field-value to the first, each separated by a comma. The order
}in which header fields with the same field-name are received is therefore
}significant to the interpretation of the combined field value, and thus
}a proxy MUST NOT change the order of these field values when a message is
}forwarded."
}
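The combining rule quoted above, and the quoting problem it creates for
URIs, can be illustrated with a minimal Python sketch.  The helper names
combine_fields and split_list and the example.com URIs are hypothetical,
chosen only for illustration; the naive comma split is an assumption about
how a simple parser might behave, not a description of any particular
implementation.

```python
def combine_fields(values):
    """Fold repeated header field values into one comma-separated value,
    in the order received, as the quoted section 4.2 text describes."""
    return ", ".join(values)


def split_list(field_value):
    """Hypothetical naive list parser: split on commas, trim whitespace."""
    return [item.strip() for item in field_value.split(",") if item.strip()]


# Two Content-Location values; a comma is a legal character inside a URI.
locations = [
    "http://example.com/a,b.html",
    "http://example.com/c.html",
]

combined = combine_fields(locations)   # what a proxy might forward
recovered = split_list(combined)       # what a naive recipient parses

print(combined)
print(recovered)
```

Two values go in and three come out, because the comma inside the first
URI cannot be told apart from the list separator unless the URIs are
quoted.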
}These are the cruxes of the problem.  So we're trying to follow the doctor's
}maxim "first, do no harm". We aren't worrying (yet) about the semantic issues
}that may or may not exist between how Content-Location is defined in the
}two different specs, but pointing out that allowing multiple
}Content-Location headers is an incompatible change which may break
}implementations, and we have no data which shows this change is harmless.
}
}So until it is shown to be harmless, we must presume harm.  IETF process
}attempts to avoid regression; we're worried that existing, deployed software
}would stop working, possibly in significant ways.
}
}So, please, as in my previous message, either present data that it
}doesn't break implementations, or don't argue about the name.  Otherwise
}we're going to continue to bog down.  I think that will let us all
}make faster progress.
}
}I hope this clarifies where the difficulty lies.
}
}                        - Jim Gettys
}
}
}--
}Jim Gettys
}Industry Standards and Consortia
}Digital Equipment Corporation
}Visiting Scientist, World Wide Web Consortium, M.I.T.
}http://www.w3.org/People/Gettys/
}jg@w3.org, jg@pa.dec.com
Received on Thursday, 15 January 1998 17:47:07 UTC