Re: Multiple Content-Location headers from Jim Gettys on 1998-01-15 (ietf-http-wg@w3.org from January to March 1998)

From: Jim Gettys <jg@pa.dec.com>
Date: Thu, 15 Jan 1998 08:34:56 -0800
To: Jacob Palme <jpalme@dsv.su.se>
Cc: Stef@nma.com, Scott Lawrence <lawrence@agranat.com>, IETF working group on HTML in e-mail <mhtml@segate.sunet.se>, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9801151634.AA02626@pachyderm.pa.dec.com>
>  From: Jacob Palme <jpalme@dsv.su.se>
>  Date: Thu, 15 Jan 1998 03:34:33 +0100
>  To: jg@pa.dec.com (Jim Gettys), Stef@nma.com
>  Cc: Scott Lawrence <lawrence@agranat.com>,
>          IETF working group on HTML in e-mail <mhtml@segate.sunet.se>,
>          http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
>  Subject: Re: Multiple Content-Location headers
>  
>  At 08.09 -0800 98-01-14, Jim Gettys wrote:
>  > You can try to make a case that it might happen if the facilities were
>  >there,
>  > of course).  Equally likely in my opinion though is the inverse, that mail
>  > as we know it becomes pretty integrated to the Web, rather than the
>  >inverse.
>  > This message is being composed on a prototype mail system which has many
>  > of these properties already, for example.  All mail messages I get end up
>  > with a URL, the mail user agent only uses HTTP (it is written in Java),
>  > and I can say from first hand experience that this has much to commend it.
>  
>  Why is that an inverse to what the MHTML group is proposing? If mail gets
>  more integrated with the web, that is only more reason that aggregate
>  MIME objects have the same format whatever transport method is used
>  to deliver them! We are also working on a web-based e-mail system, and
>  the very popular HOTMAIL service has the same basis.

We would question that they should be aggregated into a single object.
Collections, along with what the URN folks are trying to do, might
be a more interesting way to achieve the end you are looking for.

>  
>  > But back to the present: Mail archives in the Web are typically handled
>  > by a program that takes mail messages as input and generates HTML as a set
>  > of Web documents. An equally plausible extension to handle mhtml is to
>  > retrieve  the attached documents at the time the HTML is generated from
>  > the mail message, rather than presuming the data is inline. This requires
>  > no protocol support beyond what exists today (though arguably is not as
>  > atomic in nature).
>  
>  The sender of a message can choose to indicate that s/he is sending the
>  full content as it looks like at send time (by including them in the
>  aggregate MIME object sent) or to indicate that the content of the body
>  parts are to retrieved from the web at read time (by only including
>  references to them in the MIME object sent).
>  
>  > Fundamentally, HTTP talks about a single document at a time; this
>  > is inherent throughout the protocol; in the caching sections, and
>  > all over.  All methods take a single URI as an argument; not a list
>  > of URI's.  This presumption is inherent throughout the design.
>  
>  Yes, but a single "document" in HTML very often consists of multiple
>  parts. There was no equivalence between "document" and "file" until
>  we got the MHTML standard.
>  >
>  > There is alot of work on what are called "collections" going on in Webdav.
>  > While I have my reservations on details of what they are proposing, the
>  > concept is cleaner, in my humble opinion: it should be possible to
>  > define a document which is a collection of related documents.
>  > This would fill the scenario you outline, as I understand it,
>  > along with many others.
>  
>  To me it seems much cleaner to archive each document in a single file.
>  Retrieving a document from a backup storage will be much easier
>  if you need only retrieve one single file. The risk that parts
>  get mislaid is also smaller.

Documents != files in the web.  A document may have many representations,
either different Content-type, Content-language, etc.  A single name
may have N variants.

>  >
>  > Note since such collections can have URI's of their own, it fits
>  > well into the model of the Web.
>  
>  Of course a composite MIME object can also have a URI of its own.
>  It is *not* the same as the URI of its start object, since the URI
>  of its start object will display today's weather map, while the
>  composite MIME object will display the weather map of the day
>  when it was generated.
>  
>  > But to try to introduce the idea that an object is compound by its very
>  > nature to the web at this date, and trying to mix the MHTML metadata with
>  > HTTP metadata, even if possible, does not seem feasible or desirable to
>  > me. My complexity alarm is going off...
>  
>  They are already mixed. We are for example using many headers with
>  identical names, like "Content-Base" and "Content-Location", and in that
>  case, of course, we should try to define their syntax and semantics
>  in the same way. If HTTP strongly needs a header field which is not
>  suitable for SMTP or the reverse, these header fields should not
>  be given the same name in HTTP as in objects sent by SMTP.
>  

We are all in absolute agreement here about attempting to not (further) 
defining headers with the same name that are not identical.  This is
in fact at the root of the issue.

Part of the reaction you are seeing is that the HTTP group is trying to 
get to draft standard and is focussed on this intently (the sooner the better), 
but there is another issue you need to understand:

Roy Fielding points out 
(http://www.ics.uci.edu/pub/ietf/http/hypermail/1998q1/0152.html) that your 
proposal would require syntactic change to Content-Location, as specified 
in HTTP/1.1 (RFC 2068).  I (and I believe others) suspect that the syntactic 
changes required would break running, deployed implementations.

HTTP/1.1 is already in very significant deployment. It is essentially 
impossible at this date to introduce incompatible change to the HTTP protocol, 
both on (proper) process grounds, but more importantly on pragmatic grounds 
of not breaking deployed code (which is what the process is attempts to 
ensure). 

This is reality; it has nothing to do with "cleanlyness" in the abstract.

So right now, most people (including myself) in the HTTP group aren't even 
examining the proposal on its merits: with the name Content-Location, it 
is almost impossible on both pragmatic implementation grounds and on IETF 
process grounds (which are driven by these realities of deployment of running 
code) to deal with the proposal.

Now, to go further in this discussion, we need to be convinced that the 
proposal won't break deployed code. The burden of proof is on you at this 
point, as the one proposing the change.  So as I see it, either show that 
1) the change won't break running deployed HTTP code, or 2) change the name 
to something else, so it won't break deployed code, and therefore HTTP might 
be able to implement it.  

Once we get beyond this point, forced by the existing deployment of HTTP/1.1, 
the HTTP working group might actually try to understand the problem and 
proposal in more detail....  Then we might be able to reach a consensus 
on the merits of the case.  But right now we're stuck.

			Regards,
				- Jim Gettys


--
Jim Gettys
Industry Standards and Consortia
Digital Equipment Corporation
Visting Scientist, World Wide Web Consortium, M.I.T.
http://www.w3.org/People/Gettys/
jg@w3.org, jg@pa.dec.com
Received on Thursday, 15 January 1998 08:39:52 UTC