- From: Roy T. Fielding <fielding@kiwi.ics.uci.edu>
- Date: Thu, 04 Sep 1997 00:14:51 -0700
- To: mhtml@segate.sunet.se, uri@bunyip.com
I must say that this whole discussion is a bit weird to me. Content-Base takes precedence over Content-Location within the same header field set because that is the only reason why Content-Base exists --- to provide a way to say that the embedded links are relative to something other than the document location. Unfortunately, I can't see the confusion because I wrote the words.

Given the choice of removing the recursive definition or removing Content-Base, I would remove Content-Base without a second thought. The recursive definition enables efficient handling of encapsulated content without any "searching" whatsoever; it is simply a matter of peeling through layers of context, and that occurs during the handling of any message. If that is not how your software works now, then I guarantee you that making it work that way will improve the extensibility and robustness of your software.

The original Base header field was invented before Content-Location, which is probably why the new wording is confusing. Since it is reasonable to expect the base URL to be different from the location only within the innermost layer (the embedded content), it would be reasonable to eliminate the Content-Base header field from MHTML and HTTP and simply stick with the less confusing Content-Location.

>I think this demonstrates, in part, why there was so much worry in the WG
>about allowing recursion of these things: Base can be specified, but if
>it's not specified, it's taken from the location, and if that's not
>specified you take it from the base of the parent. Which, BTW, brings up an
>interesting question: Let's say I have the following:
>
>Content-Type: multipart/related
>Content-Base: foo://bar/biff/
>
>    Content-Type: multipart/mixed
>    Content-Location: blah://blee/blue.bar
>
>        Content-Type: text/html
>
>What is the base for the text/html, which has neither Content-Location nor
>Content-Base? Is it <blah://blee/> (the base we use for its parent since it
>has not Content-Base) or is it <foo://bar/biff/> (the specific base of its
>parent's parent)?

It is <blah://blee/blue.bar>.

I'm sorry I can't think of a better way of explaining it, but it really is a simple definition. In order for any software to read a message, it must start from the outermost layer and work its way in, just like any encapsulated data type. At each layer you have a current base URL, and at each layer that base URL may be set to something different. That resetting could be done by a Content-Base or a Content-Location, but only the first if both are present at that level.

Please note that there is no way to implement a MIME content-type handler without parsing message and multipart types from the outside-in. Likewise, a valid handler for text/html must be passed a single URL to be used as the base for relative URL parsing. My specification simply matches the most reliable implementation of those handlers within user agents, and does so in a way that is independent of the innermost media type.

....Roy
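
The outside-in rule described in this message can be illustrated with a minimal Python sketch. It is not taken from any specification text or existing implementation. The function name `effective_base` and the representation of each layer as a plain dict of header fields are assumptions made for illustration, and the sketch assumes every Content-Base and Content-Location value is already an absolute URL.

```python
from typing import Mapping, Optional, Sequence


def effective_base(
    layers: Sequence[Mapping[str, str]],
    initial: Optional[str] = None,
) -> Optional[str]:
    """Return the base URL in effect at the innermost layer.

    `layers` lists the header fields of each enclosing body part,
    outermost first (hypothetical representation). `initial` is the
    base known before the message is parsed, if any.
    """
    base = initial
    for headers in layers:
        # Header field names are case-insensitive.
        fields = {name.lower(): value for name, value in headers.items()}
        if "content-base" in fields:
            # Content-Base wins when both fields appear at one level.
            base = fields["content-base"]
        elif "content-location" in fields:
            base = fields["content-location"]
        # Otherwise the base from the enclosing layer is inherited.
    return base


# The example from the quoted question, outermost layer first.
layers = [
    {"Content-Type": "multipart/related", "Content-Base": "foo://bar/biff/"},
    {"Content-Type": "multipart/mixed", "Content-Location": "blah://blee/blue.bar"},
    {"Content-Type": "text/html"},
]
print(effective_base(layers))  # blah://blee/blue.bar
```

The single pass over the layers is the point: no searching up or down the tree is needed, because the handler for the innermost part is simply handed whatever base is current when parsing reaches it.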
Received on Thursday, 4 September 1997 03:24:34 UTC