Re: Recursive look up of base in outer headers

Andy Jacobs (andyj@Exchange.MICROSOFT.com)
Thu, 4 Sep 1997 10:05:40 -0700


Message-ID: <2FBF98FC7852CF11912A000000000001050F6D89@DINO>
From: "Andy Jacobs (Exchange)" <andyj@Exchange.MICROSOFT.com>
To: "'Roy T. Fielding'" <fielding@kiwi.ics.uci.edu>, mhtml@segate.sunet.se,
Subject: RE: Recursive look up of base in outer headers
Date: Thu, 4 Sep 1997 10:05:40 -0700 

> Content-Base takes precedence over Content-Location within the same
> header field set because that is the only reason why Content-Base
> exists --- to provide a way to say that the embedded links are
> relative to something other than the document location.

Content-Base is also used to modify the meaning of Content-Location
(when Location is relative).  I infer this from two places in the spec:
it indicates that Base is not needed if Location is an absolute URL,
and later, when resolving links from an HTML body part to other body
parts, it says that if the referenced body part has a relative Location
and a Base, the two are combined.
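
To make that reading concrete, here is a minimal sketch in Python.  It
is not taken from the spec: the function name resolve_location, the
plain-dict representation of a header set, and the example.com URLs are
all assumptions made for illustration.

```python
from urllib.parse import urljoin

def resolve_location(headers):
    """Return the absolute location of one body part (hypothetical helper).

    `headers` is assumed to be a plain dict for a single header field set.
    """
    location = headers.get("Content-Location")
    base = headers.get("Content-Base")
    if location is None:
        return None
    if base is None:
        # No Base at this level, so the Location is used as given.
        return location
    # urljoin leaves an absolute Location untouched, which matches the
    # observation that Base is not needed when Location is absolute.
    return urljoin(base, location)

part = {
    "Content-Base": "http://example.com/docs/",   # made-up URL
    "Content-Location": "images/logo.gif",
}
print(resolve_location(part))
```

With an absolute Content-Location, the same call returns that URL
unchanged and the Base has no effect on it.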

When I first read the spec, I had thought that modifying the Location
was the only purpose of a Base (for the above reasons, and because I
read the spec from top to bottom and this purpose was mentioned first).
----
 - Andy Jacobs
   andyj@microsoft.com


-----Original Message-----
From:	Roy T. Fielding [SMTP:fielding@kiwi.ics.uci.edu]
Sent:	Thursday, September 04, 1997 12:15 AM
To:	mhtml@segate.sunet.se; uri@bunyip.com
Subject:	Re: Recursive look up of base in outer headers

I must say that this whole discussion is a bit weird to me.
Content-Base takes precedence over Content-Location within the same
header field set because that is the only reason why Content-Base
exists --- to provide a way to say that the embedded links are
relative to something other than the document location.

Unfortunately, I can't see the confusion because I wrote the words.
Given the choice of removing the recursive definition or removing
Content-Base, I would remove Content-Base without a second thought.

The recursive definition enables efficient handling of encapsulated
content without any "searching" whatsoever; it is simply a matter of
peeling through layers of context, and that occurs during the handling
of any message.  If that is not how your software works now, then I
guarantee you that making it work that way will improve the
extensibility and robustness of your software.

The original Base header field was invented before Content-Location,
which is probably why the new wording is confusing.  Since it is
reasonable to expect the base URL to be different from the location
only within the innermost layer (the embedded content), it would be
reasonable to eliminate the Content-Base header field from MHTML and
HTTP and simply stick with the less confusing Content-Location.

>I think this demonstrates, in part, why there was so much worry in the
>WG about allowing recursion of these things: Base can be specified, but
>if it's not specified, it's taken from the location, and if that's not
>specified you take it from the base of the parent. Which, BTW, brings
>up an interesting question: Let's say I have the following:
>
>Content-Type: multipart/related
>Content-Base: foo://bar/biff/
>
>    Content-Type: multipart/mixed
>    Content-Location: blah://blee/blue.bar
>
>        Content-Type: text/html
>
>What is the base for the text/html, which has neither Content-Location
>nor Content-Base? Is it <blah://blee/> (the base we use for its parent,
>since it has no Content-Base) or is it <foo://bar/biff/> (the specific
>base of its parent's parent)?

It is <blah://blee/blue.bar>.  I'm sorry I can't think of a better way
of explaining it, but it really is a simple definition.  In order for
any software to read a message, it must start from the outermost layer
and work its way in, just like any encapsulated data type.  At each layer
you have a current base URL, and at each layer that base URL may be
set to something different.  That resetting could be done by a
Content-Base or a Content-Location, but only the first if both are
present at that level.
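
The layer-by-layer rule described above can be sketched roughly as
follows.  This is an illustrative Python sketch, not code from any
implementation discussed on the list: the function name effective_base
and the list-of-dicts representation of the nested header sets are
assumptions, and it assumes every header value is already an absolute
URL (relative values would first have to be resolved against the base
in effect at that point).

```python
def effective_base(layers, initial_base=None):
    """Return the base URL in effect for the innermost body part.

    `layers` is assumed to be a list of header dicts, ordered from the
    outermost message layer to the innermost one.
    """
    base = initial_base
    for headers in layers:
        if "Content-Base" in headers:
            # Content-Base wins when both fields appear at one level.
            base = headers["Content-Base"]
        elif "Content-Location" in headers:
            base = headers["Content-Location"]
        # A layer with neither field inherits the current base.
    return base

# The nesting from the quoted question, outermost layer first.
layers = [
    {"Content-Type": "multipart/related", "Content-Base": "foo://bar/biff/"},
    {"Content-Type": "multipart/mixed",
     "Content-Location": "blah://blee/blue.bar"},
    {"Content-Type": "text/html"},
]
print(effective_base(layers))  # blah://blee/blue.bar
```

No searching back up the tree is needed: the handler for each layer
simply passes its current base down to the next one.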

Please note that there is no way to implement a MIME content-type
handler without parsing message and multipart types from the outside-in.
Likewise, a valid handler for text/html must be passed a single URL to
be used as the base for relative URL parsing.  My specification simply
matches the most reliable implementation of those handlers within user
agents, and does so in a way that is independent of the innermost media
type.

....Roy