Full-URI, again

I am now updating my list of features for known, supported HTTP servers,
and many are adding or will soon add the ability to select actions based on
various request headers. An obvious kind of action many folks would like to
see is changing the virtual root based on part or all of the full URI, if
such information were available.

The last iteration of this discussion had Roy saying he didn't feel it was
needed unless clients built it in first, and the host name was only needed
for "vanity domains". A few people responded with arguments that knowing
the full URI had other benefits, many of which I believe to be significant.
Roy didn't respond to those (unless I missed his response in the archive),
and no one objected to Chuck's statement that it was a simple addition and
let's get on with it. Even so, Full-URI didn't appear in the -00 rev of the
spec.

I'd like to make those arguments again and lobby for the simple inclusion
of this optional header. The few words it will take in the spec will help
bring about something that almost all of us would probably want: a
reduction in broken URL links.

Here are two scenarios that weren't addressed earlier. Both involve the
possibility of broken links, something that I think we in the HTTP WG
should do our part in preventing.

Scenario 1) The departments at www.bigstate.edu differentiate their Web
hierarchies with the top-level directory names like /cs and /math.
Professors have their own directories under the respective top-levels. Many
people have URLs that point into professors' hierarchies, not just at the
top level for the department. The computer science department decides to
move its hierarchy to its own machine with a different IP address and
its own domain name, cs.bigstate.edu. Unless they want all the links to
break, they must put forwarding pointers at every place in the old
hierarchy where someone might have made a bookmark. With a Full-URI header
and a smart server, the server could catch all items pointing to the
top-level /cs directory and do something smart (redirect to cs.bigstate.edu
and strip off the /cs) or an adequate thing (give a recorded message
pointing to the root of cs.bigstate.edu).
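
To make the "smart" case in Scenario 1 concrete, here is a minimal sketch in
Python of how a server might compute the redirect target from the value of a
Full-URI header. The MOVED table and the redirect_target function are
hypothetical names made up for illustration; no existing server is being
described.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical table: (old host, old top-level directory) -> new host.
MOVED = {("www.bigstate.edu", "/cs"): "cs.bigstate.edu"}


def redirect_target(full_uri):
    """Return the relocated URI for a moved hierarchy, or None if not moved."""
    parts = urlsplit(full_uri)
    host = (parts.hostname or "").lower()
    for (old_host, prefix), new_host in MOVED.items():
        if host != old_host:
            continue
        # Match the directory itself or anything beneath it, but not a
        # sibling such as /csx that merely shares the leading characters.
        if parts.path == prefix or parts.path.startswith(prefix + "/"):
            new_path = parts.path[len(prefix):] or "/"
            return urlunsplit(
                (parts.scheme, new_host, new_path, parts.query, parts.fragment)
            )
    return None
```

A server that gets None back would simply handle the request as usual; the
"adequate" fallback of a recorded message would only need the new host name
from the same table.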

Scenario 2) A company starts a Web site on its own host system,
www.tinyco.com. It becomes financially infeasible for the company to keep
its Internet connection, so it wants to move its Web pages to a Web service
provider, www.websrus.com, who will put them under its current hierarchy.
The company does not want any of the published links to various parts of
www.tinyco.com to break. However, the Web provider uses the current
practice of having a
top-level directory with an identifying name for each customer, such as
/tinyco. If www.tinyco.com simply redirects its domain name to the new
provider's IP address, all the links will break because they don't have the
required /tinyco lead-in directory name. On the other hand, if the
www.websrus.com server could see the incoming "www.tinyco.com", it could
slap on the proper root directory name before processing the request and
the requests would be fulfilled.
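
The root-directory rewrite in Scenario 2 could be sketched along these lines
in Python. The CUSTOMER_ROOTS table and the local_path function are
assumptions made up for this example, and the sketch assumes the client sent
the Full-URI header alongside an ordinary request path.

```python
from urllib.parse import urlsplit

# Hypothetical table kept by the provider: customer host -> customer root.
CUSTOMER_ROOTS = {"www.tinyco.com": "/tinyco"}


def local_path(request_path, full_uri=None):
    """Map a request path onto the provider's hierarchy.

    request_path is the path from the request line; full_uri is the value
    of the Full-URI header, or None if the client did not send one.
    """
    if full_uri is None:
        return request_path
    host = (urlsplit(full_uri).hostname or "").lower()
    root = CUSTOMER_ROOTS.get(host)
    if root is None:
        # Not a hosted customer name, so leave the path alone.
        return request_path
    return root + request_path
```

Requests that arrive without the header, or under the provider's own host
name, pass through unchanged, so existing links into www.websrus.com keep
working.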

Note that some of the HTTP servers that are supporting different actions
based on headers are also supporting server-side includes. Thus, the Web
administrator could not only redirect a whole hierarchy to its proper new
home, he/she could add a header that says something like:
The location of the page<br>
http://www.bigstate.edu/cs/stein/cs-108-syllabus<br>
has moved to a new home:<br>
http://cs.bigstate.edu/stein/cs-108-syllabus.<br>
<b>Please make a note of the new location.</b> Here's that page:<hr>

Thus, I propose the following be added to the spec:

====================
Section 5.4: No change needs to be made to the introductory paragraph since
it already says that the headers "allow the client to pass additional
information about the request" and are "optional".

Add "| Full-URI" on the line after "| From".

New Section 5.4.6:

The Full-URI header field can be used to indicate the URI requested by the
user. The inclusion of this field can aid HTTP server software receiving
the request to respond in a manner that gives the user what they probably
intended to get, or at least give the server information about the URI for
logging and later analysis.

Full-URI = "Full-URI" ":" URI-string

The URI-string is the URI specified by the user after any encoding was
applied by the client software. Although it is not required, user agents
should always include this field with requests in order to aid server
software that has been enabled to read the field.

An example is:

Full-URI: http://www.bigstate.edu/cs/stein/cs-108-syllabus
====================
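
For anyone wondering how little work the field is for a server, here is a
minimal sketch in Python of pulling the URI-string out of one header line. It
assumes the field name is Full-URI, as in the section text, and the function
name parse_full_uri is made up for this example.

```python
def parse_full_uri(header_line):
    """Return the URI-string from a Full-URI header line, or None.

    Returns None if the line is some other header or carries no value.
    """
    name, sep, value = header_line.partition(":")
    # HTTP header field names are matched without regard to case.
    if not sep or name.strip().lower() != "full-uri":
        return None
    value = value.strip()
    return value or None
```

Since the field is optional, a None result just means the server carries on
as it does today.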

I'm open to additions to this proposal. I couldn't think of any security
implications, but I'm open to hearing about those as well.

--Paul Hoffman
--Proper Publishing

Received on Sunday, 19 March 1995 16:24:00 UTC