- From: Koen Holtman <koen@win.tue.nl>
- Date: Sun, 28 Apr 1996 15:01:53 +0200 (MET DST)
- To: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
- Cc: Koen Holtman <koen@win.tue.nl>
[Note for Jim Gettys: please perform the edits indicated below as soon
as you can.]
Roy T. Fielding:
>[Dave Kristol:]
>> Okay, I'll accept your arguments. However, there is still a piece I feel
>> is missing:
>>
>> 1) We want clients to send, and servers to accept, an absoluteURI.
>
>No, we only want servers to accept an absoluteURI. Clients can't
>send it until 2.0.
>
>> 2) Eventually we want to obviate the need for Host:.
>
>Not in 1.x. The stuff added to the third block of text in section 5.1.2
>is wrong -- it needs to be removed.
I include a corrected 5.1.2 below.
>> 3) The purpose of absoluteURI and/or Host is to specify the virtual host
>> for (at least) an http_URL.
>
>That is only one of many things that absoluteURI is used for in the
>specification. In fact, it is probably the least important of them.
>
>> The reason I've been trying to restrict the absoluteURI syntax is that
>> then it becomes easy to identify the virtual host. I believe somewhere
>> in the spec. there should be an explanation of how to identify the
>> virtual host of "the resource identified by the request".
>
>I believe Koen asked for such a section of text to be added to the
>section on Requests. However, that must not change the protocol syntax.
>
>(and it ain't gunna write itself)
Also included below.
>......Roy
------------
Corrected 5.1.2, with computer-generated change bars:
5.1.2 Request-URI
The Request-URI is a Uniform Resource Identifier (Section 3.2) and
identifies the resource upon which to apply the request.
Request-URI = "*" | absoluteURI | abs_path
To allow for transition to absoluteURIs in all requests in future
| versions of HTTP, all HTTP/1.1 servers MUST accept the absoluteURI
| form in requests, even though HTTP/1.1 clients will only generate
| them in requests to proxies. Versions of HTTP after HTTP/1.1 may
require absoluteURIs everywhere, after HTTP/1.1 or later have become
the dominant implementations. The three options for Request-URI are
dependent on the nature of the request. The asterisk _*_ means that
the request does not apply to a particular resource, but to the
server itself, and is only allowed when the Method used does not
necessarily apply to a resource. One example would be
OPTIONS * HTTP/1.1
| [##first sentence deleted##] If the absoluteURI form is used, any
Host request-header included with the request MUST be ignored. The
absoluteURI form is required when the request is being made to a
proxy. The proxy is requested to forward the request and return the
| response, or to serve an response from cache according to the rules
| for caching. Note that the proxy MAY forward the request on to
another proxy or directly to the server specified by the
absoluteURI. In order to avoid request loops, a proxy MUST be able
to recognize all of its server names, including any aliases, local
variations, and the numeric IP address. An example Request-Line
would be:
GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1
The most common form of Request-URI is that used to identify a
| resource on an origin server or gateway. In this case, the absolute
| path of the URI MUST be transmitted (see Section 3.2.1,
abs_path). For example, a client wishing to retrieve the resource
above directly from the origin server would create a TCP connection
to port 80 of the host _www.w3.org_ and send the lines:
GET /pub/WWW/TheProject.html HTTP/1.1
Host:www.w3.org
followed by the remainder of the Full-Request. Note that the absolute
path cannot be empty; if none is present in the original URI, it MUST be
given as _/_ (the server root).
If a proxy receives a request without any path in the Request-URI and
the method used is capable of supporting the asterisk form of request,
then the last proxy on the request chain MUST forward the request with
_*_ as the final Request-URI. For example, the request
OPTIONS http://www.ics.uci.edu:8001 HTTP/1.1
would be forwarded by the proxy as
OPTIONS * HTTP/1.1
after connecting to port 8001 of host "www.ics.uci.edu".
The Request-URI is transmitted as an encoded string, where some
characters may be escaped using the _% HEX HEX_ encoding defined by
RFC 1738 [4]. The origin server MUST decode the Request-URI in order
to properly interpret the request. In requests that they forward,
proxies MUST NOT rewrite the _abs_path_ part of a Request-URI in any
way except as noted above to replace a null abs_path with
_*_. Illegal Request-URIs SHOULD be responded to with an appropriate
status code. (Proxies MAY transform the Request-URI for internal
processing purposes, but SHOULD NOT send such a transformed
| Request-URI in forwarded requests. [##sentence deleted at this
| point##] The main reason for this rule is to make sure that the form
of Request-URIs is well specified, to enable future extensions
without fear that they will break in the face of some
rewritings. Another is that one consequence of rewriting the
Request-URI is that integrity or authentication checks by the server
may fail; since rewriting MUST be avoided in this case, it may as
well be proscribed in general.
Note: servers writers SHOULD be aware that some existing proxies
do some rewriting.
-----------
Text for the new section 5.3:
5.3 The resource identified by a request
HTTP/1.1 origin servers MUST compute the URI of the resource
identified by a request in the following way.
1. If Request-URI is an absoluteURI, the URI of the identified
resource is the decoded absoluteURI. Any Host header in the
request is ignored.
2. If the request-URI is an abs_path, and the request includes a
Host header field, the URI of the identified resource is
"http://" host-value decoded-abs_path
where host-value is the value in the host header field, and
decoded-abs_path the decoded abs_path value.
3. If the request-URI is an abs_path, and the request does not
include a Host header field (this may be the case in requests
from HTTP/1.0 clients), the server MUST either respond with an
error message, or take the URI of the identified resource to be
"http://" heuristic-host-value decoded-abs_path
where some heuristic is used to compute the
heuristic-host-value. If the network address of a HTTP/1.1
server is bound to only one Internet host domain name, the
server can always use that Internet host domain name for the
heuristic-host-value.
--------------------------
Koen.
Received on Sunday, 28 April 1996 06:08:50 UTC