EDITS for Section 5 (Request) (Was: Re: 5.1.2 Request-URI)

[Note for Jim Gettys: please perform the edits indicated below as soon
as you can.]

Roy T. Fielding:
>[Dave Kristol:]
>> Okay, I'll accept your arguments.  However, there is still a piece I feel
>> is missing:
>> 
>> 1) We want clients to send, and servers to accept, an absoluteURI.
>
>No, we only want servers to accept an absoluteURI.  Clients can't
>send it until 2.0.
>
>> 2) Eventually we want to obviate the need for Host:.
>
>Not in 1.x.  The stuff added to the third block of text in section 5.1.2
>is wrong -- it needs to be removed.

I include a corrected 5.1.2 below.

>> 3) The purpose of absoluteURI and/or Host is to specify the virtual host
>> for (at least) an http_URL.
>
>That is only one of many things that absoluteURI is used for in the
>specification.  In fact, it is probably the least important of them.
>
>> The reason I've been trying to restrict the absoluteURI syntax is that
>> then it becomes easy to identify the virtual host.  I believe somewhere
>> in the spec. there should be an explanation of how to identify the
>> virtual host of "the resource identified by the request".
>
>I believe Koen asked for such a section of text to be added to the
>section on Requests.  However, that must not change the protocol syntax.
>
>(and it ain't gunna write itself)

Also included below.

>......Roy

------------

Corrected 5.1.2, with computer-generated change bars:


    5.1.2 Request-URI

    The Request-URI is a Uniform Resource Identifier (Section 3.2) and
    identifies the resource upon which to apply the request.

           Request-URI    = "*" | absoluteURI | abs_path

    To allow for transition to absoluteURIs in all requests in future
|   versions of HTTP, all HTTP/1.1 servers MUST accept the absoluteURI
|   form in requests, even though HTTP/1.1 clients will only generate
|   them in requests to proxies.  Versions of HTTP after HTTP/1.1 may
    require absoluteURIs everywhere, after HTTP/1.1 or later have become
    the dominant implementations. The three options for Request-URI are
    dependent on the nature of the request. The asterisk _*_ means that
    the request does not apply to a particular resource, but to the
    server itself, and is only allowed when the Method used does not
    necessarily apply to a resource. One example would be

           OPTIONS * HTTP/1.1

|   [##first sentence deleted##] If the absoluteURI form is used, any
    Host request-header included with the request MUST be ignored.  The
    absoluteURI form is required when the request is being made to a
    proxy. The proxy is requested to forward the request and return the
|   response, or to serve an response from cache according to the rules
|   for caching. Note that the proxy MAY forward the request on to
    another proxy or directly to the server specified by the
    absoluteURI. In order to avoid request loops, a proxy MUST be able
    to recognize all of its server names, including any aliases, local
    variations, and the numeric IP address. An example Request-Line
    would be:

           GET http://www.w3.org/pub/WWW/TheProject.html HTTP/1.1

    The most common form of Request-URI is that used to identify a
|   resource on an origin server or gateway. In this case, the absolute
|   path of the URI MUST be transmitted (see Section 3.2.1,
    abs_path). For example, a client wishing to retrieve the resource
    above directly from the origin server would create a TCP connection
    to port 80 of the host _www.w3.org_ and send the lines:

           GET /pub/WWW/TheProject.html HTTP/1.1
           Host:www.w3.org

    followed by the remainder of the Full-Request. Note that the absolute
    path cannot be empty; if none is present in the original URI, it MUST be
    given as _/_ (the server root).

    If a proxy receives a request without any path in the Request-URI and
    the method used is capable of supporting the asterisk form of request,
    then the last proxy on the request chain MUST forward the request with
    _*_ as the final Request-URI. For example, the request

           OPTIONS http://www.ics.uci.edu:8001 HTTP/1.1

    would be forwarded by the proxy as

           OPTIONS * HTTP/1.1

    after connecting to port 8001 of host "www.ics.uci.edu".

    The Request-URI is transmitted as an encoded string, where some
    characters may be escaped using the _% HEX HEX_ encoding defined by
    RFC 1738 [4]. The origin server MUST decode the Request-URI in order
    to properly interpret the request.  In requests that they forward,
    proxies MUST NOT rewrite the _abs_path_ part of a Request-URI in any
    way except as noted above to replace a null abs_path with
    _*_. Illegal Request-URIs SHOULD be responded to with an appropriate
    status code.  (Proxies MAY transform the Request-URI for internal
    processing purposes, but SHOULD NOT send such a transformed
|   Request-URI in forwarded requests.  [##sentence deleted at this
|   point##] The main reason for this rule is to make sure that the form
    of Request-URIs is well specified, to enable future extensions
    without fear that they will break in the face of some
    rewritings. Another is that one consequence of rewriting the
    Request-URI is that integrity or authentication checks by the server
    may fail; since rewriting MUST be avoided in this case, it may as
    well be proscribed in general.

      Note: servers writers SHOULD be aware that some existing proxies
      do some rewriting.


-----------

Text for the new section 5.3:

  5.3 The resource identified by a request

  HTTP/1.1 origin servers MUST compute the URI of the resource
  identified by a request in the following way.

   1. If Request-URI is an absoluteURI, the URI of the identified
      resource is the decoded absoluteURI.  Any Host header in the
      request is ignored.

   2. If the request-URI is an abs_path, and the request includes a
      Host header field, the URI of the identified resource is

        "http://" host-value decoded-abs_path

      where host-value is the value in the host header field, and
      decoded-abs_path the decoded abs_path value.

   3. If the request-URI is an abs_path, and the request does not
      include a Host header field (this may be the case in requests
      from HTTP/1.0 clients), the server MUST either respond with an
      error message, or take the URI of the identified resource to be

        "http://" heuristic-host-value decoded-abs_path

      where some heuristic is used to compute the
      heuristic-host-value.  If the network address of a HTTP/1.1
      server is bound to only one Internet host domain name, the
      server can always use that Internet host domain name for the
      heuristic-host-value.


--------------------------

Koen.

Received on Sunday, 28 April 1996 06:08:50 UTC