- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Tue, 27 Jul 2010 20:52:39 -0700
- To: HTTP Working Group <ietf-http-wg@w3.org>
As part of the changes for draft 11, I merged the misnamed section on
message length into the message body section and then rewrote the
steps for determining the message body length to remove the
ambiguities noted previously in tickets #28, #90, and #95.
The primary additions are requirements on how to handle messages
with multiple or invalid content-length values, or both
transfer-encoding and content-length. Also, multipart/byteranges
has been removed as a length-determinator.
It is probably easier to read in plain text than as a diff, so here
it is for your review:
==========
3.3. Message Body
The message-body (if any) of an HTTP message is used to carry the
payload body associated with the request or response.
message-body = *OCTET
The message-body differs from the payload body only when a transfer-
coding has been applied, as indicated by the Transfer-Encoding header
field (Section 9.7). When one or more transfer-codings are applied
to a payload in order to form the message-body, the Transfer-Encoding
header field MUST contain the list of transfer-codings applied.
Transfer-Encoding is a property of the message, not of the payload,
and thus MAY be added or removed by any implementation along the
request/response chain under the constraints found in Section 6.2.
The rules for when a message-body is allowed in a message differ for
requests and responses.
The presence of a message-body in a request is signaled by the
inclusion of a Content-Length or Transfer-Encoding header field in
the request's header fields, even if the request method does not
define any use for a message-body. This allows the request message
framing algorithm to be independent of method semantics.
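
As an illustration of the request rule above, here is a minimal sketch, not
part of the draft text. It assumes header fields have already been parsed
into a list of (name, value) pairs; the function name request_has_body is
hypothetical.

```python
def request_has_body(headers):
    """Return True if the request's header fields signal a message-body.

    `headers` is assumed to be a list of (name, value) tuples. The request
    method is deliberately not consulted: framing is independent of
    method semantics.
    """
    names = {name.lower() for name, _ in headers}
    return "content-length" in names or "transfer-encoding" in names
```

Note that a request with "Content-Length: 0" still signals a message-body;
the body is simply of zero length.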
For response messages, whether or not a message-body is included with
a message is dependent on both the request method and the response
status code (Section 5.1.1). Responses to the HEAD request method
never include a message-body because the associated response header
fields (e.g., Transfer-Encoding and Content-Length) only indicate
what their values would have been if the method had been GET. All
1xx (Informational), 204 (No Content), and 304 (Not Modified)
responses MUST NOT include a message-body. All other responses do
include a message-body, although the body MAY be of zero length.
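
The response rule above can be sketched as a small predicate. This is an
illustrative sketch only; the function name and its parameters are
hypothetical, and the caller is assumed to know the method of the request
that the response answers.

```python
def response_may_have_body(request_method, status):
    """Return False for responses that never include a message-body."""
    # Method names are case-sensitive, so compare against "HEAD" exactly.
    if request_method == "HEAD":
        return False
    # All 1xx (Informational), 204 (No Content), and 304 (Not Modified).
    if 100 <= status <= 199 or status in (204, 304):
        return False
    # All other responses include a body, possibly of zero length.
    return True
```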
The length of the message-body is determined by one of the following
(in order of precedence):
1. Any response to a HEAD request and any response with a status
code of 100-199, 204, or 304 is always terminated by the first
empty line after the header fields, regardless of the header
fields present in the message, and thus cannot contain a message-
body.
2. If a Transfer-Encoding header field (Section 9.7) is present and
the "chunked" transfer-coding (Section 6.2) is the final
encoding, the message-body length is determined by reading and
decoding the chunked data until the transfer-coding indicates the
data is complete.
If a Transfer-Encoding header field is present in a response and
the "chunked" transfer-coding is not the final encoding, the
message-body length is determined by reading the connection until
it is closed by the server. If a Transfer-Encoding header field
is present in a request and the "chunked" transfer-coding is not
the final encoding, the message-body length cannot be determined
reliably; the server MUST respond with the 400 (Bad Request)
status code and then close the connection.
If a message is received with both a Transfer-Encoding header
field and a Content-Length header field, the Transfer-Encoding
overrides the Content-Length. Such a message might indicate an
attempt to perform request or response smuggling (bypass of
security-related checks on message routing or content) and thus
should be handled as an error. Prior to forwarding the message
downstream, the provided Content-Length MUST either be removed or be
replaced with the real message-body length after the transfer-coding
is decoded.
3. If a message is received without Transfer-Encoding and with
either multiple Content-Length header fields or a single Content-
Length header field with an invalid value, then the message
framing is invalid and MUST be treated as an error to prevent
request or response smuggling. If this is a request message, the
server MUST respond with a 400 (Bad Request) status code and then
close the connection. If this is a response message received by
a proxy or gateway, the proxy or gateway MUST discard the
received response, send a 502 (Bad Gateway) status code as its
downstream response, and then close the connection. If this is a
response message received by a user-agent, the message-body
length is determined by reading the connection until it is
closed; an error SHOULD be indicated to the user.
4. If a valid Content-Length header field (Section 9.2) is present
without Transfer-Encoding, its decimal value defines the message-
body length in octets. If the actual number of octets sent in
the message is less than the indicated Content-Length, the
recipient MUST consider the message to be incomplete and treat
the connection as no longer usable. If the actual number of
octets sent in the message is more than the indicated Content-
Length, the recipient MUST only process the message-body up to
the field value's number of octets; the remainder of the message
MUST either be discarded or treated as the next message in a
pipeline. For the sake of robustness, a user-agent MAY attempt
to detect and correct such an error in message framing if it is
parsing the response to the last request on a connection and
the connection has been closed by the server.
5. If this is a request message and none of the above are true, then
the message-body length is zero (no message-body is present).
6. Otherwise, this is a response message without a declared message-
body length, so the message-body length is determined by the
number of octets received prior to the server closing the
connection.
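
The six steps above can be sketched as a single decision function. This is
a rough sketch under stated assumptions, not a definitive implementation:
the name body_framing, its parameters, and the (kind, detail) return
convention are all hypothetical. Header fields are assumed to be a list of
(name, value) pairs, and transfer-coding parameters are handled naively
(quoted strings containing commas are not supported).

```python
def body_framing(headers, *, is_request, request_method=None, status=None,
                 is_intermediary=False):
    """Return a hypothetical (kind, detail) pair describing body framing.

    kind is one of "none", "chunked", "length", "until-close", "error".
    For responses, request_method is the method of the matching request.
    """
    te = [v for n, v in headers if n.lower() == "transfer-encoding"]
    cl = [v.strip() for n, v in headers if n.lower() == "content-length"]

    # Step 1: responses that can never contain a message-body.
    if not is_request and (request_method == "HEAD"
                           or 100 <= status <= 199
                           or status in (204, 304)):
        return ("none", 0)

    # Step 2: Transfer-Encoding overrides any Content-Length.
    if te:
        codings = [c.split(";")[0].strip().lower()
                   for v in te for c in v.split(",") if c.strip()]
        if codings and codings[-1] == "chunked":
            return ("chunked", None)
        # "chunked" is not the final coding.
        return ("error", 400) if is_request else ("until-close", None)

    # Step 3: multiple or invalid Content-Length without Transfer-Encoding.
    if len(cl) > 1 or (cl and not (cl[0].isascii() and cl[0].isdigit())):
        if is_request:
            return ("error", 400)
        if is_intermediary:
            return ("error", 502)
        # User agent: read until close and indicate an error to the user.
        return ("until-close", "framing-error")

    # Step 4: a single valid Content-Length.
    if cl:
        return ("length", int(cl[0]))

    # Step 5: a request with no declared length has no body.
    if is_request:
        return ("none", 0)

    # Step 6: a response delimited by the server closing the connection.
    return ("until-close", None)
```

For example, a request carrying both "Transfer-Encoding: gzip, chunked" and
"Content-Length: 10" is framed by the chunked coding in this sketch; a
caller that wants to treat the combination as a possible smuggling attempt
would need to check for it separately.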
Since there is no way to distinguish a successfully completed, close-
delimited message from a partially-received message interrupted by
network failure, implementations SHOULD use encoding or length-
delimited messages whenever possible. The close-delimiting feature
exists primarily for backwards compatibility with HTTP/1.0.
A server MAY reject a request that contains a message-body but not a
Content-Length by responding with 411 (Length Required).
Unless a transfer-coding other than "chunked" has been applied, a
client that sends a request containing a message-body SHOULD use a
valid Content-Length header field if the message-body length is known
in advance, rather than the "chunked" encoding, since some existing
services respond to "chunked" with a 411 (Length Required) status
code even though they understand the chunked encoding. This is
typically because such services are implemented via a gateway that
requires a content-length in advance of being called and the server
is unable or unwilling to buffer the entire request before
processing.
A client that sends a request containing a message-body MUST include
a valid Content-Length header field if it does not know the server
will handle HTTP/1.1 (or later) requests; such knowledge can be in
the form of specific user configuration or by remembering the version
of a prior received response.
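
The two client-side requirements above might be combined as follows. This
is a sketch only: the function name and parameters are hypothetical, and
it assumes no transfer-coding other than "chunked" is being applied. A
body_length of None stands for a length that is not known in advance.

```python
def choose_request_framing(body_length, server_known_http11):
    """Pick "content-length" or "chunked" for a request with a body."""
    if not server_known_http11:
        # Content-Length is mandatory here, so an unknown length means the
        # client has to buffer the body first to learn its size.
        if body_length is None:
            raise ValueError("buffer the body to determine its length")
        return "content-length"
    if body_length is not None:
        # Preferred when the length is known: some services answer
        # "chunked" requests with 411 (Length Required).
        return "content-length"
    return "chunked"
```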
Request messages that are prematurely terminated, possibly due to a
cancelled connection or a server-imposed time-out exception, MUST
result in closure of the connection; sending an HTTP/1.1 error
response prior to closing the connection is OPTIONAL. Response
messages that are prematurely terminated, usually by closure of the
connection prior to receiving the expected number of octets or by
failure to decode a transfer-encoded message-body, MUST be recorded
as incomplete. A user agent MUST NOT render an incomplete response
message-body as if it were complete (i.e., some indication must be
given to the user that an error occurred). Cache requirements for
incomplete responses are defined in Section 2.1.1 of [Part6].
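
Detecting a prematurely terminated, length-delimited body can be sketched
as below. The names IncompleteBody and read_fixed_body are hypothetical,
and `stream` is assumed to be any binary file-like object whose read()
returns an empty bytes object once the peer has closed the connection.

```python
import io


class IncompleteBody(Exception):
    """Raised when fewer octets arrive than Content-Length declared."""


def read_fixed_body(stream, length):
    """Read exactly `length` octets from `stream` or raise IncompleteBody."""
    chunks = []
    remaining = length
    while remaining > 0:
        data = stream.read(remaining)
        if not data:
            # The stream ended early; the message must be recorded as
            # incomplete and the connection treated as no longer usable.
            received = length - remaining
            raise IncompleteBody(
                f"expected {length} octets, received {received}")
        chunks.append(data)
        remaining -= len(data)
    return b"".join(chunks)


# Octets beyond the declared length are left unread on the stream, where
# they can be discarded or parsed as the next message in a pipeline.
body = read_fixed_body(io.BytesIO(b"hello world"), 5)
```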
A server MUST read the entire request message-body or close the
connection after sending its response, since otherwise the remaining
data on a persistent connection would be misinterpreted as the next
request. Likewise, a client MUST read the entire response message-
body if it intends to reuse the same connection for a subsequent
request. Pipelining multiple requests on a connection is described
in Section 7.1.2.2.
==========
....Roy
Received on Wednesday, 28 July 2010 03:53:13 UTC