- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Tue, 27 Jul 2010 20:52:39 -0700
- To: HTTP Working Group <ietf-http-wg@w3.org>
As part of the changes for draft 11, I merged the misnamed section on message length into the message body section and then rewrote the steps for determining the message body length to remove the ambiguities noted previously in tickets #28, #90, and #95. The primary additions are requirements on how to handle messages with multiple or invalid content-length values, or both transfer-encoding and content-length. Also, multipart/byteranges has been removed as a length-determinator.

It is probably easier to read in plain text than as a diff, so here it is for your review:

==========

3.3.  Message Body

The message-body (if any) of an HTTP message is used to carry the payload body associated with the request or response.

    message-body = *OCTET

The message-body differs from the payload body only when a transfer-coding has been applied, as indicated by the Transfer-Encoding header field (Section 9.7). When one or more transfer-codings are applied to a payload in order to form the message-body, the Transfer-Encoding header field MUST contain the list of transfer-codings applied. Transfer-Encoding is a property of the message, not of the payload, and thus MAY be added or removed by any implementation along the request/response chain under the constraints found in Section 6.2.

The rules for when a message-body is allowed in a message differ for requests and responses.

The presence of a message-body in a request is signaled by the inclusion of a Content-Length or Transfer-Encoding header field in the request's header fields, even if the request method does not define any use for a message-body. This allows the request message framing algorithm to be independent of method semantics.

For response messages, whether or not a message-body is included with a message is dependent on both the request method and the response status code (Section 5.1.1).
Responses to the HEAD request method never include a message-body because the associated response header fields (e.g., Transfer-Encoding, Content-Length, etc.) only indicate what their values would have been if the method had been GET. All 1xx (Informational), 204 (No Content), and 304 (Not Modified) responses MUST NOT include a message-body. All other responses do include a message-body, although the body MAY be of zero length.

The length of the message-body is determined by one of the following (in order of precedence):

1.  Any response to a HEAD request and any response with a status code of 100-199, 204, or 304 is always terminated by the first empty line after the header fields, regardless of the header fields present in the message, and thus cannot contain a message-body.

2.  If a Transfer-Encoding header field (Section 9.7) is present and the "chunked" transfer-coding (Section 6.2) is the final encoding, the message-body length is determined by reading and decoding the chunked data until the transfer-coding indicates the data is complete.

    If a Transfer-Encoding header field is present in a response and the "chunked" transfer-coding is not the final encoding, the message-body length is determined by reading the connection until it is closed by the server.

    If a Transfer-Encoding header field is present in a request and the "chunked" transfer-coding is not the final encoding, the message-body length cannot be determined reliably; the server MUST respond with the 400 (Bad Request) status code and then close the connection.

    If a message is received with both a Transfer-Encoding header field and a Content-Length header field, the Transfer-Encoding overrides the Content-Length. Such a message might indicate an attempt to perform request or response smuggling (bypass of security-related checks on message routing or content) and thus should be handled as an error.
    The provided Content-Length MUST be removed, prior to forwarding the message downstream, or replaced with the real message-body length after the transfer-coding is decoded.

3.  If a message is received without Transfer-Encoding and with either multiple Content-Length header fields or a single Content-Length header field with an invalid value, then the message framing is invalid and MUST be treated as an error to prevent request or response smuggling. If this is a request message, the server MUST respond with a 400 (Bad Request) status code and then close the connection. If this is a response message received by a proxy or gateway, the proxy or gateway MUST discard the received response, send a 502 (Bad Gateway) status code as its downstream response, and then close the connection. If this is a response message received by a user-agent, the message-body length is determined by reading the connection until it is closed; an error SHOULD be indicated to the user.

4.  If a valid Content-Length header field (Section 9.2) is present without Transfer-Encoding, its decimal value defines the message-body length in octets. If the actual number of octets sent in the message is less than the indicated Content-Length, the recipient MUST consider the message to be incomplete and treat the connection as no longer usable. If the actual number of octets sent in the message is more than the indicated Content-Length, the recipient MUST only process the message-body up to the field value's number of octets; the remainder of the message MUST either be discarded or treated as the next message in a pipeline. For the sake of robustness, a user-agent MAY attempt to detect and correct such an error in message framing if it is parsing the response to the last request on a connection and the connection has been closed by the server.

5.  If this is a request message and none of the above are true, then the message-body length is zero (no message-body is present).

6.  Otherwise, this is a response message without a declared message-body length, so the message-body length is determined by the number of octets received prior to the server closing the connection.

Since there is no way to distinguish a successfully completed, close-delimited message from a partially-received message interrupted by network failure, implementations SHOULD use encoding or length-delimited messages whenever possible. The close-delimiting feature exists primarily for backwards compatibility with HTTP/1.0.

A server MAY reject a request that contains a message-body but not a Content-Length by responding with 411 (Length Required).

Unless a transfer-coding other than "chunked" has been applied, a client that sends a request containing a message-body SHOULD use a valid Content-Length header field if the message-body length is known in advance, rather than the "chunked" encoding, since some existing services respond to "chunked" with a 411 (Length Required) status code even though they understand the chunked encoding. This is typically because such services are implemented via a gateway that requires a content-length in advance of being called and the server is unable or unwilling to buffer the entire request before processing.

A client that sends a request containing a message-body MUST include a valid Content-Length header field if it does not know the server will handle HTTP/1.1 (or later) requests; such knowledge can be in the form of specific user configuration or by remembering the version of a prior received response.

Request messages that are prematurely terminated, possibly due to a cancelled connection or a server-imposed time-out exception, MUST result in closure of the connection; sending an HTTP/1.1 error response prior to closing the connection is OPTIONAL.
Response messages that are prematurely terminated, usually by closure of the connection prior to receiving the expected number of octets or by failure to decode a transfer-encoded message-body, MUST be recorded as incomplete. A user agent MUST NOT render an incomplete response message-body as if it were complete (i.e., some indication must be given to the user that an error occurred). Cache requirements for incomplete responses are defined in Section 2.1.1 of [Part6].

A server MUST read the entire request message-body or close the connection after sending its response, since otherwise the remaining data on a persistent connection would be misinterpreted as the next request. Likewise, a client MUST read the entire response message-body if it intends to reuse the same connection for a subsequent request. Pipelining multiple requests on a connection is described in Section 7.1.2.2.

==========

....Roy
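For anyone who wants to check their reading of the six length-determination steps in the draft text, here is a minimal Python sketch of the decision order. It is an illustration only, not part of the draft: the names body_framing, Framing, FramingError, the "suspicious" flag, and the is_intermediary parameter are all hypothetical, headers are assumed to arrive as a list of (name, value) pairs, and the sketch only classifies the framing (it does not read or decode any octets).

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple


class FramingError(Exception):
    """Invalid framing; `status` is the error code the draft asks the recipient to send."""

    def __init__(self, status: int, reason: str) -> None:
        super().__init__(reason)
        self.status = status


@dataclass
class Framing:
    mode: str                     # "none", "chunked", "length", or "until-close"
    length: Optional[int] = None  # set only when mode == "length"
    suspicious: bool = False      # hypothetical flag: possible smuggling attempt


def field_values(headers: List[Tuple[str, str]], name: str) -> List[str]:
    """Return the stripped values of every header field called `name` (lowercase)."""
    return [value.strip() for key, value in headers if key.lower() == name]


def body_framing(headers, *, is_request, request_method="GET", status=None,
                 is_intermediary=False):
    te = field_values(headers, "transfer-encoding")
    cl = field_values(headers, "content-length")

    # Step 1: HEAD responses and 1xx/204/304 responses never carry a body.
    if not is_request and (
        request_method.upper() == "HEAD"
        or status in (204, 304)
        or (status is not None and 100 <= status <= 199)
    ):
        return Framing("none")

    # Step 2: Transfer-Encoding overrides Content-Length.
    if te:
        codings = [c.split(";")[0].strip().lower()
                   for c in ",".join(te).split(",") if c.strip()]
        if codings and codings[-1] == "chunked":
            return Framing("chunked", suspicious=bool(cl))
        if is_request:
            raise FramingError(400, "chunked is not the final transfer-coding")
        return Framing("until-close", suspicious=bool(cl))

    # Step 3: multiple or invalid Content-Length values make the framing invalid.
    if len(cl) > 1 or (cl and not re.fullmatch(r"[0-9]+", cl[0])):
        if is_request:
            raise FramingError(400, "invalid Content-Length")
        if is_intermediary:
            raise FramingError(502, "invalid Content-Length from upstream")
        return Framing("until-close", suspicious=True)  # user agent: warn the user

    # Step 4: a single valid Content-Length gives the length in octets.
    if cl:
        return Framing("length", int(cl[0]))

    # Step 5: a request with no framing header fields has no body.
    if is_request:
        return Framing("none")

    # Step 6: a response with no declared length is delimited by connection close.
    return Framing("until-close")
```

For example, body_framing([("Content-Length", "42")], is_request=True) yields Framing(mode="length", length=42), while a request carrying two Content-Length fields raises FramingError with status 400. The sketch leaves out what the draft says about rewriting Content-Length before forwarding and about detecting short or long bodies, since those depend on actually reading the connection.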
Received on Wednesday, 28 July 2010 03:53:13 UTC