On Wed, Jan 17, 2007 at 11:47:33AM +0100, Henrik Nordstrom wrote:
<snip>
> Does the implied LWS rule apply to header names, even if it's not
> allowed in MIME? Allowing LWS around the header name does not make
> sense, but it is not explicitly forbidden.
LWS is not allowed.
<snip>
> Content-Length : 100
BNF makes it clear.
token = 1*<any CHAR except CTLs or separators>
message-header = field-name ":" [ field-value ]
field-name = token
separators = [...] | SP | HT
What we have there is malformed; separators (SP) are not allowed in tokens.
<snip>
> Content-Length
> :100
What we have here is malformed; CTLs (CR/LF) are not allowed in tokens.
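To make the two cases above concrete, here is a minimal sketch of the token check in Python. The function names (is_token, split_header_line) are made up for illustration, and the separator set is written out in full from RFC 2616 section 2.2; it assumes the caller has already dealt with line folding, so treat it as a sketch rather than a complete header parser.

```python
# RFC 2616 section 2.2 separators, written out in full (includes SP and HT).
SEPARATORS = '()<>@,;:\\"/[]?={} \t'


def is_token(s):
    """True if s is 1 or more CHARs, none of them CTLs or separators."""
    # CHAR is octets 0-127; CTLs are 0-31 and 127; SP (32) is a separator.
    return len(s) > 0 and all(
        32 < ord(c) < 127 and c not in SEPARATORS for c in s
    )


def split_header_line(line):
    """Split 'field-name: field-value'; raise ValueError on a bad field-name."""
    name, colon, value = line.partition(":")
    if not colon or not is_token(name):
        raise ValueError("malformed header line: %r" % line)
    # LWS around the field-value is allowed, so it is trimmed here.
    return name, value.strip(" \t")
```

With this, "Content-Length: 100" splits cleanly, while "Content-Length : 100" and a field-name followed by CR/LF before the colon both raise ValueError, because SP and CTLs cannot be part of a token.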
<snip>
> And how SHOULD a recipient react if there is multiple Content-Length
> headers?
The same way it MUST react whenever it gets a repeated header that isn't
allowed to repeat -- reject the message as blatantly malformed:
Multiple message-header fields with the same field-name MAY be
present in a message if and only if the entire field-value for that
header field is defined as a comma-separated list [i.e., #(values)].
It MUST be possible to combine the multiple header fields into one
"field-name: field-value" pair, without changing the semantics of the
message, by appending each subsequent field-value to the first, each
separated by a comma.
(RFC 2616, Section 4.2, page 32)
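A rough sketch of how a recipient might apply that rule, again in Python. The LIST_VALUED set and the collect_headers name are hypothetical; the three headers listed are just examples of fields defined as #(values), not a complete table, and rejecting with ValueError is the strict reading argued for above, not something the quoted paragraph spells out.

```python
# Hypothetical, incomplete allow-list of headers defined as #(values).
LIST_VALUED = {"accept", "cache-control", "via"}


def collect_headers(pairs):
    """Combine repeated list-valued headers; reject any other repeat."""
    out = {}
    for name, value in pairs:
        key = name.lower()  # field names are case-insensitive
        if key not in out:
            out[key] = value
        elif key in LIST_VALUED:
            # Append each subsequent field-value, separated by a comma.
            out[key] = out[key] + ", " + value
        else:
            raise ValueError("duplicate header not allowed: %s" % name)
    return out
```

Two Via headers get folded into one comma-separated value; a second Content-Length makes the whole message fail.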
> These aspects is critical for defining the message delimiting within the
> protocol, and still the specs is quite silent on the details which calls
> for the kind of interoperability problems shown in the report.
If I may insert my two cents here: the spec is so fuzzy, poor, and inconsistent
in OTHER places that it hurts its overall quality and makes it harder to
implement. In my experience, it got _very_ tempting to just start ignoring the
spec and do a pidgin HTTP/1.1 implementation. Here, I think things are actually
pretty clear and obvious, except for the CR-CR-LF possibility (parsing LWS has
been a very obnoxious thorn in my side) -- but the implementer still has to pay
attention to catch all the details.
--
Travis