Re: Multiple Host Ambiguity

> On 5 May 2016, at 21:32, Jian Jiang <ottojiang@gmail.com> wrote:
> 
> Dear all,
> 
> We recently found that HTTP implementations vary largely in handling host in a crafted request with multiple Host headers, and/or whitespace-preceded/-succeeded Host headers, and/or absolute request-URL. We have found some vulnerabilities due to inconsistencies between different implementations in a HTTP-processing chain.
> 
> We would like to discuss this problem here. I have some initial thoughts/questions:
> 
> 1). Whitespace is a major source of multiple Host ambiguity. My understanding is that only until RFC 7230, whitespace around field-name is explicitly forbidden. But the message is somewhat confusing. For whitespace between field-name and colon, the rule in RFC 7230 is clear: rejection with 400. But for whitespace before field-name, the main body of RFC 7230 (section 3) only says if whitespace appears before the first header field, either the request should be rejected or the header should be ignored. The clear rule is located at Appendix A.2: "invalid whitespace around field-names is required to be rejected ...". An uncareful read of the document would have missed this message. We have seen that implementations in general are more tolerant with whitespace-preceded Host header than whitespace-succeeded Host header.

Does RFC 7230 say that whitespace before the first header field should be rejected or ignored? All I can find in Section 3 of RFC 7230 is the text about obs-fold, which cannot apply to the first header field because obs-fold is grammatically defined to trail a field-value rather than to lead a field-name.

I’d say, then, that a space preceding the Host header should lead to one of three behaviours: rejecting the request if that’s the first header field (it’s ill-formed), rejecting the request if the implementation rejects obs-fold, or folding the line into the preceding header field (leading the implementation to conclude that no Host header is present).
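
To make the behaviours above concrete, here is a rough Python sketch. It is not taken from any real server: the names parse_header_lines, BadRequest and reject_obs_fold are made up for illustration, and it assumes the header block has already been split into lines with the CRLFs removed.

```python
class BadRequest(Exception):
    """Hypothetical signal that the request should get a 400 response."""


def parse_header_lines(lines, reject_obs_fold=True):
    """Parse raw header lines (CRLF already stripped) into (name, value) pairs."""
    fields = []
    for line in lines:
        if line[:1] in (" ", "\t"):
            # A line that starts with whitespace can only be an obs-fold
            # continuation of the previous field.
            if not fields:
                raise BadRequest("whitespace before the first header field")
            if reject_obs_fold:
                raise BadRequest("obs-fold is not accepted")
            # Fold the line into the previous field-value.
            name, value = fields[-1]
            fields[-1] = (name, (value + " " + line.strip()).strip())
            continue
        name, sep, value = line.partition(":")
        if not sep or not name or name != name.rstrip(" \t"):
            # Missing colon, empty name, or whitespace between name and colon.
            raise BadRequest("malformed header field")
        fields.append((name, value.strip(" \t")))
    return fields
```

With reject_obs_fold=False, the lines "X-A: 1" followed by " Host: evil.example" come out as a single X-A field whose value contains the text "Host: evil.example", so no Host field is seen at all. In none of the three branches is the whitespace-preceded line treated as a Host header.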

Note that Appendix A.2 is non-normative: it simply notes the changes that were made. In this case, that appendix note applies to the requirement to 400 invalid whitespace between a field name and the colon.

Did you confirm that implementations treated the whitespace-preceded Host header as a Host header, rather than folding it into the prior header? If they did that, I’d say those implementations were being over-generous with their parsing.

> 2). Host in absolute request-URL is another major source of ambiguity. Both RFC 2616 and RFC 7230 state that host in absolute request-URL should "override" Host header. We see some implementations follow, but some don't. RFC 7230 additionally states (section 5.4) client must send a Host header that is identical with host in request-URL, which (indirectly) requires server to reject a request that has inconsistent hosts in its request-URL and header field. But only a few implement this rule. None of RFC 2616 and RFC 7230 have explicit description about scheme in request-URL. Some implementations accept any scheme like "unknown://“.

This seems unambiguous to me: if the Host header and authority portion of the request URL conflict, that’s a client error that needs a 4XX response.
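
A minimal Python sketch of that check follows. The function name host_conflict_status is hypothetical, 400 is just one choice of 4XX, and the comparison is deliberately simplified: it is case-insensitive but does not normalise default ports or handle userinfo, which a real implementation would have to consider.

```python
from urllib.parse import urlsplit


def host_conflict_status(request_target, host_header):
    """Return 400 if an absolute-form target disagrees with Host, else 200."""
    parts = urlsplit(request_target)
    if not parts.scheme or not parts.netloc:
        # Origin-form target such as "/index.html": nothing to compare.
        return 200
    if parts.netloc.lower() != host_header.strip().lower():
        # Authority in the request-target and Host header conflict.
        return 400
    return 200
```

For example, a target of "http://example.com/" with "Host: other.example" is answered with 400 rather than having either value silently win.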

> 3). Multiple Host header fields is explicitly forbidden in RFC 7230 (not in RFC 2616). But again only a few follow this requirement. I tried to look at the archive messages to understand why this is added in RFC 7230, but I couldn't find any discussion. Does anyone know the context around this rule ? (I found some discussions around whitespace in header field, which is very helpful)

Multiple Host header fields were implicitly forbidden in RFC 2616 by the definition of the Host header field ('host [ ":" port ]') combined with the text in RFC 2616 Section 4.2 that says "Multiple message-header fields with the same field-name MAY be present in a message if and only if the entire field-value for that header field is defined as a comma-separated list." That iff criterion isn't met by Host, so multiple Host headers were forbidden implicitly by that requirement.

The change in RFC 7230 is therefore editorial only and no discussion would have been required: it made explicit a requirement that was already implicit in the rest of the text.

It is not uncommon for implementations to be *somewhat* lenient on this rule in some cases: most commonly, when the duplicate Host header fields are identical. That said, RFC 7230 is now much clearer about the requirement to 4XX.
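
The difference between the strict rule and the lenient behaviour can be sketched in Python as below. The function host_count_status and its allow_identical_duplicates flag are hypothetical names for illustration; fields is assumed to be a list of already-parsed (name, value) pairs from an HTTP/1.1 request.

```python
def host_count_status(fields, allow_identical_duplicates=False):
    """Return 200 if the Host fields are acceptable, else 400."""
    hosts = [value for name, value in fields if name.lower() == "host"]
    if len(hosts) == 1:
        return 200
    if not hosts:
        # RFC 7230 Section 5.4 also requires 400 for a missing Host.
        return 400
    if allow_identical_duplicates and len({h.lower() for h in hosts}) == 1:
        # Lenient mode: tolerate repeated Host fields with the same value.
        return 200
    return 400
```

The default is the strict reading; the lenient flag only exists to show where the tolerant implementations diverge from it.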

> 4). My general feeling is that RFC 7230 is clear in how host should be parsed from a request. But these rules are located in different places, quite easy to miss when doing implementation.

I don’t know that I agree with this concern.

There are no special rules for parsing the Host header field from a request header block: it is parsed exactly like all other headers. Anyone writing an HTTP/1.1 implementation has to know the rules for parsing header fields, and they do not need to special-case Host.

The only extra wrinkle is around the handling of hosts in absolute-URIs in the request line, and the rule there is exactly where I’d expect to see it: in the text about absolute-URIs in the request line. There doesn’t seem to be much ambiguity here.

Cory

Received on Friday, 6 May 2016 10:27:29 UTC