Multiple Host Ambiguity

Dear all,

We recently found that HTTP implementations vary widely in how they
determine the host of a crafted request that contains multiple Host
headers, Host headers preceded or followed by whitespace, and/or an
absolute request-URL. We have found some vulnerabilities caused by
inconsistencies between different implementations in an HTTP-processing
chain.

We would like to discuss this problem here. I have some initial
thoughts/questions:

1). Whitespace is a major source of multiple Host ambiguity. My
understanding is that whitespace around field-name was not explicitly
forbidden until RFC 7230, and even there the message is somewhat
confusing. For whitespace between field-name and colon, the rule in RFC
7230 (section 3.2.4) is clear: rejection with 400. But for whitespace
before field-name, the main body of RFC 7230 (section 3) only says that if
whitespace appears between the start-line and the first header field, the
recipient must either reject the message or ignore the whitespace-preceded
line(s). The clear rule is located in Appendix A.2: "invalid whitespace
around field-names is required to be rejected ...". A careless reading of
the document could easily miss this. We have seen that implementations are
in general more tolerant of a whitespace-preceded Host header than of a
whitespace-succeeded one.
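
To make the whitespace ambiguity concrete, here is a minimal Python sketch
of three hypothetical parsing strategies applied to the same two header
lines. The function names and the example hosts are made up for
illustration; none of this is taken from a real implementation.

```python
def parse_strip(lines):
    """Hypothetical lenient parser: trims whitespace around field-names."""
    headers = []
    for line in lines:
        name, _, value = line.partition(":")
        headers.append((name.strip().lower(), value.strip()))
    return headers


def parse_fold(lines):
    """Hypothetical parser: a whitespace-led line continues the previous header."""
    headers = []
    for line in lines:
        if line[:1] in (" ", "\t") and headers:
            name, value = headers[-1]
            headers[-1] = (name, value + " " + line.strip())
        else:
            name, _, value = line.partition(":")
            headers.append((name.lower(), value.strip()))
    return headers


def parse_strict(lines):
    """Hypothetical strict parser: rejects any whitespace around field-names."""
    headers = []
    for line in lines:
        name, sep, value = line.partition(":")
        if not sep or not name or name != name.strip(" \t"):
            raise ValueError("400 Bad Request: invalid whitespace around field-name")
        headers.append((name.lower(), value.strip()))
    return headers


def host_values(headers):
    """Collect every value whose field-name is exactly 'host'."""
    return [value for name, value in headers if name == "host"]


# Second Host header is preceded by a single space.
request_lines = ["Host: a.example", " Host: b.example"]

print(host_values(parse_strip(request_lines)))  # ['a.example', 'b.example']
print(host_values(parse_fold(request_lines)))   # ['a.example Host: b.example']
```

The strict variant raises ValueError on the same input. If a front end
behaves like one of these and a back end like another, the two can disagree
about which host the request is for.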

2). The host in an absolute request-URL is another major source of
ambiguity. Both RFC 2616 and RFC 7230 state that the host in an absolute
request-URL should "override" the Host header. We see that some
implementations follow this, but some don't. RFC 7230 additionally states
(section 5.4) that a client must send a Host header that is identical to
the host in the request-URL, which (indirectly) requires a server to reject
a request whose request-URL and Host header field carry inconsistent hosts.
But only a few implement this rule. Neither RFC 2616 nor RFC 7230
explicitly describes how the scheme in the request-URL should be handled.
Some implementations accept any scheme, such as "unknown://".
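
As a rough illustration of the strict reading above, the following Python
sketch resolves the host from a request-URL and a single Host header value.
The function name resolve_host is hypothetical, and both the scheme
allowlist and the reject-on-mismatch policy are assumptions of this sketch
rather than explicit RFC text. The comparison is a plain case-insensitive
string match, so default ports and userinfo are not normalized.

```python
from urllib.parse import urlsplit

# Assumption of this sketch: only these schemes are accepted in absolute-form.
ALLOWED_SCHEMES = ("http", "https")


def resolve_host(request_target, host_header):
    """Return the host a strict server would use, or raise ValueError (400)."""
    header_host = host_header.strip().lower()
    parts = urlsplit(request_target)
    if parts.scheme and parts.netloc:
        # Absolute-form request-URL, e.g. "http://a.example/path".
        if parts.scheme.lower() not in ALLOWED_SCHEMES:
            raise ValueError("400 Bad Request: unsupported scheme in request-URL")
        url_host = parts.netloc.lower()
        if url_host != header_host:
            raise ValueError("400 Bad Request: request-URL host and Host header differ")
        return url_host
    # Origin-form request-URL, e.g. "/path": only the Host header is available.
    return header_host
```

With this policy, resolve_host("http://a.example/", "b.example") and
resolve_host("unknown://a.example/", "a.example") are both rejected, whereas
an implementation that simply lets one source win would silently pick
either a.example or b.example.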

3). Multiple Host header fields are explicitly forbidden in RFC 7230 (but
not in RFC 2616). But again, only a few implementations follow this
requirement. I tried to look through the list archives to understand why
this was added in RFC 7230, but I couldn't find any discussion. Does anyone
know the context around this rule? (I did find some discussions around
whitespace in header fields, which were very helpful.)
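
For reference, here is a minimal Python sketch contrasting the RFC 7230
behaviour (reject unless there is exactly one Host header) with the two
lenient behaviours that create the ambiguity. All function names are
hypothetical, and headers are assumed to be already parsed into
(name, value) pairs.

```python
def all_hosts(headers):
    """Return every Host value, matching the field-name case-insensitively."""
    return [value.strip() for name, value in headers if name.lower() == "host"]


def host_first_wins(headers):
    """Hypothetical lenient behaviour: use the first Host header."""
    return all_hosts(headers)[0]


def host_last_wins(headers):
    """Hypothetical lenient behaviour: use the last Host header."""
    return all_hosts(headers)[-1]


def host_strict(headers):
    """RFC 7230 section 5.4 behaviour: 400 unless exactly one Host header."""
    values = all_hosts(headers)
    if len(values) != 1:
        raise ValueError(
            "400 Bad Request: expected exactly one Host header, got %d" % len(values)
        )
    return values[0]


crafted = [("Host", "a.example"), ("host", "b.example")]
print(host_first_wins(crafted))  # a.example
print(host_last_wins(crafted))   # b.example
```

A chain in which one hop takes the first value and the next hop takes the
last one will route and serve the same request for different hosts, while
host_strict(crafted) raises and ends the request at the first hop.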

4). My general feeling is that RFC 7230 is clear about how the host should
be parsed from a request. But the rules are located in different places,
so they are quite easy to miss when implementing.

Any comments/thoughts are welcome.


Best regards,
Jian Jiang

Received on Friday, 6 May 2016 02:11:59 UTC