- From: Jian Jiang <ottojiang@gmail.com>
- Date: Thu, 5 May 2016 13:32:14 -0700
- To: ietf-http-wg@w3.org
- Cc: Chen Jianjun <whucjj@gmail.com>
- Message-ID: <CAFYAhb-JERVc-faZJ6N_hnor6a5EJnzdmeYcxiQKTnC+OfYWpg@mail.gmail.com>
Dear all,

We recently found that HTTP implementations vary widely in how they determine the host of a crafted request that contains multiple Host headers, Host headers preceded or followed by whitespace, and/or an absolute request-URI. We have found some vulnerabilities caused by inconsistencies between different implementations in an HTTP-processing chain. We would like to discuss this problem here. I have some initial thoughts and questions:

1). Whitespace is a major source of multiple-Host ambiguity. My understanding is that whitespace around field-name was not explicitly forbidden until RFC 7230, and even there the message is somewhat confusing. For whitespace between field-name and colon, the rule in RFC 7230 is clear: rejection with 400. But for whitespace before field-name, the main body of RFC 7230 (section 3) only says that if whitespace appears before the first header field, either the request should be rejected or the header should be ignored. The clear rule is located in Appendix A.2: "invalid whitespace around field-names is required to be rejected ...". A careless reading of the document could easily miss it. We have seen that implementations are in general more tolerant of a Host header preceded by whitespace than of one whose field-name is followed by whitespace.

2). The host in an absolute request-URI is another major source of ambiguity. Both RFC 2616 and RFC 7230 state that the host in an absolute request-URI should "override" the Host header. Some implementations follow this, but some don't. RFC 7230 additionally states (section 5.4) that a client must send a Host header identical to the host in the request-URI, which (indirectly) requires a server to reject a request whose request-URI and Host header field carry inconsistent hosts. Only a few implementations enforce this rule. Neither RFC 2616 nor RFC 7230 explicitly describes how to treat the scheme in the request-URI, and some implementations accept any scheme, such as "unknown://".

3). Multiple Host header fields are explicitly forbidden in RFC 7230 (not in RFC 2616), but again only a few implementations follow this requirement. I tried looking through the archived messages to understand why this was added in RFC 7230, but I couldn't find any discussion. Does anyone know the context around this rule? (I did find some discussions about whitespace in header fields, which were very helpful.)

4). My general feeling is that RFC 7230 is clear about how the host should be parsed from a request. But the rules are located in different places and are quite easy to miss when implementing.

Any comments/thoughts welcome.

Best regards,
Jian Jiang
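
For illustration, the rules discussed in points 1) to 3) can be gathered into one strict check. The following is only a minimal sketch under stated assumptions, not a reference implementation: the names `resolve_host`, `BadRequest` and `ALLOWED_SCHEMES` are hypothetical, the request is assumed to be HTTP/1.1 (so exactly one Host header is required), any header line starting with whitespace is rejected (which also rejects obs-fold), and CONNECT-style authority-form targets are not handled. Rejecting unknown schemes and rejecting a mismatch between the request-URI host and the Host header are policy choices of this sketch, stricter than "override".

```python
from urllib.parse import urlsplit


class BadRequest(Exception):
    """Signals that the request should be answered with 400 (Bad Request)."""


# Assumption: restricting schemes is a policy choice of this sketch.
ALLOWED_SCHEMES = {"http", "https"}


def resolve_host(raw_head: bytes) -> str:
    """Return the lowercased host for a request head, or raise BadRequest."""
    lines = raw_head.decode("latin-1").split("\r\n")
    request_line, header_lines = lines[0], lines[1:]

    parts = request_line.split(" ")
    if len(parts) != 3:
        raise BadRequest("malformed request line")
    target = parts[1]

    hosts = []
    for line in header_lines:
        if line == "":
            break  # empty line ends the header section
        if line[0] in " \t":
            raise BadRequest("whitespace before field-name")
        name, sep, value = line.partition(":")
        if not sep:
            raise BadRequest("header line without colon")
        if name != name.rstrip(" \t"):
            raise BadRequest("whitespace between field-name and colon")
        if name.lower() == "host":
            hosts.append(value.strip(" \t"))

    # Zero or multiple Host headers are both rejected.
    if len(hosts) != 1:
        raise BadRequest("expected exactly one Host header, got %d" % len(hosts))
    header_host = hosts[0].lower()

    # Origin-form ("/path") or asterisk-form ("*"): the Host header decides.
    if target.startswith("/") or target == "*":
        return header_host

    # Otherwise treat the target as an absolute request-URI.
    url = urlsplit(target)
    if url.scheme.lower() not in ALLOWED_SCHEMES or not url.netloc:
        raise BadRequest("unsupported or malformed absolute request-URI")
    if url.netloc.lower() != header_host:
        raise BadRequest("request-URI host differs from Host header")
    return url.netloc.lower()
```

If every hop in a processing chain applied the same check, a request such as `GET http://a.com/ HTTP/1.1` carrying `Host: b.com` would be rejected at the first hop instead of being interpreted differently downstream.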
Received on Friday, 6 May 2016 02:11:59 UTC