- From: Willy Tarreau <w@1wt.eu>
- Date: Fri, 16 Jul 2010 06:55:49 +0200
- To: "Roy T. Fielding" <fielding@gbiv.com>
- Cc: Adrien de Croy <adrien@qbik.com>, HTTP Working Group <ietf-http-wg@w3.org>
On Thu, Jul 15, 2010 at 04:28:04PM -0700, Roy T. Fielding wrote:
> On Jul 15, 2010, at 12:33 PM, Willy Tarreau wrote:
> > It's not that simple. I have an example of a customer who uses front
> > Apache reverse-proxies to perform security controls and to only
> > let "clean" requests pass through. Those Apaches also add some
> > headers to the requests being forwarded to the servers for logging
> > purposes, and it is the only way to reach the servers. Due to that
> > implementation mistake, it is possible (and I've tested) for the
> > client to make the reverse proxy remove those headers it had just
> > added, so that the end point finally does not get the information
> > it should have unconditionally got.
>
> Yes, but why is that a problem? First, the process adding headers
> should have already removed the Connection header received -- otherwise
> it isn't doing its job. Second, even without fixing that bug, the
> result is fail safe -- the proxy won't be able to forward what it
> generated.

It's not a problem from an HTTP point of view, the request is valid.
It's a problem because some mandatory information which is unconditionally
set by the proxy regardless of the client's request can still be removed
by the client (e.g. X-Forwarded-For, Host, X-Forwarded-Host, ...). When the
next hop server takes a decision based on this information, this can become
dangerous (here the main issue was not getting the client's IP in the
server logs).

We could for instance imagine an HTTP/1.0 compatible server which does
virtual hosting and which gives access to the base dir of the virtual
servers when no Host was specified. This is just an example, I'm not
saying this is what happens. The issue is mostly that the client can
control some aspects of the other side connection in an unexpected manner.

Someone else (Adrien?) suggested it might be problematic with caches.
I'm wondering what can happen if the client does that on Accept-Language
for instance.
Sometimes the cache will index the headers from the real request, but
since the server won't get them it may return a different version of the
document, which will then be cached in association with request headers
other than the ones that generated the response. Same for the Host field,
even though it's less likely that the request will be accepted.

> > Now, if we want to be fair, there are two points here which are
> > causing that issue :
> >
> > - apache's header removal does not happen in the appropriate
> > order.
>
> The order depends on when the module does its stuff, not on
> something inherent in Apache. It is the security-checking module's
> responsibility to do the removal earlier (or schedule its additions
> later) if that is desired.

I *think* that x-forwarded-* are managed by the mod_proxy module itself,
though I may be wrong since I don't know apache well enough. Thus, I'm
not sure we could fix this in the config by just moving LoadModule
directives around.

> > - apache is used as a reverse-proxy (and is often used that
> > way) but it follows a proxy behaviour instead of a gateway
> > behaviour. But I suspect that when they began, the difference
> > was not clear yet between the two.
>
> It depends on which module is used for that purpose, but yes
> the mod_proxy stuff makes for a poor gateway. The difference
> was well known at the time -- I should have vetoed the reverse
> proxy features when they were added (they belong in a separate
> module).

I agree.

> It wouldn't be confusing at all if it were not for all the
> extra requirements that gave types to headers (like hop-by-hop).
> They make it sound like there is some header-aware engine going
> through and checking the types.

Yes, indeed, that's how I parse it. My understanding is that as an
implementer, I should read the whole doc, and write down all header
names seen at least once, then eliminate from that list the ones listed
as hop-by-hop.
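
To make the ordering problem discussed above concrete, here is a minimal Python sketch. It is not Apache's code and does not claim to mirror mod_proxy internals; the function names, the list-of-tuples header representation and the example address are all assumptions made for illustration. It only shows that stripping the headers named in Connection after the intermediary has added its own lets the client erase them, while stripping first does not.

```python
# Hypothetical model of an intermediary's header processing (not Apache code).
# Headers are represented as a list of (name, value) tuples.

def strip_connection_named(headers):
    """Drop the Connection header and every header it names."""
    named = {"connection"}
    for name, value in headers:
        if name.lower() == "connection":
            # Connection carries a comma-separated list of header names.
            named.update(tok.strip().lower() for tok in value.split(",") if tok.strip())
    return [(n, v) for n, v in headers if n.lower() not in named]


def add_proxy_headers(headers, client_ip):
    """Append the information the intermediary sets unconditionally."""
    return headers + [("X-Forwarded-For", client_ip)]


def forward_add_then_strip(headers, client_ip):
    # Problematic order: the client's Connection header is still present
    # when the removal runs, so it can name the header that was just added.
    return strip_connection_named(add_proxy_headers(headers, client_ip))


def forward_strip_then_add(headers, client_ip):
    # Safer order: client-controlled removal happens before any addition.
    return add_proxy_headers(strip_connection_named(headers), client_ip)


if __name__ == "__main__":
    request = [("Host", "example.com"), ("Connection", "X-Forwarded-For")]
    print(forward_add_then_strip(request, "192.0.2.1"))
    print(forward_strip_then_add(request, "192.0.2.1"))
```

With the request above, the first ordering forwards no X-Forwarded-For at all, whereas the second forwards the one set by the intermediary. Note that a client naming Host in Connection would still lose its own Host header under either ordering, so a gateway that depends on Host has to set or validate it after the removal step.
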
> There is no such thing in a well
> written intermediary -- every decision should be based on a user
> config or the self-descriptive message.

In fact, in my opinion on a real proxy (I mean an outgoing proxy), it
should not be much of a problem if the user can control what goes out
(except maybe for what goes into the cache). I tried to imagine what
could happen if the user sent a GET+content-length+ and got it removed
to unveil a second request, etc... so that it could bypass some mandatory
filtering, but I don't see how it could use that to gain unwanted
capabilities.

On the reverse proxy side it's different because what we expect from
a gateway is to perform strong checks before feeding the servers with
safe requests. But here the issue is not caused by what the spec says
(since gateways are not concerned by the rule on the Connection header),
but rather by the use of a component for the wrong job.

I still just have a few doubts for the side effects on caching proxies,
but I'm not skilled in this area and don't know well what part of the
request may have an effect on what is really cached (eg: Accept-Language,
Byte-Range, ...)

Best regards,
Willy
Received on Friday, 16 July 2010 04:56:32 UTC