- From: Adam Barth <w3c@adambarth.com>
- Date: Sun, 25 Jan 2009 11:31:12 -0800
- To: Mark Nottingham <mnot@mnot.net>
- Cc: "Roy T. Fielding" <fielding@gbiv.com>, Larry Masinter <LMM@acm.org>, ietf-http-wg@w3.org, Lisa Dusseault <ldusseault@commerce.net>
On Sat, Jan 24, 2009 at 8:30 PM, Mark Nottingham <mnot@mnot.net> wrote:

> I'd like to dig into that. You believe that most of the suppression of
> the Referer header is done in proxies, due to the differences seen in
> HTTP and HTTPS.

That and the low suppression rates of document.referrer (see Figure 3 of
http://www.adambarth.com/papers/2008/barth-jackson-mitchell-b.pdf).

> However, there are also considerable differences between the block
> rates for same-domain vs. cross-domain requests; are you implying that
> these proxies are parsing the Referer and only blocking those that are
> cross-domain?

That's what appears to be going on.

> If so, this seems an odd rationale; a person or company blocking
> referers for purpose of privacy would presumably be doing so for all
> values, not just cross-domain referers.

Same-domain Referer headers raise less of a privacy concern because the
server already knows which URL the user requested previously.
Cross-domain Referer headers, by contrast, can inform an entirely
different entity of which page you were just viewing.

> Likewise, someone doing it to hide intranet URLs would be more likely
> to only hide those, rather than to stop cross-domain referers.

We were unable to measure this because we ran our experiment on the
Internet, not on an intranet.

> Additionally, discriminating requests as cross-domain is more expensive
> to implement in an intermediary, and these implementers are famously
> sensitive to performance issues.

This can be done with a simple regular expression that matches the Host
header against the Referer header (a sketch appears below).

> All of the products that I'm aware of would easily allow wholesale
> blocking of a header, but would require a relatively expensive (and
> thereby less likely) callout (e.g., with ICAP) to selectively block
> them based upon request state.

Do you have an alternate explanation for the data we observe?

> On the other hand, I do notice that Firefox has the ability to
> selectively configure how Referer headers are blocked, both in terms of
> same-site vs. cross-site and HTTP vs. HTTPS;
> http://kb.mozillazine.org/Network.http.sendRefererHeader

This preference blocks both the Referer header and the document.referrer
property (see the documentation at
<http://kb.mozillazine.org/Network.http.sendRefererHeader>). Our data
indicates that a vanishingly small number of users enable this
preference.

> http://kb.mozillazine.org/Network.http.sendSecureXSiteReferrer

This blocks cross-domain Referer headers when both domains are using
HTTPS. Our data indicates that virtually no one enables this preference.

> Couldn't that account for at least a portion of the discrepancies you
> saw?

No, for the reasons stated above.

> BTW, did you look for vanilla wafers
> <http://www.junkbusters.com/ijbfaq.html#wafers> to see how much of this
> stripping could be attributed to JunkBuster?

Unfortunately, we did not. If we'd known about them at the time, we
would have. As far as I can tell from the JunkBuster documentation at
<http://www.junkbusters.com/ijbman.html>, it blocks the Referer header
for both same-domain and cross-domain requests.

> Also, did you find any rationale for the difference between rates seen
> on network A vs. network B? It's a pretty wide range...

This is a bit of a puzzle. I suspect these networks are targeting
different demographics and that the truth lies somewhere in the middle.
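Here's a minimal sketch, in Python, of the sort of Host-vs-Referer
comparison I have in mind (the function name and example domains are
made up; a real proxy would express this in its own pattern language):

    import re

    def is_cross_domain(host_header, referer_header):
        """True when the Referer names a host other than the one in the
        Host header (ports and case are ignored)."""
        match = re.match(r'https?://([^/:]+)', referer_header)
        if match is None:
            return True  # unparseable Referer: err on the side of stripping
        return match.group(1).lower() != host_header.split(':')[0].lower()

    # Such a proxy would suppress the Referer only when this returns True:
    assert not is_cross_domain('example.com', 'http://example.com/page')
    assert is_cross_domain('example.com', 'http://other.example.net/page')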
> The numbers that I found especially interesting were for stripping of
> same-site XmlHttpRequest-generated Referer headers, which came in at
> (eyeballing Figure 3) about 0.6% on HTTP and 0.2% on HTTPS (discounting
> the Firefox 1.x bug, which isn't relevant to this discussion, since
> we're talking about updated browsers as a pre-condition). Aren't these
> numbers closer to what one would expect?

I'm not surprised by a 3% suppression rate given the feedback from Web
sites that try strict Referer validation. Note that the same-domain
suppression rate of the Referer header is much larger, around 6%, on
network B.

> In particular, they're much closer to the numbers for "custom" headers
> that you measure, which means we are looking at implementations that
> white-list as a significant factor (as well as statistical error, of
> course)...

The "suppression" of custom headers appears to be mostly due to oddball
user agents that don't fully implement the XMLHttpRequest object.
Looking at events that occur less than 0.2% of the time puts you WAY out
in the tail of user agents. Statistical error is not much of a factor,
given the enormous sample size. Sampling bias is a concern, however,
which is why we tried two sampling techniques.

> Lastly -- Figure 3 says that its unit is requests; wouldn't IP
> addresses be a more useful number here? Or, better yet, unique clients
> (tracked with a cookie)? Otherwise it seems that the results could be
> skewed by, for example, a single very active proxy.

We ran the numbers all three ways, and they're similar. Tracking unique
users with cookies is a bit unreliable because some user agents block
third-party cookies.

> Likewise, did you record the geographical distribution of clients? It
> would be nice to have assurances that this sample represents a global
> audience, and not just a selective (read: US) one.

Our ad campaigns targeted the US. We were unable to record IP addresses
due to ethical concerns. We did record a keyed hash of each IP address
to re-identify requests from the same IP address, but we deleted the key
after the experiment to prevent further re-identification (a sketch of
this scheme appears at the end of this message).

>> Unfortunately, these proxies prevent Web sites from relying on the
>> Referer header, and so the operators of these proxies never come
>> under pressure to stop suppressing the header.
>
> Certainly they do. If Cool New Cross-Site Web Apps are broken, and they
> explain to the user why it is broken, both ISPs and companies will come
> under pressure.

Unfortunately, folks developing Cool New Cross-Site Web Apps know that
the Referer header is often suppressed and therefore do not rely on it
in their designs. Thus, step 1 never occurs.

> IMO 99% of the driving factor for deployment here is going to be new
> features -- supporting cross-site XmlHttpRequest with authentication,
> etc.

Great. A number of browser vendors are interested in implementing the
header, giving their users yet more reasons to upgrade.

> Well, we'd be in the same situation as today; a current (non-Origin)
> browser would be able to make cross-site requests (using IMG,
> form.submit, etc.).

The Origin header is incrementally useful as a CSRF defense. Users with
supporting user agents will benefit. Users without supporting user
agents will be no worse off than they are today. This differs from the
situation today, in which sites must engineer complex CSRF defenses to
help any of their users. The Origin header lets sites protect some of
their users with minimal effort (a second sketch at the end of this
message illustrates such a check).

Adam
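For the curious, the keyed-hash scheme was conceptually along the lines
of the following sketch (illustrative only; the actual key handling and
digest choice in our experiment may have differed):

    import hashlib
    import hmac

    # Secret generated for the experiment and destroyed afterwards; once
    # the key is gone, the digests can no longer be linked back to IPs.
    EXPERIMENT_KEY = b'per-experiment random secret'

    def pseudonymize_ip(ip_address):
        """Map an IP address to a stable pseudonym: the same IP always
        yields the same digest, so repeat requests can be correlated,
        but the IP itself is never stored."""
        return hmac.new(EXPERIMENT_KEY, ip_address.encode('ascii'),
                        hashlib.sha256).hexdigest()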
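And here is the sort of minimal server-side check the Origin header
enables (a sketch assuming a single trusted origin; the function and
constant names are made up):

    TRUSTED_ORIGINS = {'https://example.com'}

    def is_safe_request(method, headers):
        """Reject state-changing cross-site requests when the browser
        sends an Origin header. Requests from older browsers that omit
        the header pass through, so those users are no worse off than
        they are today."""
        if method in ('GET', 'HEAD'):
            return True  # state-changing operations shouldn't use GET
        origin = headers.get('Origin')
        if origin is None:
            return True  # legacy user agent: nothing to check
        return origin in TRUSTED_ORIGINS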
Received on Sunday, 25 January 2009 19:31:49 UTC