- From: Martin Thomson <martin.thomson@gmail.com>
- Date: Tue, 28 Aug 2018 17:25:24 +1000
- To: Mike West <mkwst@google.com>
- Cc: HTTP Working Group <ietf-http-wg@w3.org>
Resetting on this, because I want to make a meta-level comment after having spent some time thinking about it more. There are three classes of problems that the write-up claims to be addressing: Security, Inefficiency, and Privacy.

Security: Cookies aren't well scoped in several ways. This will be a theme I come back to, particularly the origin-scoping thing, but there are two things that directly read on CSRF that are fairly compelling. It's sad to see that HttpOnly is still so rarely used and that script can often read cookies when it shouldn't be able to. And I'm not going to defend the use of cookies without Secure.

HttpOnly usage being at sub-10% is a warning sign we should probably pay attention to. It suggests that attempts to deploy this sort of mitigation aren't failing for lack of mechanisms, but from an absence of the right incentives. It would be interesting to see whether document.cookie is really used that much, and whether there are ways to close it off (CSP is an obvious path here), but this is primarily a problem of incentives. Sites are completely in control of this. In other parts of the industry, you would throw some combination of linting and public awareness campaigns at this class of problem. You might also do things like change the defaults in common server containers and stacks.

The Secure problem is fixed by moving more to HTTPS and HSTS, pure and simple. I don't see much point in looking for alternative solutions for this class of problem.

Inefficiency: Here, I can't really bring myself to care. Inefficiency is annoying, but I'd rate cookie bloat pretty low on the list, and this is entirely in the control of sites. It does have some small effect on performance, but I'd rather tackle it as part of making sites more responsible for their bad practices. My understanding is that cookie size isn't that big a contributor to performance woes compared with things like video, images, bloated JS frameworks, fonts, and then maybe kitchen-sink CSS.

Privacy: My first thought when I saw this was "great, progress". And then I watched a presentation on cookie matching and real-time bidding [1]. That reminded me that as long as those incentives are in place and technical defenses remain poor, this isn't likely to be an effective privacy measure either. As long as sites can each get their own identifier, and that identifier is stable for a given origin as the top-level browsing context visits different pages, trackers can match up identifiers. The research I saw found that most identifier matching happens via URL parameters on redirects, but bigger networks are able to use back-end systems to link identifiers.

In terms of privacy, I hold out hope for two technological mitigations: content blocking, and double-keying. Content blocking, because it means refusing to talk to bad actors. Double-keying cookie stores by top-level browsing context and target origin is probably less crude in terms of collateral damage to those systems that rely on cookies (though that might derive more from selective application to "ads" or "trackers" than from any inherent property). The ultimate defense probably also relies on fingerprinting resistance, but we're still working up to a solid defense there. The problem with any attempt to break linkability of identifiers is that you need to break the link at every layer of the stack: when you change one identifier, you need to change all of them.
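To make the double-keying idea concrete, here is a minimal sketch of a cookie jar partitioned by the pair (top-level site, embedded origin). The names and structure below are illustrative assumptions of mine, not how any browser actually implements its partitioning.

```typescript
// Hypothetical sketch: a cookie store "double-keyed" by the top-level
// browsing context's site and the origin the request targets.
// Names and shape are illustrative only.

type Origin = string;        // e.g. "https://tracker.example"
type TopLevelSite = string;  // e.g. "https://news.example"

interface Cookie {
  name: string;
  value: string;
}

class DoubleKeyedCookieJar {
  // One jar per (top-level site, embedded origin) pair, so the same
  // tracker embedded under two different sites sees two unrelated jars.
  private jars = new Map<string, Map<string, Cookie>>();

  private key(topLevel: TopLevelSite, target: Origin): string {
    return `${topLevel}|${target}`;
  }

  set(topLevel: TopLevelSite, target: Origin, cookie: Cookie): void {
    const k = this.key(topLevel, target);
    if (!this.jars.has(k)) this.jars.set(k, new Map());
    this.jars.get(k)!.set(cookie.name, cookie);
  }

  get(topLevel: TopLevelSite, target: Origin, name: string): Cookie | undefined {
    return this.jars.get(this.key(topLevel, target))?.get(name);
  }
}

// The tracker gets one jar under news.example and a *different* jar under
// shop.example, so a single identifier cookie no longer links the two visits.
const jar = new DoubleKeyedCookieJar();
jar.set("https://news.example", "https://tracker.example", { name: "id", value: "abc" });
jar.get("https://shop.example", "https://tracker.example", "id"); // undefined
```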
Double-keying might work better because it changes cookies - and all storage - for all origins when the top-level context changes origin. This is a hard problem, because we often rely on example.com being able to access a shared context when it is accessed from x.example and y.example; login flows often depend on this. It's also imperfect as a defense, because information can flow across top-level navigations (again via URL parameters), and we have ample evidence that redirect chains at that level will be used if other methods are taken away.

Techniques like the one proposed here might be a stepping stone toward a grander plan, but deployed in isolation this doesn't seem to be part of any strategy I can conceive of (if I missed something, feel free to enlighten me). Right now, I see more hope for measures that focus on narrowing the scope of applicability of cookies. That means double-keying, but it also includes things like narrowing the time over which cookies are stored (millennia is a joke), as we've discussed for insecure cookies, and stopping their propagation across origins in various ways. Part of the problem here is that you aren't willing to break any use cases [2]. But as long as those use cases align with things like tracking, I think that we ultimately need to consider some classes of breakage to be on the table.

So, I've concluded that there is no motivation in favour of this proposal. I'm not really happy saying that, because cookies are terrible and awful and in desperate need of fixing, but there it is.

[1] https://www.youtube.com/watch?v=GLvug8jdges
[2] https://developers.google.com/authorized-buyers/rtb/cookie-guide is definitely a use case. As is federated sign-on. Both exploit the same properties.

On Tue, Aug 14, 2018 at 8:49 PM Mike West <mkwst@google.com> wrote:
>
> Hey folks,
>
> https://github.com/mikewest/http-state-tokens suggests that we should introduce a client-controlled, origin-bound, HTTPS-only session identifier for network-level state management. And eventually deprecate cookies.
>
> I think there's a conversation here worth having, and this group has thought a lot about the space over the last decade or two. I'd appreciate y'all's feedback, both about the problems the document discusses with regard to cookies as they exist today, and about the sketchy proposal it advances about managing HTTP state in the future.
>
> Thanks!
>
> -mike
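For readers who haven't followed the link in the quoted message: a minimal sketch of what a client-controlled, origin-bound, HTTPS-only session token might look like, assuming one opaque value minted by the client per secure origin and carried on a dedicated request header. The header name and every detail below are guesses from the one-line summary above, not the wire format of the actual proposal.

```typescript
// Illustrative reading of the quoted idea, not the draft's actual design:
// the client mints one opaque token per HTTPS origin and sends it in place
// of cookies; the server keys its session state off that value alone.

import { randomBytes } from "crypto";

const tokens = new Map<string, string>(); // origin -> client-minted token

function tokenFor(origin: string): string | undefined {
  if (!origin.startsWith("https://")) return undefined; // HTTPS-only
  if (!tokens.has(origin)) {
    // Client-controlled: the browser, not the server, chooses the value.
    tokens.set(origin, randomBytes(32).toString("hex"));
  }
  return tokens.get(origin); // origin-bound: never shared across origins
}

// A request to https://example.com might then carry something like
// (header name is a placeholder):
//   Sec-HTTP-State: token=<tokenFor("https://example.com")>
```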
Received on Tuesday, 28 August 2018 07:25:57 UTC