Fixing cookies (Re: Some half-baked thoughts about cookies.)

Resetting on this, because I want to make a meta-level comment after
having spent some time thinking about it more.

There are three classes of problems that the write-up claims to be
addressing: Security, Inefficiency, and Privacy.

Security:

Cookies aren't well scoped in several ways.  This will be a theme I
get back to, particularly the origin-scoping thing, but there are two
things that directly read on CSRF that are fairly compelling.  It's
sad to see that HttpOnly is still so rarely used and that script can
often read cookies when it shouldn't be able to.  And I'm not going to
defend the use of cookies without Secure.

HttpOnly usage being at sub-10% is a warning sign we should probably
pay attention to.  It suggests that attempts to deploy this sort of
mitigation aren't failing for lack of mechanisms, but from an absence
of the right incentives.  It would be interesting to see whether
document.cookie is really used that much and to see if there were ways
to close it off (CSP is an obvious path here), but this is primarily
a problem of incentives.  Sites are completely in control of this.  In
other parts of the industry, you would throw some combination of
linting and public awareness campaigns at this class of problem.  You
might also do things like change the default in common server
containers and stacks.
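For concreteness, here's what opting in to those mitigations looks like
from the server side, using Python's stdlib cookie machinery (the cookie
name and value are illustrative only):

```python
from http.cookies import SimpleCookie

# Hypothetical session cookie; the name "sid" and its value are
# placeholders, not anything from the proposal under discussion.
cookie = SimpleCookie()
cookie["sid"] = "opaque-session-id"
cookie["sid"]["httponly"] = True   # hide the cookie from document.cookie
cookie["sid"]["secure"] = True     # only send it over HTTPS
cookie["sid"]["samesite"] = "Lax"  # basic CSRF mitigation

# The attribute string a server would emit in its Set-Cookie header.
header = cookie["sid"].OutputString()
print(header)
```

The point being: the mechanism is a handful of attribute flags that have
existed for years; deployment, not capability, is the bottleneck.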

The Secure problem is fixed by moving more to HTTPS and HSTS, pure and
simple.  I don't see much point in looking for alternative solutions
for this class of problem.

Inefficiency:

Here, I can't really bring myself to care.  Inefficiency is annoying,
but I'd rate cookie bloat pretty low on the list.  And this is
entirely in the control of sites.  This does have some small effect on
performance, but I'd rather tackle this one as part of making sites
more responsible for their bad practices.  My understanding is that
cookie size isn't that big a contributor to performance woes in
comparison to things like video, images, bloated JS frameworks, fonts,
then maybe kitchen-sink CSS.

Privacy:

My first thought when I saw this was "great, progress".  And then I
watched a presentation on cookie matching and real-time bidding [1].
That reminded me that as long as those incentives remain in place and
our technical defenses remain poor, this isn't likely to be an
effective privacy measure.  As long as sites can each get their own
identifier, and that
identifier is stable for a given origin as the top-level browsing
context visits different pages, then the trackers can match up
identifiers.  The research I saw found most identifier matching
happens via URL parameters on redirects, but bigger networks are able
to use back-end systems to link identifiers.
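The URL-parameter matching that the research describes can be sketched
in a few lines.  Everything here is hypothetical (endpoints, parameter
names, IDs); it just shows why per-origin identifiers alone don't stop
linkage:

```python
from urllib.parse import urlencode, urlparse, parse_qs

# Hypothetical trackers A and B, each holding its own identifier for
# the same user.  A "syncs" by redirecting the browser to B with A's
# ID in the query string; B then records the pairing server-side.
def build_sync_redirect(partner_endpoint: str, my_user_id: str) -> str:
    # The Location header tracker A would send on its pixel response.
    return f"{partner_endpoint}?{urlencode({'partner_uid': my_user_id})}"

def record_pairing(request_url: str, my_user_id: str, match_table: dict) -> None:
    # On tracker B's side: pull A's ID out of the URL and link it to B's own.
    qs = parse_qs(urlparse(request_url).query)
    match_table[my_user_id] = qs["partner_uid"][0]

table = {}
url = build_sync_redirect("https://b.example/sync", "A-1234")
record_pairing(url, "B-5678", table)
print(table)  # {'B-5678': 'A-1234'}
```

Once that table exists, it doesn't matter that each tracker's cookie was
scoped to its own origin; the identifiers are joined out of band.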

In terms of privacy, I hold hope for two technological mitigations:
content blocking, and double keying.  Content blocking because it
means refusing to talk to bad actors.  Double-keying cookie stores by
top-level browsing context and target origin is probably less crude in
terms of collateral damage to those systems that rely on cookies
(though that might derive more from selective application to "ads" or
"trackers", than from any inherent property). The ultimate defense
probably also relies on fingerprinting resistance, but we're still
working up to a solid defense there.

The problem with any attempt to break linkability of identifiers is
that you need to break the link at every layer of the stack.  When you
change one identifier, you need to change all of them.  Double-keying
might work better because it changes cookies - and all storage - for
all origins when the top-level context changes origin.  This is a hard
problem, because we often rely on example.com being able to access a
shared context when accessed from x.example and y.example.  Login
flows often depend on this.  It's also imperfect as a defense because
information can flow across top-level navigations (again via URL
parameters), and we've ample evidence that redirect chains at that
level will be used if other methods are taken away.
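To make the double-keying idea concrete, here's a toy jar partitioned
by the pair (top-level site, embedded origin).  The class and method
names are mine, not any browser's actual implementation:

```python
# Minimal sketch of a double-keyed cookie jar: storage is partitioned
# by (top-level site, embedded origin), so tracker.example gets a
# different, isolated jar under each top-level site that embeds it.
class DoubleKeyedJar:
    def __init__(self):
        self._jars = {}  # (top_level_site, origin) -> {name: value}

    def set(self, top_level_site, origin, name, value):
        self._jars.setdefault((top_level_site, origin), {})[name] = value

    def get(self, top_level_site, origin, name):
        return self._jars.get((top_level_site, origin), {}).get(name)

jar = DoubleKeyedJar()
# tracker.example sets an ID while embedded in x.example...
jar.set("x.example", "tracker.example", "uid", "abc123")
# ...but finds an empty jar when embedded in y.example.
print(jar.get("x.example", "tracker.example", "uid"))  # abc123
print(jar.get("y.example", "tracker.example", "uid"))  # None
```

Note this is also exactly why the x.example/y.example login flows above
break under double-keying: the shared context is keyed away.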

Techniques like the one proposed here, when deployed in isolation,
might be a stepping stone toward a grander plan, but this
doesn't seem to be part of any strategy I can conceive of (if I missed
something, feel free to enlighten me).  Right now, I see more hope for
measures that focus on narrowing the scope of applicability of
cookies.  That means double-keying, but it also includes things like
narrowing the time over which cookies are stored (millennia is a joke)
as we've discussed for insecure cookies, and stopping their
propagation across origins in various ways.

Part of the problem here is that you aren't willing to break any use
cases [2].  But as long as those use cases align with things like
tracking, I think that we ultimately need to consider some classes of
breakage to be on the table.

So, I've concluded that there is no compelling motivation for this
proposal.  I'm not really happy saying that, because cookies are
terrible and awful and in desperate need of fixing, but there it is.

[1] https://www.youtube.com/watch?v=GLvug8jdges
[2] https://developers.google.com/authorized-buyers/rtb/cookie-guide
is definitely a use case.  As is federated sign-on.  Both exploit the
same properties.

On Tue, Aug 14, 2018 at 8:49 PM Mike West <mkwst@google.com> wrote:
>
> Hey folks,
>
> https://github.com/mikewest/http-state-tokens suggests that we should introduce a client-controlled, origin-bound, HTTPS-only session identifier for network-level state management. And eventually deprecate cookies.
>
> I think there's a conversation here worth having, and this group has thought a lot about the space over the last decade or two. I'd appreciate y'all's feedback, both about the problems the document discusses with regard to cookies as they exist today, and about the sketchy proposal it advances about managing HTTP state in the future.
>
> Thanks!
>
> -mike

Received on Tuesday, 28 August 2018 07:25:57 UTC