Re: Proposal: a "clear site data" API. from Mike West on 2015-06-15 (public-webappsec@w3.org from June 2015)

From: Mike West <mkwst@google.com>
Date: Mon, 15 Jun 2015 11:32:05 +0200
To: Jonathan Kingston <jonathan@jooped.com>
Cc: Alex Russell <slightlyoff@google.com>, Tanvi Vyas <tanvi@mozilla.com>, Brad Hill <hillbrad@gmail.com>, "public-webappsec@w3.org" <public-webappsec@w3.org>, Richard Barnes <rbarnes@mozilla.com>, Jake Archibald <jakearchibald@google.com>, Anne van Kesteren <annevk@annevk.nl>, Martin Thomson <martin.thomson@gmail.com>, Jonas Sicking <jonas@sicking.cc>
Message-ID: <CAKXHy=dpn0qDu-hCdcR8FpUYPjVx7-+6fJbt0i6RqgWdVuGA0g@mail.gmail.com>

Thanks for the continued feedback!

On Sun, Jun 14, 2015 at 11:06 AM, Jonathan Kingston <jonathan@jooped.com>
wrote:

> It could be made generic, I'm not sure if advising the user agent to use
> console is the right idea however something like: "User agents SHOULD offer
> developer debug once the site data is cleared."
> This could be in the form of a stop icon on the network debug panel or a
> fatal exception style message in console.
>

I'll add a note along these lines. It's certainly a reasonable request.

> I still don't understand the use-case for a DOM event (though I think we'd
>> end up triggering a few as a side-effect of clearing things), but it might
>> make sense to confirm to the server that data was removed by sending a
>> `Clear-Site-Data: cleared` header along with the reload request. Not sure
>> if that's worth another round-trip, but it's certainly possible to do.
>>
>
> The use-case is a plugin handles a logout request to the server and a
> separate single page app framework isn't able to view the response from the
> server to the plugin. The framework might have a hard time showing to the
> user that they are now logged out.
>

A "plugin" as in Flash? Or a "plugin" as in a browser extension? Or a
third-party JavaScript component? IFrame?

> So the current suggestion is that the user input will stay? Obviously this
>>> means that bad scripts could just cache data, it can grab into hidden
>>> fields or whatever it can to avoid the clearing.
>>>
>>
>> Yes, but it would stay in a heavily sandboxed execution context. No
>> script, no storage, a unique origin, etc. I don't think there would be any
>> way to read data out of such a context (no script === no response to
>> postMessage, unique origin === no direct DOM access), nor would explicit
>> exfiltration be possible (no script === no triggered resource loads, etc).
>>
>> I might well be missing something, however, which is why I think Alex's
>> suggestion to hard-reload (which I read as "go all the way to the server")
>> is appealing. It also makes the story simpler for users to understand ("I
>> logged out over there, so I'm logged out over here too."), which is a nice
>> benefit.
>>
>
> I struggle to see the point of keeping the data at all if it essentially
> becomes unusable to the app. Unless perhaps on tabs other than the current
> you could potentially show "This tab has been deactivated please back up
> any user input" style message.
>

Keeping the data is purely a side-effect of the simplest mechanism I
could come up with to safely lock down the various execution contexts I
think we agree we need to deal with. It's

If replacing the page with a "Hey, we just wiped everything off this
origin." interstitial is the right thing to do, we can do that as the first
step. If we tend towards reloading the pages with a request all the way
back to the server, we'll need to have a two-step mechanism which
sandwiches the clearing event.

I don't have a strong opinion about which direction to go, so I'd tend
towards the sandwich, as that seems like less work.

> I don't understand why this would be desirable. The use-cases explicitly
>> distrust the client, positioning the server as the source of truth.
>> Moreover, origins are already granted control over their data without user
>> mediation (e.g. there's no permission dialog for `localStorage.clear()`),
>> and can already clear most of what's being discussed (though it's a good
>> deal of work to do correctly). What about the functionality we're
>> discussing here do you think crosses over into something that the user
>> ought to be involved in?
>>
>
> My worries are:
> - Increasing the risks of proxying on the server side to other external
> services. If a service was sanitizing the body of the message rather than
> the headers then this could be another issue (Obviously this is a bad
> practice anyway but it does happen).
>

The risk you're suggesting here is that an untrusted third-party resource
could be proxied through `example.com` with headers intact, thereby
clearing `example.com`'s data?
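
If that is the risk, one possible mitigation on the proxying side is to drop the header before relaying the upstream response. A rough sketch, assuming a hypothetical helper `stripClearSiteData` and a plain name-to-value header map (neither is part of the proposal):

```typescript
// Hypothetical helper, for illustration only: removes any Clear-Site-Data
// header from an upstream response's headers before they are relayed.
// Header names are compared case-insensitively, as HTTP requires.
function stripClearSiteData(
  upstream: Record<string, string>
): Record<string, string> {
  const relayed: Record<string, string> = {};
  for (const name of Object.keys(upstream)) {
    if (name.toLowerCase() !== "clear-site-data") {
      relayed[name] = upstream[name];
    }
  }
  return relayed;
}
```

This only covers proxies that already rewrite headers; a proxy that passes everything through untouched would remain exposed.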

> - Single page app user experience. I'm browsing the app and all of a
> sudden one of my tabs receives a response to clear. The interface freezes
> then refreshes. As user experiences go that could be confusing / lose data.
>

I don't think this is limited to single-page apps. The user experience here
is certainly something to worry about.

> - Passive network attackers have a nice attack vector to clear all user
> data. On wifi/stolen base unit where DNS could be poisoned perhaps.
>

Pedantic nit: attackers wouldn't be "passive" if they were injecting
headers. :)

Martin (and Henri?) might claim that this is an advantage, as it makes HTTP
less viable for persistence. I don't think we're at a point in our
collective migration where those claims would outweigh the annoyance to
users, but it's a discussion worth having.

The strawman limits the feature to HTTPS to mitigate exactly this risk.
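
A minimal sketch of that gate, assuming a hypothetical `shouldHonorClearSiteData` check that a user agent might run before acting on the header (the name and shape are illustrative, not taken from the strawman):

```typescript
// Hypothetical check, for illustration only: act on the header only when
// the response that carried it was delivered over HTTPS.
function shouldHonorClearSiteData(responseUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(responseUrl);
  } catch {
    return false; // unparseable URL: ignore the header
  }
  return url.protocol === "https:";
}
```

With this check, a header injected into a plain `http://` response by an active network attacker would simply be ignored.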

> - Misconfigured servers send out the header. Due to the need to clear
> service workers, I can see that a server may be over zealous in sending out
> the header. If a service needs to manage state of when old state needs
> removing I can already see this happening.
>

> - Increased attack vector for CSRF scripts to abuse - loss of user data
> and simple way to create a bad user experience.
>

Clearing user data seems like the least of a server's problems if it's
exposing a CSRF or XSS vulnerability.

> This could perhaps be mitigated by only allowing first-party scripts the
> access to this clearing API.
>

"First-party" in what sense? Do you mean you'd suggest parsing/reacting to
the header only in a top-level browsing context? Or that you'd only allow
requests from the same origin (eTLD+1? something else?) to take effect?

That latter bit sounds like a reasonable restriction, IMO. I think the
former would be too restrictive, as it would prevent scenarios like
https://mikewest.github.io/webappsec/specs/clear-site-data/#example-targeted
.
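
To make the latter option concrete, here is a rough sketch of a strict same-origin comparison, assuming a hypothetical `isSameOriginRequest` helper that is given the URL of the requesting context and the URL of the response. The eTLD+1 variant would additionally need the Public Suffix List and is not covered here:

```typescript
// Hypothetical helper, for illustration only: the header would take effect
// only when the request was initiated from the same origin
// (scheme, host, and port) as the response that carries it.
function isSameOriginRequest(
  initiatorUrl: string,
  responseUrl: string
): boolean {
  try {
    return new URL(initiatorUrl).origin === new URL(responseUrl).origin;
  } catch {
    return false; // either URL failed to parse: treat as cross-origin
  }
}
```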

> - Retain cookies or contexts suffer from the same risks as retaining
> indexedDB and files.
>

I don't understand. Would you mind elaborating a bit?

Thanks!

-mike

Received on Monday, 15 June 2015 09:32:59 UTC