Re: [whatwg/storage] StorageEstimate.quota + Storage Pressure (#73)

That's consistent with my understanding.

I think we also discussed that it might make sense to generate courtesy events to notify origins when their usage crosses certain thresholds so they don't need to poll.  These would want to be de-bounced space-wise, so that oscillating around a certain size doesn't generate a large number of events, and time-wise, so that timing side-channels aren't accidentally created.  For example, there could be a random delay before hooking the event up to the idle timeout or something.  The goal would be to avoid accidentally exposing implementation details, including things like GC, which is made evident by the collection of the last in-memory handle to a disk-backed Blob/File, etc.
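
To make the two kinds of de-bouncing concrete, here is a minimal sketch. It is not a proposed API: `nextLevel`, `notificationDelayMs`, the hysteresis band, and the jitter parameters are all hypothetical names and values chosen for illustration.

```typescript
// Hypothetical sketch only; none of these names exist in any spec or browser.
const MIB = 1024 * 1024;

// Space-wise de-bounce via hysteresis. `thresholds` is ascending, in bytes.
// `prevLevel` is the index of the highest threshold currently considered
// crossed (-1 if none). A threshold is crossed once usage reaches it, and is
// only un-crossed once usage drops below it by at least `hysteresisBytes`,
// so oscillating right around a threshold does not change the level.
function nextLevel(
  usage: number,
  thresholds: number[],
  prevLevel: number,
  hysteresisBytes: number
): number {
  let level = prevLevel;
  while (level + 1 < thresholds.length && usage >= thresholds[level + 1]) {
    level++;
  }
  while (level >= 0 && usage < thresholds[level] - hysteresisBytes) {
    level--;
  }
  return level;
}

// Time-wise de-bounce: a base delay plus random jitter, so the time at which
// the event is delivered is decoupled from when the usage actually changed.
// `random` is injectable so the function can be tested deterministically.
function notificationDelayMs(
  baseMs: number,
  jitterMs: number,
  random: () => number = Math.random
): number {
  return baseMs + Math.floor(random() * jitterMs);
}
```

An implementation along these lines would fire an event only when `nextLevel` returns something different from the previous level, and would wait `notificationDelayMs(...)` (and possibly for an idle period) before dispatching it.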

To explicitly state the primary problematic scenario I remember from the discussion:

### Problematic Scenario
- Assume the browser maintains and persists some type of browser-local/user-profile-local site engagement/frecency metric (which may be synchronized with the user's other browser profiles via an opt-in sync mechanism) that tracks the user's interaction with a site over time, and that this metric informs quota-related decisions.
- The user browses to a site for which the browser has no, or only limited, site engagement data for that user.  This could be due to a new browser profile, privacy settings/data-clearing, or some other reason.
- The site wants to synchronize a large amount of data and the desired UX is that the user can make the decision to synchronize this amount of data up-front.  For example, in an offline mail webapp, the user might choose to synchronize N weeks of data and be told the expected size.  Alternatively, the user might be using a video streaming site that allows saving videos for offline use from within the site, so not full videos.
- The user is also synchronizing a large amount of data outside of the browser's quota management system, and this cuts into the space the browser thought it could use and allocate to storage.
- The site can potentially extract some amount of entropy from either hitting a QuotaExceededError or from its quota being reduced.
- Because we're dealing with a scenario that assumes low engagement/newly visited sites, even if the quotachange reduction has low entropy, an attacker could potentially perform this attack across multiple distinct origins in parallel or in serial in order to attempt to extract and aggregate additional entropy.
- A bad actor can easily make its behavior more directly resemble that of a legitimate site if necessary to defeat simple heuristics.  For example, if we require there be network usage commensurate with quota requests, an attacker is probably just as happy to use up the user's data.  (That said, there are likely cases where some would-be fingerprinters like ad-tech would be dissuaded from stepping over certain lines that other bad actors would not.)
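
As a rough way to reason about "some amount of entropy" in the scenario above, the following sketch computes upper bounds only. It assumes the observable quota value is reported at a fixed granularity over a known range; the function names and the example numbers are hypothetical, and real observations across origins would usually be correlated and therefore leak less than the summed bound.

```typescript
// Hypothetical back-of-the-envelope helpers, not a measurement of any browser.

// A value that can take at most n distinct observable values carries at most
// log2(n) bits. Here n is the number of granularity steps in [0, rangeBytes].
function maxBitsPerObservation(
  rangeBytes: number,
  granularityBytes: number
): number {
  const distinctValues = Math.floor(rangeBytes / granularityBytes) + 1;
  return Math.log2(distinctValues);
}

// Joint entropy is at most the sum of the individual entropies, so this is a
// worst-case bound for an attacker aggregating across several origins.
function maxBitsAggregated(bitsPerObservation: number, origins: number): number {
  return bitsPerObservation * origins;
}
```

Under these assumptions, coarser granularity and smaller speculative grants both shrink the per-observation bound, while the number of origins an attacker can use multiplies it.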

### My hand-wavey proposals
Just for my own future reference, my general proposals for this area had been:

- Favor a strategy of having sites make small incremental quota requests as space is used and they need more space.  For example, in 50/100 MiB chunks.  While there's entropy in when the browser starts saying no, this can be more directly tied to user engagement and rate-limiting.
  - This could allow the browser to surface an ambient indication that the page is doing a lot of work related to I/O and let the user stop the growth.
  - This makes it easier to evolve an understanding of user engagement as the synchronization happens.  While a user may not want to watch the progress bar on the mail webapp or video streaming site, we would expect them to leave the tab open.  This would differ from a random site the user lands on for a few seconds that does the following: installs a ServiceWorker; uses an in-page prompt to ask the user if it's okay to bother them with push notifications; uses the click on the "No!" button or on the close-ad button to count as user interaction; requests a very large quota grant; and then fires an event at the ServiceWorker, which calls waitUntil() with a never-resolving promise to get as much runtime as possible for listening to storage quotachange events.  Presumably the site might also attempt to redirect to another top-level site where it repeats the process, so that it can have multiple ServiceWorkers each trying to gather some number of bits of entropy.
- Strongly limit third-party origins' quota grants.
- Use explicit prompting UI for requests for large quota grants when there isn't already a strong site engagement score.  The expectation is that the site would have primed the user for the prompt with its own UX, similar to how native apps and web sites requesting push notifications first use in-app/in-page explanations before prompting.  This prompt would follow standard browser guidelines of not letting the site provide any information besides the quota grant request (which would also be rounded, provided in human-understandable units, and contextualized in terms of total storage space or free space).
  - This is important because the larger the grant, the more potential entropy there is in any decrease in quota.  So we want to limit the number of large, speculative grants.
  - The downside to this is that browser UX is frequently reluctant to prompt.  However, this can be mitigated by only prompting when there is insufficient site engagement score or not prompting for Installed Web Apps.
  - Additionally, I think at this point all the major browser vendors provide and attempt to onboard new profiles with account sync which includes the data required to calculate site engagement, so in many cases site engagement data would already be available.  And in the cases of users who explicitly do not use account sync or explicitly use the browser in configurations that purge such data periodically, it seems likely these users would indeed prefer prompting.
- Allow (implicit?) quota grants related to APIs like background-fetch which potentially provide a good UX for users already by ambiently surfacing the fact that a download is happening and how much is being downloaded and allowing the user to cancel the download (and thereby revoke the tentative storage grant).
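
To illustrate how the incremental-grant, third-party, and prompting proposals above might fit together, here is a minimal sketch of a decision function. Everything in it is an assumption made for illustration: the `OriginState` and `Policy` shapes, the 0..1 engagement score, the rate limit, and the requirement that most of the previous grant be used before another chunk is granted.

```typescript
// Hypothetical policy sketch; no browser is known to implement exactly this.
const MiB = 1024 * 1024;

interface OriginState {
  grantedBytes: number;
  usedBytes: number;
  engagementScore: number; // assumed to be normalized to 0..1
  lastGrantAtMs: number;
  isThirdParty: boolean;
}

interface Policy {
  chunkBytes: number;          // size of each incremental grant, e.g. 50 MiB
  minIntervalMs: number;       // rate limit between grants
  minUsedFraction: number;     // share of the current grant that must be used
  promptAboveBytes: number;    // total above which low engagement means prompt
  engagementThreshold: number; // score at or above which no prompt is needed
  thirdPartyCapBytes: number;  // hard cap for third-party origins
}

type Decision = "grant" | "prompt" | "deny";

function decideChunkRequest(o: OriginState, p: Policy, nowMs: number): Decision {
  const newTotal = o.grantedBytes + p.chunkBytes;
  if (o.isThirdParty && newTotal > p.thirdPartyCapBytes) return "deny";
  if (nowMs - o.lastGrantAtMs < p.minIntervalMs) return "deny";
  // Only grow the grant once the space already granted is actually in use.
  if (o.grantedBytes > 0 && o.usedBytes < o.grantedBytes * p.minUsedFraction) {
    return "deny";
  }
  if (newTotal > p.promptAboveBytes && o.engagementScore < p.engagementThreshold) {
    return "prompt";
  }
  return "grant";
}

// Rounds a request up to a whole number of chunks and renders it in MiB, as a
// stand-in for the rounded, human-understandable figure a prompt would show.
function roundForPrompt(bytes: number, chunkBytes: number): string {
  const rounded = Math.ceil(bytes / chunkBytes) * chunkBytes;
  return `${rounded / MiB} MiB`;
}
```

The ordering of the checks is a design choice in this sketch: cheap, engagement-independent limits come first, so a prompt is only ever shown for a request that would otherwise be acceptable.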


-- 
View it on GitHub:
https://github.com/whatwg/storage/issues/73#issuecomment-535652415

Received on Thursday, 26 September 2019 19:30:00 UTC