Re: [w3ctag/design-reviews] Shared Storage API (Issue #747)

Hi folks, thanks for the questions.

> We are concerned about exposing such a complex low level API to authors (see [Design Principles: high vs low level API tradeoffs](https://www.w3.org/TR/design-principles/#high-level-low-level)). This proposal is about much more powerful functionality than only sharing access to storage. In particular, with regards to accommodating use cases that are no longer met once third-party cookies are deprecated, we strongly encourage addressing these in a focused, case-by-case way, rather than in the general sense. For example, you mention single sign-on as a use case, but we understand that FedCM is being worked on specifically to address this case.

The design-principles example shows a high-level vs. low-level tradeoff in which the low-level API reveals substantially more data than the high-level one. In the case of Shared Storage, the idea is to create output gates that are only capable of revealing small amounts of data in the long run. For measurement use cases, the differential-privacy parameters of the Private Aggregation API control the rate. For dynamic document selection (selectURL), there is an entropy rate limit. These rates are defined by the user agent. I do agree that a low-level API would need to support a higher rate than an individual purpose-built API, since it supports more use cases. But it's not clear to me that it would need to provide a higher rate than all of the purpose-built APIs combined.

Note that even for the purpose-built APIs, what gets sent is left up to the sender; the privacy comes from the mechanism (e.g., differential privacy or entropy rate limits). So fundamentally, as long as the privacy mechanisms are in place, I think we should explore more expressive APIs that allow for technical innovation in parallel with the purpose-built ones. This can help cover the use cases we've missed before third-party cookie deprecation, and it also serves as a proving ground for new purpose-built APIs in the future. The Storage Access API exists as a similar catch-all.
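To make the "privacy comes from the mechanism" point concrete, here is a toy sketch of the differential-privacy idea: an aggregate is protected by adding Laplace noise scaled to sensitivity/epsilon. This is illustrative only, not the actual Private Aggregation implementation, and the function names and parameters are hypothetical.

```javascript
// Toy differential-privacy sketch (NOT the real Private Aggregation
// mechanism): the sender can contribute any values it likes; privacy
// comes from noise the mechanism adds before anything is revealed.

// Sample from a Laplace distribution with the given scale.
function laplaceNoise(scale) {
  const u = Math.random() - 0.5;
  return -scale * Math.sign(u) * Math.log(1 - 2 * Math.abs(u));
}

// Release a sum with noise calibrated to sensitivity / epsilon.
function noisySum(values, sensitivity, epsilon) {
  const sum = values.reduce((a, b) => a + b, 0);
  return sum + laplaceNoise(sensitivity / epsilon);
}
```

Smaller epsilon means more noise and stronger privacy; the user agent's choice of such parameters is what bounds the cross-site data revealed, regardless of what the sender contributes.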


> Could you provide us with a summary of how Shared Storage fits in with the other Privacy Sandbox proposals (such as Fenced Frames, Topics, First Party Sets)? Are there any duplicated functionality / use cases among them? Where are the overlaps?

Shared Storage has three components. The first is the unpartitioned storage API itself: data can be written from anywhere but read only inside an isolated worklet. The other two components are the private output gates through which data can leave the worklet. The Private Aggregation API is a measurement API, allowing for measurement of things like ad reach, demographics, or cross-site debug reporting; it complements the other APIs (such as Attribution Reporting, FLEDGE, etc.) by covering measurement use cases they don't explicitly support yet. SelectURL allows choosing among candidate documents to display in a fenced frame based on shared storage data. This could be useful for choosing between contextual ads that FLEDGE wasn't involved in selecting, for running cross-site A/B experiments, for payment and login providers to show different buttons based on the user's logged-in status, etc. It leverages fenced frames to prevent the choice of selected document from being revealed to the embedding page.
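A rough sketch of how these pieces fit together, following the general call shapes in the Shared Storage explainer; the keys, module path, and URLs here are hypothetical:

```javascript
// Sketch: write cross-site data, then have a worklet pick a document.
// Keys, file names, and candidate URLs are hypothetical placeholders.
async function chooseAdVariant() {
  if (typeof window === "undefined" || !window.sharedStorage) {
    return null; // API unavailable (unsupported browser or non-browser context)
  }
  // 1. Write data from any context; it is readable only inside the worklet.
  await window.sharedStorage.set("ab-group", "treatment");
  // 2. Load the worklet module that will read the data in isolation.
  await window.sharedStorage.worklet.addModule("selector-worklet.js");
  // 3. Ask the worklet operation to pick among candidate URLs. The choice
  //    is opaque to this page and renderable only inside a fenced frame.
  return window.sharedStorage.selectURL("pick-group", [
    { url: "https://ads.example/control.html" },
    { url: "https://ads.example/treatment.html" },
  ]);
}
```

The embedding page never learns which URL was selected; only the fenced frame renders the result, which is what keeps the cross-site data from leaking back to the embedder.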

> We appreciate that you see this proposal as providing a privacy improvement compared to the status quo of third-party cookies on the web, however would you be able to give us an analysis of the privacy implications in comparison with the web without third-party cookies as the baseline?

Sure. Private Aggregation can reveal cross-site data at a rate defined by its differential-privacy parameters, scoped per origin. SelectURL can reveal up to X bits of cross-site entropy (also origin-scoped). These limits are reset at a defined rate, and all of the parameters are set by the user agent. This is similar to the limits in other privacy-preserving APIs such as Attribution Reporting, FLEDGE, Topics, PCM, and IPA from Mozilla & Meta.
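For intuition on the entropy accounting: a single selection among k candidates can leak at most log2(k) bits, a standard information-theoretic bound (the specific budget and reset values below are not quoted from any implementation).

```javascript
// Each selectURL call choosing among k candidate URLs can leak at most
// log2(k) bits of cross-site information to the embedder.
function maxBitsPerCall(numCandidates) {
  return Math.log2(numCandidates);
}
// Example: choosing among 8 URLs can reveal at most 3 bits per call.
// The user agent caps the cumulative bits spent per origin and resets
// that budget at a defined rate.
```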

I think it’s important to point out that in this baseline you’re proposing, we’re seeing that other methods are emerging to personalize ads for users, including [fingerprinting](https://www.wired.com/story/browser-fingerprinting-tracking-explained/) and [increased usage of PII gates](https://www.nytimes.com/2023/01/25/technology/personaltech/email-address-digital-tracking.html). This is why our work is so important. We’re developing privacy-safe ways to enable personalized advertising that don’t rely on user PII or fingerprinting.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/w3ctag/design-reviews/issues/747#issuecomment-1482807424

Message ID: <w3ctag/design-reviews/issues/747/1482807424@github.com>

Received on Friday, 24 March 2023 13:35:03 UTC