Re: [w3ctag/design-reviews] Early design review for the Topics API (Issue #726)

There are a few places where the assurances given at the beginning of this discussion (quite a while ago now!) don't quite track what is in the spec.

- The discussion here states the taxonomy is coarse-grained, but the spec does not limit the depth of the taxonomy. From this [discussion](https://github.com/patcg-individual-drafts/topics/issues/229#issuecomment-1664141004), it may be intentional that the spec would allow a taxonomy of a billion items. A bit more on this in a separate section below.

- The discussion here states the “taxonomy name is its semantic meaning”, but the spec does not require that a topic have more than an integer ID. There is no requirement for a human-readable taxonomy name, nor for a utility for localizing that name.

- The discussion here states that the taxonomy will exclude sensitive topics and hews to certain existing taxonomies, but the spec does not provide for any assurance or process for this.

- The discussion here states that the spec has done as much as it can to allow for user consent, as this is generally left to UX implementation, but it is not clear that the permissions framework wouldn’t offer other options, such as treating each topic as a powerful feature or requiring powerful feature treatment of Topics.

- The discussion here implies that security concerns are minimized because topics calculation will be done on the domain or URL and will occur locally in the browser, but the spec would allow the implementer to analyze the entire document in the context of the implementer’s choosing, including a server. While today’s specs allow server-based browser implementations, they are rare, and the marketing for the Privacy Sandbox features on-device processing pretty prominently.

To return for a moment to the assertion that a billion topics would mean no privacy loss because only five may be eligible for reporting out to sites:

- A billion topics would invalidate all of the applicable analyses of the cross-identification probability. For example, the theoretical limit for leakage (log2(N,k) where N is taxonomy size and k is topics tracked) would go from ~6 bits to ~29 bits for an integer.

- Users can’t proactively review and opt-out of a billion topics.

- The five-percent random results would not ensure that all topics had users if there are a billion topics.

- It is not clear that the security and privacy concerns could still be addressed by relying on the claim that topics carry effectively less data than cookies.

- Expanding to a taxonomy of a billion puts a lot more stress on the other assurances, for example the security claim that this is significantly better than the data stored for third-party cookies.

- The rewards for site collusion to game the system would be much higher. These may not have been explored in sufficient detail for a coarse-grained taxonomy, or even for trying to game multiple taxonomies; they certainly haven’t for a super-fine-grained one.
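
To make the leakage arithmetic in the first bullet above concrete, here is a rough sketch. It assumes the per-topic bound is read as log2(N) bits, which is my interpretation of the notation rather than anything the spec states. The function names are hypothetical, and the 64-topic size is chosen only because it yields 6 bits under this reading; it is not the size of any real taxonomy.

```python
import math


def bits_per_topic(taxonomy_size: int) -> float:
    """Upper bound, in bits, on what one observed topic can reveal.

    Assumption: a single topic drawn from N equally likely topics
    carries at most log2(N) bits.
    """
    return math.log2(taxonomy_size)


def bits_for_topic_set(taxonomy_size: int, k: int) -> float:
    """Upper bound, in bits, for an unordered set of k distinct topics.

    Assumption: log2 of the number of k-subsets of N topics.
    """
    return math.log2(math.comb(taxonomy_size, k))


if __name__ == "__main__":
    # 64 is a hypothetical coarse size; 10**9 is the "billion topics" case.
    for n in (64, 10**9):
        print(f"N={n}: {bits_per_topic(n):.1f} bits per topic, "
              f"{bits_for_topic_set(n, 5):.1f} bits for a set of 5")
```

Under this reading, a billion-item taxonomy lands just under 30 bits per observed topic, and the bound for a set of five topics is larger still.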

Finally, one might ask: why comment on a spec when there seems so small a chance of cross-browser implementation?

As an enterprise customer of Google Workspace and Chrome, I am already subjected to small, creeping changes to the interpretation of the terms of service – and updates to those terms that are difficult to opt out of. So, even if Chrome is the only full implementer, I would rather see the critical privacy promises in a draft spec so that they stick for longer.

Also, it is really important that implementations match their marketing. There are big implications for the web as a whole if the most popular browser can market a feature as "only local calculation of coarse-grained topics" when we decide to opt in, but then, since they don't think it is a big deal, change that over time.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/w3ctag/design-reviews/issues/726#issuecomment-1666066303

Received on Friday, 4 August 2023 19:18:05 UTC