Re: [w3ctag/design-reviews] Early design review for the Topics API (Issue #726) from Amy Guy on 2023-01-12 (public-webapps-github@w3.org from January 2023)

From: Amy Guy <notifications@github.com>
Date: Wed, 11 Jan 2023 23:20:08 -0800
To: w3ctag/design-reviews <design-reviews@noreply.github.com>
Cc: Subscribed <subscribed@noreply.github.com>
Message-ID: <w3ctag/design-reviews/issues/726/1379908459@github.com>
The intention of the Topics API is to enable high level interests of web users to be shared with third parties in a privacy-preserving way in order to enable targeted advertising, while also protecting users from unwanted tracking and profiling. The TAG's initial view is that this API does not achieve these goals as specified.

The Topics API as proposed puts the browser in a position of sharing information about the user, derived from their browsing history, with any site that can call the API. This is done in such a way that the user has no fine-grained control over what is revealed, and in what context, or to which parties. It also seems likely that a user would struggle to understand what is even happening; data is gathered and sent behind the scenes, quite opaquely. This goes against the principle of [enhancing the user's control](https://w3ctag.github.io/ethical-web-principles/#control), and we believe is not appropriate behaviour for any software purporting to be an agent of a web user.

The responses to the proposal from [Webkit](https://github.com/WebKit/standards-positions/issues/111#issuecomment-1359609317) and [Mozilla](https://github.com/mozilla/standards-positions/issues/622#issuecomment-1372979100) highlight the tradeoffs between serving a diverse global population, and adequately protecting the identities of individuals in a given population. Shortcomings on neither side of these tradeoffs are acceptable for web platform technologies. 

It's also clear from the positions shared by Mozilla and Webkit that there is a lack of multi-stakeholder support. We remain concerned about fragmentation of the user experience if the Topics API is implemented in a limited number of browsers, and sites that wish to use it prevent access to users of browsers without it (a different scenario from the user having disabled it in settings).

We are particularly concerned by the opportunities for sites to use additional data gathered over time by the Topics API in conjunction with other data gathered about a site visitor, either via other APIs, via out of band means, and/or via existing tracking technologies in place at the same time, such as fingerprinting.

We appreciate the in-depth privacy analyses of the API that have been done so far [by Google](https://github.com/jkarlin/topics/blob/main/topics_analysis.pdf) and [by Mozilla](https://mozilla.github.io/ppa-docs/topics.pdf). If work on this API is to proceed, it would benefit from further analysis by one or more independant (non-browser engine or adtech) parties.

Further, if the API were both effective and privacy-preserving, it could nonetheless be used to customise content in a discriminatory manner, using stereotypes, inferences or assumptions based on the topics revealed (eg. a topic could be used - accurately or not - to infer a [protected characteristic](https://w3ctag.github.io/privacy-principles/#hl-sensitive-information), which is thereby used in selecting an advert to show). Relatedly, there is no binary assessment that can be made over whether a topic is "sensitive" or not. This can vary depending on context, the circumstances of the person it relates to, as well as change over time for the same person.

Giving the web user access to browser settings to configure which topics can be observed and sent, and from/to which parties, would be a necessary addition to an API such as this, and go some way towards restoring agency of the user, but is by no means sufficient. People can become vulnerable in ways they do not expect, and without notice. People cannot be expected to have a full understanding of every possible topic in the taxonomy as it relates to their personal circumstances, nor of the immediate or knock-on effects of sharing this data with sites and advertisers, and nor can they be expected to continually revise their browser settings as their personal or global circumstances change.

A portion of topics returned by the API are proposed to be randomised, in part to enable plausible deniability of the results. The usefulness of this mitigation may be limited in practice; an individual who wants to explain away an inappropriate ad served on a shared computer cannot be expected to understand the low level workings of a specific browser API in a contentious, dangerous or embarrassing situation (assuming a general cultural awareness of the idea of targeted ads being served based on your online activities or even being "listened to" by your devices, which does not exist everywhere, but is certainly pervasive in some places/communities).

While we appreciate the efforts that have gone into this proposal aiming to iteratively improve the privacy-preserving possibilities of targeted advertising, ultimately it falls short. In summary, the proposed API appears to maintain the status quo of inappropriate surveillence on the web, and we do not want to see it proceed further.

-- 
Reply to this email directly or view it on GitHub:
https://github.com/w3ctag/design-reviews/issues/726#issuecomment-1379908459
You are receiving this because you are subscribed to this thread.

Message ID: <w3ctag/design-reviews/issues/726/1379908459@github.com>
Received on Thursday, 12 January 2023 07:20:21 UTC