RE: Input on privacy threat model from WebKit tracking prevention policy from Mike O'Neill on 2020-02-17 (public-privacy@w3.org from January to March 2020)

From: Mike O'Neill <michael.oneill@baycloud.com>
Date: Mon, 17 Feb 2020 09:12:20 -0000
To: <mjs@apple.com>, <public-privacy@w3.org>
Message-ID: <028301d5e572$5cedf9f0$16c9edd0$@baycloud.com>
This is very good, Maciej.

I would remove the sentence about personally identifiable data because this subset of personal data only has meaning in some jurisdictions such as the US, and even there it is becoming less important.

I would also remove the bit about same party and registrable domains, because it does not add to the clarity of the other definitions. Quite often subdomains are managed by third parties, e.g. via CNAME records they control.

Mike



-----Original Message-----
From: mjs@apple.com <mjs@apple.com> 
Sent: 16 February 2020 20:27
To: public-privacy@w3.org
Subject: Input on privacy threat model from WebKit tracking prevention policy

Hi everyone,

Not everyone may agree with all elements of the WebKit Tracking Prevention Policy <https://webkit.org/tracking-prevention-policy/>. However, I think the taxonomy of types of tracking is useful to include in a Privacy Threat Model. With common terminology, it’s easier to talk about what is or isn’t considered part of the threat model.

I know that tracking is not necessarily the only privacy consideration on the web, but it’s an important one, and it helps to have a clear vocabulary to talk about it. Also worth noting, some of these definitions are stricter than in the similar [Mozilla Anti tracking policy](https://wiki.mozilla.org/Security/Anti_tracking_policy), with somewhat different terms. But the two policies end up prohibiting nearly the exact same things in the end.



I include the definitions below as markdown, excluding one WebKit-specific remark, though they may be easier to read on the webkit.org page linked above.


## Tracking Definitions
**Tracking** is the collection of data regarding an individual’s identity or activity across one or more websites. Even if such data is not believed to be personally identifiable, it’s still tracking.

A **first party** is a website that a user is intentionally and knowingly visiting, as displayed by the URL field of the browser, and the set of resources on the web operated by the same organization. In practice, we consider resources to belong to the same party if they are part of the same *registrable domain*: a [*public suffix*](https://publicsuffix.org) plus one additional label. Example: `site.example`, `www.site.example`, and `s.u.b.site.example` are all the same party since `site.example` is their shared registrable domain.
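
As a rough illustration of the "public suffix plus one additional label" rule, here is a minimal Python sketch. The `PUBLIC_SUFFIXES` set is a tiny hard-coded stand-in and the function names are hypothetical; a real implementation would consult the full Public Suffix List, including its wildcard and exception rules, which this sketch ignores.

```python
from typing import Optional

# Assumption: a tiny stand-in for the Public Suffix List, for illustration only.
PUBLIC_SUFFIXES = {"example", "com", "co.uk"}


def registrable_domain(host: str) -> Optional[str]:
    """Return the public suffix plus one additional label, or None."""
    labels = host.lower().rstrip(".").split(".")
    # Try candidate suffixes from longest to shortest so that the
    # longest matching public suffix wins (e.g. "co.uk" over "uk").
    for i in range(len(labels)):
        suffix = ".".join(labels[i:])
        if suffix in PUBLIC_SUFFIXES:
            if i == 0:
                return None  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # no known public suffix matched


def same_party(host_a: str, host_b: str) -> bool:
    """Two hosts count as the same party if they share a registrable domain."""
    a = registrable_domain(host_a)
    b = registrable_domain(host_b)
    return a is not None and a == b
```

Under these assumptions, `same_party("www.site.example", "s.u.b.site.example")` is true because both reduce to `site.example`, while `site.example` and `other.example` are treated as different parties.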

A **third party** is any party that does not fall within the definition of first party above.

A **privileged third party** is a party that has the potential to track the user across websites without their knowledge or consent because of special access built into the browser or operating system. Examples: a central clearinghouse that can learn of a user’s browsing; a domain uniquely allowed to host tracking scripts by the browser.

Interactions with other parties are considered third-party, even if the user is transiently informed in context (for example, in the form of a redirect). Merely hovering over, muting, pausing, or closing a given piece of content does not constitute an intention to interact.

## Types of Tracking
**Cross-site tracking** is tracking across multiple first party websites; tracking between websites and apps; or the retention, use, or sharing of data from that activity with parties other than the first party on which it was collected.

**Stateful tracking** is tracking using storage on the user’s device. This storage can be ephemeral or persistent. Such storage includes but is not limited to cookies, DOM storage, IndexedDB, the HTTP cache and other caches, HSTS, and media keys. It also includes tracking via communication mechanisms that are potentially accessible cross-site, such as Service Workers or Broadcast Channels.

**Covert stateful tracking** is stateful tracking which uses mechanisms that are not intended for general-purpose storage, such as HSTS or TLS.

**Navigational tracking** is tracking through information controlled by the source of a top-level navigation or a subresource load, transferred to the destination. This includes URL parameter-based tracking or link decoration, which is tracking via information added to URLs, and HTTP header data that can be set up to include tracking information, such as the referrer.
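
To make link decoration concrete, here is a minimal Python sketch of removing suspected tracking parameters from a URL before navigation. The parameter names in `SUSPECTED_TRACKING_PARAMS` are hypothetical placeholders, not a list taken from any browser, and this is not how WebKit itself implements its protections.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: hypothetical parameter names used purely for illustration.
SUSPECTED_TRACKING_PARAMS = {"click_id", "visitor_id"}


def strip_link_decoration(url: str) -> str:
    """Return the URL with suspected tracking query parameters removed."""
    parts = urlsplit(url)
    # Keep every query parameter whose name is not on the suspect list.
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in SUSPECTED_TRACKING_PARAMS
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

With these placeholder names, a decorated link such as `https://shop.example/item?id=42&click_id=abc123` would be reduced to `https://shop.example/item?id=42`, so the identifier added by the navigation source never reaches the destination.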

**Fingerprinting**, or **stateless tracking**, is tracking based on the properties of the user’s behavior and computing environment, without the need for explicit client-side storage. This includes properties of the user’s web browser and its configuration, the user’s device and its configuration, the user’s location, or the user’s network connection. Fingerprinting vectors include but are not limited to installed fonts, the user agent string, GPU details, CPU details, IP address, and TLS connection.
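
The reason this works without storage can be sketched in a few lines of Python: if the same observable properties are seen on each visit, hashing them gives the same identifier every time. The `fingerprint` function and the property names in the usage note are hypothetical, chosen only to illustrate the threat, and do not describe any particular tracker.

```python
import hashlib
import json


def fingerprint(properties: dict) -> str:
    """Derive a stable identifier from observed environment properties."""
    # Serialize with sorted keys so the same properties always produce
    # the same bytes, regardless of the order they were collected in.
    canonical = json.dumps(properties, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

For example, calling `fingerprint({"user_agent": "...", "fonts": [...], "gpu": "..."})` on two different sites would return the same value for the same browser, which is what lets the visits be linked even though nothing was written to the device.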

**Covert tracking** includes **covert stateful tracking**, **fingerprinting**, and any other methods that are similarly hidden from user visibility and control.

Tracking may also be performed using currently unknown techniques that do not fall into these categories.
Received on Monday, 17 February 2020 09:12:37 UTC