Re: Input on privacy threat model from WebKit tracking prevention policy

> On Feb 17, 2020, at 1:12 AM, Mike O'Neill <michael.oneill@baycloud.com> wrote:
> 
> This is very good Maciej. 
> 
> I would remove the sentence about personally identifiable data because this subset of personal data only has meaning in some jurisdictions such as the US, and even there it is becoming less important.

The only mention of personally identifiable data is to state that the given definition of “tracking” applies to more than just this subset. In light of that, I think it’s ok that it doesn’t have a legal definition. In the WebKit Tracking Prevention Policy, we included this to make clear that even tracking which does not link anything to a user ID is still tracking. I think it’s good to have this for clarity, since some other notions of tracking only cover linkage of user IDs across multiple sites, or association of information with a user ID.

> I would also remove the bit about same party and registrable domains, because it does not add to the clarity of the other definitions. Quite often subdomains are managed by third parties, e.g. via controlling CNAMEs.

The original draft of this was written before the recent trend towards CNAME cloaking by trackers. That said, I think it might be correct to consider a party to include all subdomains of registrable domains for most purposes.

Cookies can be set at the level of a whole registrable domain. Even if third-party cookies are someday removed from the web platform, this will likely continue to be true. Thus, distinct subdomains of a registrable domain can freely share information about the user. Furthermore, it’s likely that users perceive “mail.google.com”, “accounts.google.com”, “drive.google.com” and “docs.google.com” to all be parts of the same entity, and they would be correct to do so.

Because of this, I don’t think we can have a notion of party that is strictly an origin, rather than an eTLD+1.
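
To make the eTLD+1 notion concrete, here is a minimal Python sketch of a same-party check. The `PUBLIC_SUFFIXES` set is a tiny hypothetical stand-in for the real Public Suffix List, and the function names are mine, not taken from WebKit or the policy.

```python
from typing import Optional

# Hypothetical, tiny stand-in for the Public Suffix List. A real
# implementation would load the full list from publicsuffix.org.
PUBLIC_SUFFIXES = {"com", "example", "co.uk"}


def registrable_domain(host: str) -> Optional[str]:
    """Return the public suffix plus one label (eTLD+1), or None."""
    labels = host.lower().rstrip(".").split(".")
    # Candidates run from longest to shortest, so the first hit is the
    # longest matching public suffix.
    for i in range(len(labels)):
        if ".".join(labels[i:]) in PUBLIC_SUFFIXES:
            if i == 0:
                return None  # the host is itself a public suffix
            return ".".join(labels[i - 1:])
    return None  # no known public suffix matched


def same_party(host_a: str, host_b: str) -> bool:
    """Treat two hosts as the same party if they share an eTLD+1."""
    a = registrable_domain(host_a)
    b = registrable_domain(host_b)
    return a is not None and a == b
```

Under these assumptions, `same_party("mail.google.com", "docs.google.com")` is true, while two unrelated sites under the same public suffix are not the same party.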

CNAME cloaking is mostly used to evade list- or rule-based load blocking. It does not, by itself, provide a tracking mechanism. Rather, it’s a way to evade some specific countermeasures.
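
As a rough illustration of that evasion, the following Python sketch compares a hostname-only blocklist check against one applied after following CNAME records. The blocklist, the hostnames, and the `CNAME_RECORDS` table are all made up for the example. A real blocker would need actual DNS resolution, which this does not attempt.

```python
# Hypothetical data: the blocklist entry, the hostnames, and the CNAME
# table are invented for illustration only.
BLOCKLIST = {"tracker.example"}
CNAME_RECORDS = {"metrics.site.example": "collect.tracker.example"}


def is_blocked(host: str) -> bool:
    """List-based check: the host is a blocked domain or a subdomain of one."""
    return any(host == b or host.endswith("." + b) for b in BLOCKLIST)


def resolve_cname(host: str) -> str:
    """Follow CNAME records in the table until a host has none."""
    seen = set()
    while host in CNAME_RECORDS and host not in seen:
        seen.add(host)  # guard against CNAME loops
        host = CNAME_RECORDS[host]
    return host


def is_blocked_after_uncloaking(host: str) -> bool:
    """Apply the same list to the CNAME target rather than the visible name."""
    return is_blocked(resolve_cname(host))
```

Here the first-party-looking name `metrics.site.example` passes the name-only check, and it is caught only once the CNAME target is examined. Nothing about the cloaked name itself does any tracking.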

> 
> Mike
> 
> 
> 
> -----Original Message-----
> From: mjs@apple.com <mjs@apple.com> 
> Sent: 16 February 2020 20:27
> To: public-privacy@w3.org
> Subject: Input on privacy threat model from WebKit tracking prevention policy
> 
> Hi everyone,
> 
> Not everyone may agree with all elements of the WebKit Tracking Prevention Policy <https://webkit.org/tracking-prevention-policy/>. However, I think the taxonomy of types of tracking is useful to include in a Privacy Threat Model. With common terminology, it’s easier to talk about what is or isn’t considered part of the threat model.
> 
> I know that tracking is not necessarily the only privacy consideration on the web, but it’s an important one, and it helps to have a clear vocabulary to talk about it. Also worth noting, some of these definitions are stricter than in the similar [Mozilla Anti tracking policy](https://wiki.mozilla.org/Security/Anti_tracking_policy), with somewhat different terms. But the two policies end up prohibiting nearly the same things.
> 
> 
> 
> I include the definitions below as markdown, excluding one WebKit-specific remark, but this may be easier to read on the webkit.org page linked above.
> 
> 
> ## Tracking Definitions
> **Tracking** is the collection of data regarding an individual’s identity or activity across one or more websites. Even if such data is not believed to be personally identifiable, it’s still tracking.
> 
> A **first party** is a website that a user is intentionally and knowingly visiting, as displayed by the URL field of the browser, and the set of resources on the web operated by the same organization. In practice, we consider resources to belong to the same party if they are part of the same *registrable domain*: a [*public suffix*](https://publicsuffix.org) plus one additional label. Example: `site.example`, `www.site.example`, and `s.u.b.site.example` are all the same party since `site.example` is their shared registrable domain.
> 
> A **third party** is any party that does not fall within the definition of first party above.
> 
> A **privileged third party** is a party that has the potential to track the user across websites without their knowledge or consent because of special access built into the browser or operating system. Examples: a central clearinghouse that can learn of a user’s browsing; a domain uniquely allowed to host tracking scripts by the browser.
> 
> Interactions with other parties are considered third-party, even if the user is transiently informed in context (for example, in the form of a redirect). Merely hovering over, muting, pausing, or closing a given piece of content does not constitute an intention to interact.
> 
> ## Types of Tracking
> **Cross-site tracking** is tracking across multiple first party websites; tracking between websites and apps; or the retention, use, or sharing of data from that activity with parties other than the first party on which it was collected.
> 
> **Stateful tracking** is tracking using storage on the user’s device. This storage can be ephemeral or persistent. Such storage includes but is not limited to cookies, DOM storage, IndexedDB, the HTTP cache and other caches, HSTS, and media keys. It also includes tracking via communication mechanisms that are potentially accessible cross-site, such as Service Workers or Broadcast Channels.
> 
> **Covert stateful tracking** is stateful tracking which uses mechanisms that are not intended for general-purpose storage, such as HSTS or TLS.
> 
> **Navigational tracking** is tracking through information controlled by the source of a top-level navigation or a subresource load, transferred to the destination. This includes URL parameter-based tracking or link decoration, which is tracking via information added to URLs, and HTTP header data that can be set up to include tracking information, such as the referrer.
> 
> **Fingerprinting**, or **stateless tracking**, is tracking based on the properties of the user’s behavior and computing environment, without the need for explicit client-side storage. This includes properties of the user’s web browser and its configuration, the user’s device and its configuration, the user’s location, or the user’s network connection. Fingerprinting vectors include but are not limited to installed fonts, the user agent string, GPU details, CPU details, IP address, and TLS connection.
> 
> **Covert tracking** includes **covert stateful tracking**, **fingerprinting**, and any other methods that are similarly hidden from user visibility and control.
> 
> Tracking may also be performed using currently unknown techniques that do not fall into these categories.
> 
> 
> 

Received on Monday, 17 February 2020 19:10:07 UTC