Re: [w3ctag/design-reviews] First-Party Sets (#342)

Appreciate the discussion and feedback, all!

-----
Responding to @erik-anderson 

> My understanding is that the approval process for creating a set was added to the proposal in part because of concerns from other implementers that, even if the list size is kept small and sites are allowed to join one-and-only-one set that it would be insufficient due to abuses where sites owned by different entities collude to join the same set.

This is accurate. We updated the proposal to require an approval process [in response to feedback](https://github.com/privacycg/first-party-sets/issues/6#issuecomment-588342186) from Safari and Firefox engineers. The [original proposal](https://github.com/privacycg/first-party-sets/tree/c5e90c3abfbbcfa2e656745a992969befade778e#detecting-unacceptable-sets) allowed sites to assert domain relationships subject to some technically enforced limits and abuse prevention, accompanied by a blocklist for cases where abuse was detected.


> A non-comprehensive list of areas I'd like to explore to mitigate the potential impact (which are not mutually exclusive) of the governance concern: make the max size of lists small enough to not need any approval (may not be practical due to the past concern about a lack of objective, user-intuitive criteria for when sites can join the same set); an independent entity to approve and/or revoke the ability to use a set, using a common set of criteria that multiple implementers agree to (a bit like CAs and web PKI, which carries its own set of challenges, though perhaps smaller in scope here); or "GREASE"ing of when First-Party Sets are used (e.g. disabling them some small percentage of the time and/or revoking the right to use them at all if the site doesn't function without them) to help sites prove/validate that they will function adequately for browsers and/or users who configure their browsers to limit or disallow the use of First-Party Sets.

These are some great specific ideas! Since relying on technical mechanisms alone was previously recommended against, our current preference is to have an independent entity approve/revoke sets. _GREASE_-ing is an interesting idea, although it sounds like we would need to (a) come up with strong alternative solutions to help site authors support clients that have the feature disabled; and (b) build robust detection mechanisms to aid revocation.

The other mechanism I had in mind, also inspired by the Web PKI, is transparency logs that record every creation, update, or dissolution of a set, for increased accountability and auditability.

-----

Responding to @torgo 

Thanks to the TAG for meeting with us! We appreciated the opportunity to provide additional context on the problem space and on the current handling of tracking protection in other major browsers, to clear up the confusion around why our proposal will not interfere with SOP and other security mechanisms, and to address other points in your feedback document. We will also address these in the explainer, and look forward to reviewing the edits with you.

> If the governance is to make sure that FPS members are part of the same organization then what is the definition of organization and how does that fit together with legal and regulatory? For example, we discussed how under some definitions Facebook and WhatsApp might be the same organization - and just yesterday there was some [timely press coverage](https://www.bloomberg.com/news/articles/2021-04-13/facebook-faces-german-bid-to-halt-collection-of-whatsapp-data) demonstrating how that assumption breaks down when you consider regulatory and legal requirements. So I think the proposal needs to be very clear about the requirements when it comes to governance - what is governance of first party sets trying to achieve? 

I am not a policy/legal expert, but I don't think FPS policy verification precludes site authors from conforming to regulatory and legal requirements. Since the assertion needs to be submitted by site authors, they still have the agency to not form an FPS at all, or to form multiple distinct and disjoint sets.

The primary goal of the FPS policy is to prevent the abuse that would be possible if sets could be formed from unrelated domains. We chose to use "same organization" because that appears to be the common language in tracking prevention policies published by multiple major browsers (see [this section](https://github.com/privacycg/first-party-sets#defining-acceptable-sets) for excerpts). The DNT specification, which was developed within the W3C, [uses the language](https://www.w3.org/TR/tracking-dnt/#rep.same-party) "_share the same data controller as the referring site_". If there is more precise or appropriate language to capture the essence of these existing policies, I'd be grateful for any advice.

> I would like to hear more about the the existing allow lists that have been discussed - e.g. Disconnect, Firefox, Safari. How big are they? How are they managed?

My understanding is that in their default tracking protection modes, both Firefox and Edge (not Safari) use the Disconnect-dot-me [trackers blocklist](https://github.com/disconnectme/disconnect-tracking-protection/blob/master/services.json) to selectively block third-party cookies on domains classified as trackers, and then apply an exception to those tracker domains when they appear as subresources on sites owned by the same organization as the tracker domain. The list of these collections of commonly owned domains is called the entities list, which is [maintained on GitHub](https://github.com/disconnectme/disconnect-tracking-protection/blob/master/services.json). I was unable to locate a documented policy for how domains get accepted to the entities list, but it appears to be done on a fairly ad hoc basis when a browser engineer discovers a compatibility bug. I do not believe site authors are involved at all, which I think has the unfortunate consequence of mistakes such as com.com being listed as a CBS property, and yahoo.co.jp being listed as a VerizonMedia property.
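
As a rough illustration of the blocklist-plus-exception behavior described above, here is a minimal Python sketch. It is not how any browser actually implements this: the domain names, the `ENTITIES` layout, and the function names are all hypothetical, the real Disconnect lists use a different JSON structure, and the sketch assumes inputs are already reduced to registrable domains.

```python
# Hypothetical, hand-written data for illustration only.
TRACKER_DOMAINS = {"tracker.example", "ads.example"}

# Hypothetical entities list: organization name -> domains it is listed as owning.
ENTITIES = {
    "Example Corp": {"shop.example", "tracker.example"},
    "Other Inc": {"news.example", "ads.example"},
}


def owner_of(domain):
    """Return the entity that lists this domain, or None if it is unlisted."""
    for entity, domains in ENTITIES.items():
        if domain in domains:
            return entity
    return None


def should_block_cookies(top_level_site, subresource_domain):
    """Decide whether cookies for a subresource request are blocked.

    Tracker domains are blocked in third-party contexts unless the
    entities list says they belong to the same organization as the
    top-level site.
    """
    if subresource_domain == top_level_site:
        return False  # first-party request, nothing to block
    if subresource_domain not in TRACKER_DOMAINS:
        return False  # not classified as a tracker
    owner = owner_of(subresource_domain)
    same_entity = owner is not None and owner == owner_of(top_level_site)
    return not same_entity


if __name__ == "__main__":
    print(should_block_cookies("shop.example", "tracker.example"))
    print(should_block_cookies("news.example", "tracker.example"))
```

Note that the outcome depends entirely on the accuracy of the entities data: a domain wrongly attributed to an organization silently gains or loses the exception.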

First-Party Sets proposes to maintain a single list of related domain sets (similar to the _entities_ list) in place of Disconnect's two lists (currently one blocklist and one allowlist). Since this proposal would require site authors to submit sets of their own domains, and would have a published policy in concert with other enforcement mechanisms, we hope that it will be a much more rigorous approach to the issue of supporting multi-domain sites (which are exceedingly common on the modern internet) while also bringing meaningful privacy improvements to the web.
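
To make the single-list idea a little more concrete, here is a minimal Python sketch of one mechanical check such a list could be subjected to: that submitted sets are disjoint, in line with the one-set-per-site constraint mentioned earlier in this thread. The set names, domains, and function names are hypothetical, and the sketch says nothing about the policy review itself (for example, verifying common ownership), which cannot be reduced to code like this.

```python
from collections import defaultdict


def find_conflicts(submitted_sets):
    """Return domains claimed by more than one set.

    `submitted_sets` maps a (hypothetical) set name to the set of
    domains its submitter asserted. The result maps each conflicting
    domain to the names of the sets that claim it.
    """
    membership = defaultdict(list)
    for set_name, domains in submitted_sets.items():
        for domain in domains:
            membership[domain].append(set_name)
    return {d: names for d, names in membership.items() if len(names) > 1}


def same_set(submitted_sets, a, b):
    """True if some submitted set contains both domains."""
    return any(a in domains and b in domains
               for domains in submitted_sets.values())


# Hypothetical submissions; "a-cdn.example" deliberately appears twice.
SUBMITTED = {
    "set-a": {"a.example", "a-cdn.example"},
    "set-b": {"b.example", "a-cdn.example"},
}

if __name__ == "__main__":
    print(find_conflicts(SUBMITTED))
```

A submission that produces any conflict would need to be resolved before the list is published; otherwise two unrelated sets could be bridged through the shared domain.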

-- 
Reply to this email directly or view it on GitHub:
https://github.com/w3ctag/design-reviews/issues/342#issuecomment-820043023

Received on Thursday, 15 April 2021 03:45:59 UTC