Re: CCPA Do-Not-Sell

Dear Sebastian, all,

Thank you very much for getting this conversation going. I am sorry I 
haven't been able to attend the call (I am still looking for ways to 
free up time so I can join regularly), as this is a topic that is 
operationally important to The Times. I've added some general notes below.

On 2020-03-26 12:54, Sebastian Zimmeck wrote:
> Some background (that you already may be aware of): at the beginning of 
> this year the CCPA became effective. In addition to the rights of data 
> access and deletion, this new privacy law gives consumers the right to 
> opt out from the sale of personal information. A "sale" is understood 
> broadly and likely covers, for example, a website or app disclosing 
> location data or device identifiers to an ad network for purposes of 
> monetization. Now, the most recent regulations to the CCPA 
> <https://www.oag.ca.gov/sites/all/files/agweb/pdfs/privacy/ccpa-text-of-second-set-mod-031120.pdf?> published 
> by the California Attorney General specify that automatic signals 
> communicating a user's decision to opt out must be respected. Here is 
> the relevant language:

This is a timely proposal, as the CCPA Regs mandating a signal come into 
effect in July, and it seems clear that implementation is going to be 
all over the place. What's more, in addition to being relatively unclear 
about privacy signals, the regulations also require the full Do Not Sell 
process to be triggered every time a signal is seen, rather than just 
treating the request to which the signal is attached as operating in an 
opted-out manner. While that is indisputably friendlier to user privacy, 
the scope mismatch leads to implementation headaches. A standard would 
certainly be welcome. (Note that Norm Sadeh from CMU is working on a 
similar issue for IoT.)
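
To make that scope mismatch concrete, here is a minimal sketch of the 
two readings. The header name and the handler functions are placeholders 
I made up for illustration; nothing here is defined by the Regs or by 
this group:

    // Illustrative only: "Sec-Opt-Out" is a made-up placeholder header,
    // not a signal anyone has specified. The narrow reading scopes the
    // opt-out to the single request; the broad reading, which the Regs
    // appear to require, triggers the full Do Not Sell workflow every
    // time the signal is seen.
    type RequestHeaders = Record<string, string | undefined>;

    function handleRequest(headers: RequestHeaders, consumerId: string): void {
      if (headers["sec-opt-out"] === "1") {
        // Narrow reading: serve this one request without selling data.
        serveWithoutSellingData();
        // Broad reading: also persist a full opt-out election for this
        // consumer, as if they had clicked "Do Not Sell My Personal
        // Information".
        recordDoNotSellElection(consumerId);
      } else {
        serveNormally();
      }
    }

    // Stubs standing in for a publisher's real machinery.
    function serveWithoutSellingData(): void {}
    function recordDoNotSellElection(consumerId: string): void {}
    function serveNormally(): void {}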

As we hop towards a standard, however, I think there are a couple of 
issues that we need to address, and they might not be trivial! Note that 
while I am mostly writing to list problems, it is not in the least to 
shoot this proposal down but rather to surface these issues early.

My first concern is the risk of producing regulation-specific 
mechanisms. People will be paying attention to the CCPA and the GDPR 
because these cover major economic areas, but will anyone account for 
the Barbados Data Protection Act of 2018? New York *City* has been known 
to consider privacy regulation (preventing the sale of precise location 
data collected inside city limits); could that be in scope?

This points to the value of establishing a shared Web model for privacy 
that could map onto regulatory regimes easily enough. At the risk of 
jinxing this group for the decade to come, I'm going to stick my neck 
out and state that I think this may not be as complex as one might 
expect!

Schematically, I get to that model through two simplifying assumptions 
(which I believe to be justified):

* Third-party data controllers are a thing of the past. The idea that 
third parties could covertly observe and recognise users across 
contexts, and furthermore make independent use of that data for their 
own purposes, is a bug that goes against the interests of both users and 
publishers; to the extent that browsers supported it, they were failing 
to honour the priority of constituencies. We can therefore work on the 
assumption that third-party requests, when they exist, will be stateless 
and that the entities processing data there will be service providers to 
the first party.

* Regulators can, and do, change tack. I see the GDPR described in the 
minutes as an opt-in regime. That's not strictly true. A lot of people 
have been treating it as such, and regulators have unfortunately 
supported this through cookie consent requirements that actively 
encourage affordances giving users the impression that a first-party 
weather preference and a third-party data broadcast to the open 
programmatic ecosystem present the same level of risk. But that is not a 
requirement of the GDPR itself, and if the W3C were to put forth a model 
better aligned with user privacy, there's a good chance regulators would 
pay attention.

Put together, these assumptions can help us define what the actual 
default is in "privacy by default". I'm keeping this at a high level 
here, with details to be hashed out, but basically it's a three-tier 
system:

* Default Tier. Data stays entirely between the user and the first party 
(with service providers properly limited), with general adherence to 
best practices in retention, purpose limitation/context firewalling, and 
profiling. Because this is safe by default (and respectful of contextual 
integrity), we get rid of cookie consent dialogs, which are bad for 
users, publishers, and privacy, as well as, arguably, songbirds and 
wildflower biodiversity.

More seriously, this default is grounded in the notion that privacy 
isn't no data, it's appropriate data. This is something that the Web 
should be opinionated about.

* Restricted Tier. Users who, for whatever personal reasons, are 
uncomfortable with data processing by a given first party should have 
the right to limit it. This aligns with the mechanism that Sebastian 
describes. It would need a way to make a request, probably to a 
.well-known endpoint (a sketch of what that could look like follows). 
This would map to strict restrictions on processing for the first party.
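
Purely as illustration, and assuming a hypothetical 
/.well-known/data-rights resource and payload that nothing has 
standardised yet, such a request might look like this:

    // Hypothetical: the path, payload, and response semantics below are
    // placeholders; no such endpoint has been specified anywhere.
    async function requestRestrictedTier(origin: string): Promise<void> {
      const res = await fetch(`${origin}/.well-known/data-rights`, {
        method: "POST",
        headers: { "Content-Type": "application/json" },
        body: JSON.stringify({ request: "restrict-processing" }),
      });
      if (!res.ok) {
        throw new Error(`Restriction request failed: ${res.status}`);
      }
      // On success, the first party would be expected to apply strict
      // processing restrictions to this user going forward.
    }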

* Poke-A-Hole Tier. Data should only be shared with a third-party 
controller following a clear and strong signal from the user. This has 
to be managed by the UA; otherwise you end up incentivising publishers 
to compete on fraudulently obtaining consent, as is currently the case 
in Europe. The Storage Access API, or a similar mechanism in which the 
user drags their data over to the third party, ought to cover this (see 
the sketch below). This should be designed as a rare occurrence.
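
For reference, here is roughly what the Storage Access API flow looks 
like today in an embedded third-party frame (availability varies across 
browsers, and mapping a grant to third-party controllership is my 
extrapolation, not spec text):

    // document.hasStorageAccess() and document.requestStorageAccess()
    // are the real Storage Access API; treating a grant as the "clear
    // and strong signal" above is my assumption.
    async function requestThirdPartyAccess(): Promise<boolean> {
      if (await document.hasStorageAccess()) {
        return true; // The UA has already granted access.
      }
      try {
        // Must be called from a user gesture (e.g. a click handler) so
        // the UA can tie the grant to a deliberate user action.
        await document.requestStorageAccess();
        return true; // The user actively dragged their data over.
      } catch {
        return false; // Denied: stay in stateless, service-provider mode.
      }
    }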

If there's interest in this group, I would be happy to help expand on 
the above, and also on what I think Sebastian's excellent idea looks 
like in this framework.

It's a pleasure to meet you all!

-- 
Robin Berjon
VP Data Governance
The New York Times Company

Received on Thursday, 9 April 2020 21:08:15 UTC