RE: Poll text call: final text by 28 September from TOUBIANA, VINCENT (VINCENT) on 2012-10-02 (public-tracking@w3.org from October 2012)

From: TOUBIANA, VINCENT (VINCENT) <Vincent.Toubiana@alcatel-lucent.com>
Date: Tue, 2 Oct 2012 14:44:01 +0200
To: "ifette@google.com" <ifette@google.com>, Nicholas Doty <npdoty@w3.org>
CC: Justin Brookman <justin@cdt.org>, "public-tracking@w3.org" <public-tracking@w3.org>
Message-ID: <4D30AC7C2C82C64580A0E798A171B4445AD94F2ABE@FRMRSSXCHMBSD1.dc-m.alcatel-lucent.c>
If all you say is essentially "You may keep data for six weeks for the purposes of accomplishing permitted uses" then I don't get what the purpose is. It doesn't seem to make anything either easier or harder for implementers; indeed, it seems like a no-op.

My understanding was that this 6-week period allows service providers to filter log entries that have DNT:1 so they can put them in a different bucket and process the remaining entries as they usually do.
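
To make the filtering step concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: each log entry is taken to be a dict with a "headers" mapping, and the names split_by_dnt, dnt_bucket and regular are hypothetical, not taken from any draft or real logging system.

```python
from typing import Dict, Iterable, List, Tuple

LogEntry = Dict[str, object]  # hypothetical shape: {"url": ..., "headers": {...}}


def split_by_dnt(entries: Iterable[LogEntry]) -> Tuple[List[LogEntry], List[LogEntry]]:
    """Separate entries sent with DNT:1 from all other entries."""
    dnt_bucket: List[LogEntry] = []
    regular: List[LogEntry] = []
    for entry in entries:
        raw_headers = entry.get("headers") or {}
        # HTTP header names are case-insensitive, so normalise before lookup.
        headers = {str(k).lower(): str(v) for k, v in raw_headers.items()}
        if headers.get("dnt", "").strip() == "1":
            dnt_bucket.append(entry)
        else:
            regular.append(entry)
    return dnt_bucket, regular


if __name__ == "__main__":
    logs = [
        {"url": "/ad?id=1", "headers": {"DNT": "1"}},
        {"url": "/ad?id=2", "headers": {"User-Agent": "x"}},
    ]
    dnt, rest = split_by_dnt(logs)
    print(len(dnt), len(rest))
```

The DNT:1 bucket would then be handled under the permitted uses only, while the remaining entries go through the usual pipeline.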

Going back to waaay earlier discussions, my original intent was to make it easier for people to claim compliance. I guess the analogue would be changing from a presumption of "innocence" to a presumption of "guilt". That is, in the six week period, compliance with the spec should mean that you don't do <insert super egregious thing here>, such as transferring all data to a third party. There's a presumption you're not doing this, and as long as that remains true, you're fine. Since standard logging is (by definition) a standard practice, we aren't going out of the way to make you prove what practices you do or don't do; as long as you don't do X, you're good.

I believe an exhaustive list of “what should be prohibited” would be too long and could be subject to misinterpretation. I think it might be better to have a list of authorized uses even if such uses are broadly defined.

Long-term data retention has much higher risks in terms of exposure to actual privacy problems (data breach, or secondary uses that users may view as harmful to their privacy desires). As such, if you retain data for a longer term (>6wks) then you have a higher responsibility, and the burden shifts to you to show that the data is being maintained securely, and that access to the data is well controlled and in accordance with the permitted uses.

That was, and is, my goal.

-Ian
Thank you,

Vincent

On Tue, Oct 2, 2012 at 4:45 AM, Nicholas Doty <npdoty@w3.org> wrote:
Hi Justin & Ian,

On Oct 1, 2012, at 6:48 AM, Justin Brookman <justin@cdt.org> wrote:


Setting aside the question of what the time period should be . . .

Nick, my edit was suggested in part to address the aggregate reporting issue given that it seems like there is momentum toward removing it as a dedicated permitted use.  There is a separate debate about what constitutes unlinking after 6 weeks, but my language would allow the use of log data to generate aggregate reports without unlinking during the initial 6 weeks after collection.

I had intended "During this time, operators may render data unlinkable" to cover those examples -- I understood generating aggregate reports as a common way to render data unlinkable (but still potentially very valuable). I'm not sure if the language read this way, but I didn't intend to indicate that data would have to be rendered unlinkable without processing it in some way, or that data would have to be rendered unlinkable before the end of the short term period.

On Sep 30, 2012, at 6:06 PM, Ian Fette (イアンフェッティ) <ifette@google.com> wrote:

If you restrict uses to a given set in the 6 week period, then presumably you need to be able to prove data was only used during that period for those purposes. You then need all the same systems in place from time zero, so it's not clear this buys you anything at all. Changing it around to be a blacklist model during this period (you cannot do X) makes it much easier to show compliance.

I had thought the original intent here was to provide a grace period during which limits on retention would not apply. The existing permitted uses text requires that organizations not retain data not necessary for a permitted use -- this section allows short-term retention of data that would otherwise have to be deleted or rendered unlinkable immediately. I believe that buys organizations quite a lot -- you can do your minimization or data aggregation steps later, rather than in real-time. In showing compliance with restrictions on retention, a grace period for retention makes it much easier to comply and to show compliance.

In either case (limited to the permitted uses, or limited by a this-section-only-blacklist) organizations would have to prove to an auditor that they were not doing X, for some value of X. It might be that there are lots of activities that don't involve sharing data or targeting a user that you'd like organizations to be able to engage in for a shorter time period. There may be support within the group for that kind of permitted use -- a "do whatever you want with short-term data (but don't share/target with it)" as opposed to a "you have a grace period to minimize and aggregate data". As Shane points out, your proposal does switch us to providing a blacklist of uses inside of a whitelist of uses, and requires us to debate multiple such lists.
Thanks,
Nick


On 9/30/2012 7:43 PM, Nicholas Doty wrote:
On Sep 29, 2012, at 10:48 PM, Aleecia M. McDonald <aleecia@aleecia.com> wrote:


My only question on Justin's text below is about the wording "communication to a third party" -- that suggests communication to a first party or a service provider is permissible. I think the intent is "communication to another party." If so, is that an acceptable change?

Nick, in particular: does Justin's language capture what you had intended?

I don't think so, and I'm not sure I understand the motivation behind that change. I was picking up on the suggestion from Vincent that uses during the short-term logging period would be simply making data unlinkable or any of the existing permitted uses. Creating an additional sub-list of practices allowed or prohibited during the short-term period seems like unnecessary confusion. Also, are there any additional uses that the group wishes to allow during the short-term period beyond the existing permitted uses and making data unlinkable? If so, what are those uses and why wouldn't those uses be part of the list of permitted uses?

Thanks,
Nick

On Sep 28, 2012, at 6:59 PM, Justin Brookman <jbrookman@cdt.org> wrote:


I would expand Option 1 to say (something like):

Operators MAY retain data related to a communication in a third-party context for up to 6 weeks. During this time, operators may render data unlinkable (as described above), perform processing of the data for any of the other permitted uses, or perform any other processing that does not result in the transfer of information related to the particular user or communication to a third party, or alteration of the user's individual experience.

(I believe this is more consistent with Ian's original formulation, though it's possible he has since changed his mind.  Obviously, as David Wainberg points out, this language is contingent upon the scope of permitted uses and the definition of whatever replaces unlinkable.)
________________________________
From: Aleecia M. McDonald [mailto:aleecia@aleecia.com]
To: public-tracking@w3.org (public-tracking@w3.org) [mailto:public-tracking@w3.org]
Sent: Tue, 25 Sep 2012 18:20:57 -0500
Subject: Poll text call: final text by 28 September

From the call on 12 September, we discussed topics where we have increasing clarity on options for permitted uses. I want to make sure we have the text right to reflect our options prior to doing a decision process with a poll calling for objections, which is responsive to Ian's feedback. We also want to move quickly, as Roy suggests.

Please propose specific alternative text if you believe that the two texts given below do not reflect the options before us by Friday, 28 September. We will briefly review these texts on the call tomorrow, just to make sure no one misses anything, and here we are on the mailing list, for those who cannot make the call.

Aleecia

-----
Log files: issue-134
----

This normative text fits into the section on Third Party Compliance, subsection 6.1.1.1, Short Term Collection and Use, <http://www.w3.org/2011/tracking-protection/drafts/tracking-compliance.html#short-term>. We will also want non-normative text, and have some suggested, but that will be clearer once we have the normative text settled. (Options for definitions of unlinkable data are in section 3.6, Unlinkable Data, <http://www.w3.org/2011/tracking-protection/drafts/tracking-compliance.html#def-unlinkable>.)

Option 1:
Operators MAY retain data related to a communication in a third-party context for up to 6 weeks. During this time, operators may render data unlinkable (as described above) or perform processing of the data for any of the other permitted uses.

Option 2:
Operators MAY retain data related to a communication in a third-party context. They MUST provide public transparency of their data retention period, which MUST have a specific time period (e.g. not infinite or indefinite.) During this time, operators may render data unlinkable (as described above) or perform processing of the data for any of the other permitted uses.
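
As a rough illustration of how the retention window in either option might be checked, here is a small Python sketch. The names SHORT_TERM_RETENTION and within_retention_period are hypothetical. Under Option 1 the window is the fixed 6 weeks; under Option 2 it would be whatever period the operator has publicly declared, passed in as the retention argument.

```python
from datetime import datetime, timedelta, timezone

# Option 1 fixes the window at 6 weeks; Option 2 would substitute the
# operator's publicly declared period (an assumption of this sketch).
SHORT_TERM_RETENTION = timedelta(weeks=6)


def within_retention_period(
    collected_at: datetime,
    now: datetime,
    retention: timedelta = SHORT_TERM_RETENTION,
) -> bool:
    """Return True while a record is still inside the retention window.

    "Up to 6 weeks" is read here as inclusive of the final instant.
    """
    return now - collected_at <= retention


if __name__ == "__main__":
    collected = datetime(2012, 8, 1, tzinfo=timezone.utc)
    check_time = datetime(2012, 10, 2, tzinfo=timezone.utc)
    print(within_retention_period(collected, check_time))
```

Records for which the check fails would have to be rendered unlinkable already or be held only for one of the permitted uses.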
Received on Tuesday, 2 October 2012 12:45:34 UTC