
Updated Proposal - Outline in preparation for presentation in Seattle

From: Shane Wiley <wileys@yahoo-inc.com>
Date: Mon, 11 Jun 2012 19:25:41 -0700
To: "public-tracking@w3.org" <public-tracking@w3.org>
Message-ID: <63294A1959410048A33AEE161379C8023D18786247@SP2-EX07VS02.ds.corp.yahoo.com>
Hello TPWG,

Due to "recent activities," I'm a bit behind on providing the final presentation for our updated proposal. We'll review it in more detail in Seattle, but I wanted to share some of the initial elements up front so the working group has time to begin discussion and consider perspectives before the meeting.

------

Goal:  Evolve DC proposal to bridge the divide with the advocate proposal and set a final recommendation for these elements


•         Definition of First Party

o   Advocate Position:  Common Branding

o   Industry Position:  Affiliate

o   Concession Proposal:  Affiliate with "easy discoverability" ("Affiliate List" within one click from each page or owner clearly identified within one click from each page.  For example, a link in the privacy policy would meet this requirement.)


•         Permitted Uses

o   Advocate Position:  Unlinkable Data w/ arbitrary "grace period"

o   Industry Position:  Enumerated uses, broadly scoped, general data minimization

o   Concession Proposal:  Tightened up permitted uses, narrowly and strictly scoped, data minimization focus with required transparency, reasonable safeguards, defined unlinkable (highlighting this moves resulting data outside of scope)



•         For All Permitted Uses

o   What won't occur:  Outside of Security, all other permitted uses will not allow for altering a specific user's online experience (no profiling, no further alteration to the user experience based on profiled information)

o   Data Minimization:  Each organization engaging in Permitted Uses and claiming W3C DNT compliance must provide public transparency of its data retention period (it may enumerate each period individually if they vary across Permitted Uses)

o   Reasonable Safeguards:  Reasonable technical and organizational safeguards to prevent further processing:  collection limitations, data siloing, authorization restrictions, k-anonymity, unlinkability, retention time, anonymization, pseudonymization, and/or data encryption.



•         Permitted Uses:  Security/Fraud, Financial Logging/Auditing, Frequency Capping, Debugging, Aggregate Reporting*

o   For each Permitted Use:

▪  (Normative) Detailed, singular business purpose description

▪  (Non-normative) Will explain why the processing with identifiers is proportionate
*NOTE - Aggregate Reporting covers general analytics needs, product improvement, and market research uses



•         Explicit and Separate User Choice

o   User must expressly activate DNT signal (TPWG already agreed on this point)

o   Servers may respond to users that their UA is "invalid" if they believe this to be the case (on the hook to defend this position)

o   Efforts to mislead users into activating DNT will be seen as "invalid"


•         With this Proposal

o   Users gain a consistent, local tool to communicate their opt-out preference (avoids property specific opt-out pages)

o   The user's choice is persistent for each device/UA (avoids accidental deletion)

o   Outside of Security purposes, the user will no longer experience alterations to their online experiences derived from multi-site activity

o   Only minimal data is retained for necessary business operations and retention periods are transparent to users

o   All "harms" are removed (outside of government intrusion risk where there are no documented cases of this occurring with 3rd party anonymous log file data)



•         Unlinkability


<Normative>



Un-linkable Data is outside of the scope of the Tracking Preference standard as information is no longer reasonably linked to a particular user, user agent, or device.



Definition:  A dataset is un-linkable when reasonable steps have been taken to modify data such that there is confidence that it contains only information which could not be linked to a particular user, user agent, or device.



<Non-Normative>



There are many valid and technically appropriate methods to de-identify a data set or render it "un-linkable".  In all cases, there should be confidence that the information cannot be reverse engineered back to a "linkable" state.  Many tests could be applied to help determine the confidence level of the un-linking process.  For example, a k-anonymity test could be leveraged to determine whether the mean population resulting from a de-linking exercise meets an appropriate threshold (a high-bar k-anonymity threshold would be 1024).
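
As a rough illustration of the k-anonymity test mentioned above, here is a minimal Python sketch. The field names, the toy records, and the function names are assumptions made up for this example and are not part of the proposal. Note that k-anonymity is usually defined on the smallest group of records sharing the same attribute values, while the text above refers to the mean population, so the sketch reports both and applies the threshold to the smallest group.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of attribute values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def k_anonymity_report(records, quasi_identifiers, k=1024):
    """Report the smallest and mean group size and whether the smallest meets k."""
    sizes = group_sizes(records, quasi_identifiers)
    if not sizes:
        raise ValueError("dataset is empty")
    smallest = min(sizes.values())
    mean = sum(sizes.values()) / len(sizes)
    return {
        "smallest_group": smallest,
        "mean_group": mean,
        "passes": smallest >= k,
    }


# Hypothetical de-linked dataset: only coarse attributes remain.
records = (
    [{"country": "IT", "language": "it"}] * 1500
    + [{"country": "IT", "language": "en"}] * 1100
    + [{"country": "FR", "language": "fr"}] * 40
)

report = k_anonymity_report(records, ["country", "language"])
print(report)
```

In this toy dataset the 40 records in the smallest group fall well short of 1024, so the check fails even though the other two groups are large. A dataset like this would need further generalization or suppression before it could be called un-linkable under this test.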



As there are many possible tests, it is recommended that companies publicly stating W3C Tracking Preference compliance provide transparency into their de-linking process so external experts and auditors can assess whether the steps are reasonable given the risk of a particular dataset.



•         Information That Is Un-linkable When Collected:  A third party may collect non-protocol information if it is, independent of protocol information, un-linkable data. The data may be retained and used subject to the same limitations as protocol information.



Example: Example Advertising sets a language preference cookie that takes on few values and is shared by many users.



•         Information That Is Un-linkable After Aggregation:  During the period in which a third party may use protocol information for any purpose, it may aggregate protocol information and un-linkable data into an un-linkable dataset. Such a dataset may be retained indefinitely and used for any purpose.



Example: Example Advertising maintains a dataset of how many times per week Italy-based users load an ad on Example News.
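
The aggregation example above could be sketched in Python roughly as follows. The event fields (`cookie_id`, `country`, `site`, `day`), the site name, and the function name are hypothetical and chosen only for illustration. The point is that the output keeps per-week counts and drops the per-user identifier entirely.

```python
from collections import Counter
from datetime import date


def weekly_ad_loads(events, country, site):
    """Count ad loads per ISO (year, week) for one country and one site.

    Only the counts are returned; cookie IDs never reach the output.
    """
    counts = Counter()
    for event in events:
        if event["country"] == country and event["site"] == site:
            iso = event["day"].isocalendar()
            counts[(iso[0], iso[1])] += 1
    return dict(counts)


# Hypothetical raw log entries held during the permitted retention period.
events = [
    {"cookie_id": "a1", "country": "IT", "site": "news.example", "day": date(2012, 6, 4)},
    {"cookie_id": "b2", "country": "IT", "site": "news.example", "day": date(2012, 6, 6)},
    {"cookie_id": "a1", "country": "IT", "site": "news.example", "day": date(2012, 6, 11)},
    {"cookie_id": "c3", "country": "FR", "site": "news.example", "day": date(2012, 6, 5)},
]

totals = weekly_ad_loads(events, country="IT", site="news.example")
print(totals)
```

Whether such counts are truly un-linkable still depends on how many users sit behind each number, so very small counts would probably need to be suppressed or merged in practice.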



•         Information That Is Un-linkable After Anonymization:  At some point after collection, a unique ID from a product cookie has a one-way salted hash applied to the identifier to break any connection between the resulting dataset and production identifiers.  To further guard against dictionary attacks on this method, it is recommended that "keys" are rotated on a regular basis.
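
A minimal Python sketch of this idea, under stated assumptions: it uses HMAC-SHA-256 with a secret random key as the one-way function, which is one possible reading of "salted hash" and not something the proposal specifies. The class name and the sample cookie ID are hypothetical. Rotating the key and discarding the old one means the same cookie ID maps to a different value in each period, so a dictionary of hashed IDs built in one period is useless in the next.

```python
import hashlib
import hmac
import secrets


class RotatingPseudonymizer:
    """Maps identifiers to keyed one-way digests; the key can be rotated."""

    def __init__(self):
        # Random 256-bit secret key, held only in memory in this sketch.
        self._key = secrets.token_bytes(32)

    def rotate(self):
        """Replace the key. The old key is discarded, not archived."""
        self._key = secrets.token_bytes(32)

    def pseudonymize(self, cookie_id: str) -> str:
        """Return the hex HMAC-SHA-256 digest of the identifier under the current key."""
        digest = hmac.new(self._key, cookie_id.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()


pseudonymizer = RotatingPseudonymizer()
first = pseudonymizer.pseudonymize("cookie-1234")
same_period = pseudonymizer.pseudonymize("cookie-1234")

pseudonymizer.rotate()
after_rotation = pseudonymizer.pseudonymize("cookie-1234")

print(first == same_period)     # stable within one key period
print(first == after_rotation)  # differs once the key is rotated
```

How often to rotate is a policy choice that this sketch leaves open.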
Received on Tuesday, 12 June 2012 02:26:41 UTC
