RE: issue-199

Your reasoning, as you expressed it in Santa Clara, is: if cookie blocking remains outside of DNT, then aggregated scoring remains in the proposal.
To me this reads as: aggregated scoring is out of scope for DNT. Not even a permitted use, but out of scope. This is a concern.

The thing is, Peter may want to work from the DAA proposal as a baseline, but currently I see no room for improving towards a privacy-friendly DNT if the proposal remains a package.

So the clarifying question is: how much of a package is on the table? Is there any room to address the key issues (the elephants in the room) that are on the wiki?

Rob


Shane Wiley <wileys@yahoo-inc.com> wrote:

>Mike,
>
>I support verifiability but am challenged with technical mechanisms to
>allow this without breaking corporate confidentiality concerns.  This
>is why I call it out as an area for future development to help build
>solutions to this unique problem.
>
>I’ve tried breaking the proposal down to the simplest form I can think
>of.  Let me know if this makes it more clear:
>
>-----
>If Tracking = ID + URLs, then Not Tracking = ID <> URL
>
>Keep ID, Remove URL = Aggregate Scoring
>Remove ID, Keep URL = De-Identification
>
>Remove ID, Remove URL = De-Identification + De-Linking  (now out of
>scope of DNT)
>-----
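
A minimal Python sketch of the two middle cases in the breakdown above. The URL-to-category mapping, the sample events and the function names are all hypothetical, chosen only for illustration; this is not the DAA proposal's actual mechanism.

```python
from collections import Counter, defaultdict

# Hypothetical mapping from URL to interest category (an assumption).
CATEGORY_BY_URL = {
    "news.example/football": "sports",
    "blog.example/tennis": "sports",
    "air.example/flights": "travel",
}

def aggregate_scores(events):
    """Keep ID, remove URL: per-ID category counts with no URL history."""
    scores = defaultdict(Counter)
    for uid, url in events:
        scores[uid][CATEGORY_BY_URL.get(url, "other")] += 1
    return {uid: dict(counts) for uid, counts in scores.items()}

def de_identify(events):
    """Remove ID, keep URL: the visited URLs with no identifier attached."""
    return [url for _uid, url in events]

# Sample (ID, URL) events, i.e. "Tracking = ID + URLs".
events = [
    ("123456", "news.example/football"),
    ("123456", "blog.example/tennis"),
    ("123456", "air.example/flights"),
]

scores = aggregate_scores(events)
urls_only = de_identify(events)
```

In this sketch `scores` still holds an interest profile keyed by the ID, while `urls_only` holds the URLs without any ID.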
>
>- Shane
>
>From: Mike O'Neill [mailto:michael.oneill@baycloud.com]
>Sent: Wednesday, July 10, 2013 3:10 PM
>To: Shane Wiley
>Cc: public-tracking@w3.org
>Subject: RE: issue-199
>
>Shane,
>
>I have not missed key points, and know the DAA proposals mean continued
>profiling, just think that needs to be made clear. Perhaps you could
>give an example where applying a hash to a UID would be useful.
>
>There is not much difference between the retention of a profile based
>on algorithmically examining a web history and the actual web history
>itself. Both can be a basis for discrimination.
>
>My point about verifiability is that without it, with only
>administrative and operational controls, there will inevitably be
>demands for intrusive regulation, which will not be good for industry.
>Verifiability is in fact quite easy to ensure if tracking is
>constrained to cookies or even localStorage, and that is all the more
>reason to rule out tracking by other means such as fingerprinting.
>
>Mike
>
>
>From: Shane Wiley [mailto:wileys@yahoo-inc.com]
>Sent: 10 July 2013 14:36
>To: Mike O'Neill
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>
>Subject: RE: issue-199
>
>Mike,
>
>Perhaps you’ve not been on the calls as I believe you’ve missed a few
>of the key points of this discussion.  I won’t be able to provide a
>full recount via email but I’ll try to hit the high points for you:
>
>
>1.       It’s understood obfuscation comes with some risk and will need
>to be bundled with operational and administrative controls to reach a
>reasonable confidence that data will not be reverse engineered.  For
>example, data in the yellow state is not shared publicly and/or with
>parties that you don’t feel could protect the security of its
>composition.  While we’ve agreed on transparency in this area – no one
>has requested external verifiability to date which I believe would be
>somewhat impossible as a starting point.  Perhaps something to work on
>as a future goal (I believe the EFF would also be interested in
>innovating techniques in this area – is that fair Lee?).
>
>2.       Aggregate scoring will result in a profile.  The proposal does
>not attempt to remove this concept but instead to ensure the result
>doesn’t include a user’s historical cross-site activity.  This should
>not be confused with de-identification and instead is simply another
>method to meet the goal of “not tracking”.
>
>- Shane
>
>From: Mike O'Neill [mailto:michael.oneill@baycloud.com]
>Sent: Wednesday, July 10, 2013 2:02 PM
>To: Shane Wiley
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>
>Subject: RE: issue-199
>
>Shane,
>
>As an example of why this “obfuscation” is pointless let it be a simple
>substitution cypher so my UID (which happens to be “123456”) is turned
>into “987654”. If I visit a website containing a reference to adco.com
>that server recognises me because the UID contains “123456” and builds
>up a profile about me. They apply the transform to the UID and always
>get the unique value “987654”, which is stored in the profiling
>dataset. When I visit other websites that also contain references to
>adco.com the same process is repeated and my web activity is appended
>to the dataset, again using “987654” as a key.
>
>It makes no difference how complex the UID transformation is, as long
>as it is 1-to-1.
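
The 1-to-1 point can be illustrated with a short Python sketch. SHA-256 stands in here for an arbitrary deterministic transform; the function names and sample events are assumptions made for illustration and do not describe any real ad server.

```python
import hashlib
from collections import defaultdict

def obfuscate(uid: str) -> str:
    """Deterministic transform: the same UID always gives the same output."""
    return hashlib.sha256(uid.encode("utf-8")).hexdigest()

def build_profiles(events):
    """Group visited URLs under the transformed UID instead of the raw UID."""
    profiles = defaultdict(list)
    for uid, url in events:
        profiles[obfuscate(uid)].append(url)
    return dict(profiles)

# Hypothetical cross-site events as seen by one third-party server.
events = [
    ("123456", "news.example/politics"),
    ("777777", "news.example/sport"),
    ("123456", "shop.example/shoes"),
]

profiles = build_profiles(events)
key = obfuscate("123456")
```

Both visits by UID 123456 end up under the same transformed key, so the profile is as linkable as it was before the transform.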
>
>Under the “DAA proposal” rules there is absolutely no diminution of
>adco’s ability to profile me.
>
>If another party gets hold of the dataset they can also see my profile,
>though not my original UID. If further records are shared they can be
>connected to me by this other party because they have the same
>“987654” UID. They may not be able to connect records containing
>“123456” to me (unless they can crack the cypher or are given the key)
>but what would be the point? If they have access to those data records
>they can already profile me anyway.
>
>If activity data in the dataset, collected with my consent, contains
>other PII about me, such as my name, post code, website history etc. 
>they should obfuscate that, perhaps using one way hash functions or
>aggregated scoring algorithms. Since these datasets are a valuable
>corporate asset you would expect them to be doing that anyway, but in
>any case that is legally required in the EU.
>
>As the Snowden revelations have highlighted “operational and
>administrative controls” need to be closely monitored. In the case of
>security services this can be (has to be) through impeccable judicial
>process under democratic oversight. This would not be appropriate for
>commercial companies in a competitive environment, so transparent
>technical procedures are necessary.
>
>The “yellow” state should be recognisable to users and others through
>inspection of user agent data or web logs.
>
>Mike
>
>
>From: Shane Wiley [mailto:wileys@yahoo-inc.com]
>Sent: 10 July 2013 12:14
>To: Mike O'Neill
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>
>Subject: RE: issue-199
>
>Mike,
>
>I respectfully disagree.  Obfuscating the ID breaks the association
>with the actual user/device.  That said, I agree this has the risk of
>being reversed so a blend of technical, operational, and administrative
>controls must be brought to bear to keep this from occurring.
>
>De-identification doesn’t allow for profiling in a manner that could
>affect a user’s experience (no way to get back to the user).
>
>Do Not Track can be achieved by breaking the link between a unique ID
>and cross-site activity (URLs) – and this could result in a profile of
>the user’s interest resulting from aggregate scoring – but this would
>not allow a user’s historical activity to be retrieved.
>
>- Shane
>
>From: Mike O'Neill [mailto:michael.oneill@baycloud.com]
>Sent: Wednesday, July 10, 2013 11:55 AM
>To: Shane Wiley
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>
>Subject: RE: issue-199
>
>Hi Shane,
>
>How can it be possible to remove the association between a device and a
>UID other than deleting it or ensuring it is deleted by the UA after a
>short duration? If the UID is there (and present in every transport
>level request if it is in a cookie) it uniquely points to the device
>where it is stored or derived. This identity is available to the
>receiving server as well as any actor with similar access to the data
>stream or the same document origin.
>
>If you transform the UID in retained data by setting it to another UID
>(say by using a hash function), this does not break the association
>because there is a 1-to-1 mapping. There is no practical point in doing
>it.
>
>De-identified data can only be classed as such if there is no linkage.
>The “yellow” state can be imagined as an intermediate stage before
>de-identification but is only relevant for permitted uses (such as the
>detection of unique visitors for analytics or frequency capping), and
>there is no need for it to exist for more than a few hours.
>
>If we end up defining de-identified as including the ability to link
>individuals to a profile it would be a travesty, and people will see
>through it. The arms race has already started with an explosion of
>blunt cookie and script blockers. If there is not a sensible response
>to people’s real privacy concerns the usefulness of the web (and
>consequently the profitability of many business models) will be
>severely diminished.
>
>Mike
>
>
>From: Shane Wiley [mailto:wileys@yahoo-inc.com]
>Sent: 09 July 2013 19:30
>To: Mike O'Neill; 'achapell'; npdoty@w3.org<mailto:npdoty@w3.org>;
>tlr@w3.org<mailto:tlr@w3.org>
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>;
>jeff@democraticmedia.org<mailto:jeff@democraticmedia.org>
>Subject: RE: issue-199
>
>Mike,
>
>Deidentification is about removing the association between a unique ID
>(any source:  cookie, digital fingerprint, etc.) and the
>actual/specific user/device.  In this context:
>
>Red:  actual user/device
>Yellow:  not actual user/device but events are linkable (and only
>usable for analytics/reporting)
>Green:  not actual user/device and events are not linkable (outside the
>scope of DNT)
>
>- Shane
>
>From: Mike O'Neill [mailto:michael.oneill@baycloud.com]
>Sent: Sunday, June 30, 2013 3:01 PM
>To: 'achapell'; npdoty@w3.org<mailto:npdoty@w3.org>;
>tlr@w3.org<mailto:tlr@w3.org>
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>;
>jeff@democraticmedia.org<mailto:jeff@democraticmedia.org>
>Subject: RE: issue-199
>
>Alan,
>
>Persistent identifiers and their duration should be discussed as part
>of the red/yellow/green permitted use debate. Browser fingerprinting
>identifiers are qualitatively different from those stored in cookies or
>localStorage because they are effectively infinite in duration, so I
>thought it best to extend the defs. to make that clear.
>
>
>Mike
>
>
>From: achapell [mailto:achapell@chapellassociates.com]
>Sent: 30 June 2013 22:39
>To: michael.oneill@baycloud.com<mailto:michael.oneill@baycloud.com>;
>npdoty@w3.org<mailto:npdoty@w3.org>; tlr@w3.org<mailto:tlr@w3.org>
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>;
>jeff@democraticmedia.org<mailto:jeff@democraticmedia.org>
>Subject: RE: issue-199
>
>Do we want to specify technologies here?
>
>
>Cheers,
>
>Alan Chapell
>917 318 8440
>
>
>
>-------- Original message --------
>From: Mike O'Neill
><michael.oneill@baycloud.com<mailto:michael.oneill@baycloud.com>>
>Date: 06/30/2013 3:33 PM (GMT-05:00)
>To: Nicholas Doty <npdoty@w3.org<mailto:npdoty@w3.org>>,tlr@w3.org
>Cc: public-tracking@w3.org, jeff@democraticmedia.org
>Subject: issue-199
>
>Nick, Thomas
>
>Dr Dix’s letter reminded me that we need to have some reference to
>browser fingerprinting being ruled out when DNT is set. I have amended
>the definitions accordingly.
>
>Do you want me to modify the wiki?
>
>
>
>A persistent identifier is an arbitrary value held in, or derived from
>other data in, the user agent whose purpose is to identify the user
>agent in subsequent transactions to a particular web domain. It may be
>encoded for example as the name or value attribute of an HTTP cookie,
>as an item in localStorage or recorded in some way in the cache.
>
>The duration of a persistent identifier is the maximum period of time
>it will be retained in the user agent. This could be implemented for
>example using the Expires or Max-Age attributes of an HTTP cookie so
>that it is automatically deleted by the user agent after the specified
>time period is exceeded.
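
As a sketch of the duration idea, the following Python uses the standard library's `http.cookies` module to build a cookie whose `Max-Age` bounds how long the user agent keeps the identifier. The cookie name `uid`, the helper name and the six-hour duration are assumptions chosen for illustration.

```python
from http.cookies import SimpleCookie

def uid_cookie_header(uid: str, duration_seconds: int) -> str:
    """Build a Set-Cookie value whose Max-Age limits the identifier's duration."""
    cookie = SimpleCookie()
    cookie["uid"] = uid
    # The user agent deletes the cookie once this many seconds have passed.
    cookie["uid"]["max-age"] = duration_seconds
    cookie["uid"]["path"] = "/"
    return cookie["uid"].OutputString()

# A hypothetical six-hour duration.
header = uid_cookie_header("123456", 6 * 60 * 60)
```

Because the limit is carried in the response header, it is visible to anyone inspecting the user agent or the traffic.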
>
>Browser fingerprinting is a method of tracking based on creating a
>persistent identifier from other information either inherent in the
>content request or already stored in the user agent. Such an identifier
>may not need itself to be stored in the user-agent as it can be
>calculated again in subsequent transactions. It follows from this that
>its duration is effectively unlimited.
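
A minimal Python sketch of why such an identifier needs no storage: it is recomputed from values sent with every request. The particular headers used, the function name and the sample values are assumptions; real fingerprinting techniques typically draw on many more signals.

```python
import hashlib

def fingerprint(headers: dict, ip: str) -> str:
    """Derive an identifier from request data; nothing is stored client-side."""
    parts = [
        ip,
        headers.get("User-Agent", ""),
        headers.get("Accept-Language", ""),
        headers.get("Accept-Encoding", ""),
    ]
    return hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()[:16]

# Hypothetical request data.
request_headers = {
    "User-Agent": "ExampleBrowser/1.0",
    "Accept-Language": "en-GB",
    "Accept-Encoding": "gzip",
}

first_visit = fingerprint(request_headers, "192.0.2.1")
later_visit = fingerprint(request_headers, "192.0.2.1")
other_agent = fingerprint({**request_headers, "User-Agent": "OtherBrowser/2.0"}, "192.0.2.1")
```

The same inputs give the same identifier on every visit, so there is no expiry that the user agent could enforce.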
>
>Justification.
>
>With the duration definition, restrictions on permitted uses could then
>be made that limit the duration of persistent identifiers. Because
>browser fingerprinting cannot be given a finite duration this tracking
>method should not be used when DNT is set even if it is for a permitted
>use. In reality, browser fingerprinting based solely on examining
>initial content requests is usually not an effective tracking method
>because the combination of IP addresses and other headers is not
>sufficiently user-specific, but we should rule out at least the more
>complex form when DNT is set.
>
>Mike

Received on Wednesday, 10 July 2013 15:13:39 UTC