- From: Rob van Eijk <rob@blaeu.com>
- Date: Wed, 10 Jul 2013 17:13:00 +0200
- To: Shane Wiley <wileys@yahoo-inc.com>, "Mike O'Neill" <michael.oneill@baycloud.com>
- CC: "public-tracking@w3.org" <public-tracking@w3.org>
- Message-ID: <81edd473-ec05-4416-8688-3fc55c6e29fe@email.android.com>
Your reasoning, as expressed by you in Santa Clara is: if cookie blocking remains outside of DNT, then aggregated scoring remains in the proposal. To me this reads as: aggregated scoring is out of scope for DNT. Not even a permitted use, but out of scope. This is a concern. The thing is, Peter may want to work from the DAA proposal as a baseline, but currently I see no room for improving towards a privacy friendly DNT if the proposal remains a package. So the clarifying question is: how much of a package is on the table, is there any room to address the key issues/elephants in the room that are on the wiki?

Rob

Shane Wiley <wileys@yahoo-inc.com> wrote:

>Mike,
>
>I support verifiability but am challenged with technical mechanisms to
>allow this without breaking corporate confidentiality concerns. This
>is why I call it out as an area for future development to help build
>solutions to this unique problem.
>
>I’ve tried breaking the proposal down to the simplest form I can think
>of. Let me know if this makes it more clear:
>
>-----
>If Tracking = ID + URLs, then Not Tracking = ID <> URL
>
>Keep ID, Remove URL = Aggregate Scoring
>Remove ID, Keep URL = De-Identification
>
>Remove ID, Remove URL = De-Identification + De-Linking (now out of
>scope of DNT)
>-----
>
>- Shane
>
>From: Mike O'Neill [mailto:michael.oneill@baycloud.com]
>Sent: Wednesday, July 10, 2013 3:10 PM
>To: Shane Wiley
>Cc: public-tracking@w3.org
>Subject: RE: issue-199
>
>Shane,
>
>I have not missed key points, and know the DAA proposals mean continued
>profiling, just think that needs to be made clear. Perhaps you could
>give an example where applying a hash to a UID would be useful.
>
>There is not much difference between the retention of a profile based
>on algorithmically examining a web history and the actual web history
>itself. Both can be a basis for discrimination.
>
>My point about verifiability is that without it, with only
>administrative and operational controls, there will inevitably be
>demands for intrusive regulation, which will not be good for industry.
>Verifiability is in fact quite easy to ensure if tracking is
>constrained to cookies or even localStorage, and that is all the more
>reason to rule out tracking by other means such as fingerprinting.
>
>Mike
>
>
>From: Shane Wiley [mailto:wileys@yahoo-inc.com]
>Sent: 10 July 2013 14:36
>To: Mike O'Neill
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>
>Subject: RE: issue-199
>
>Mike,
>
>Perhaps you’ve not been on the calls as I believe you’ve missed a few
>of the key points of this discussion. I won’t be able to provide a
>full recount via email but I’ll try to hit the high points for you:
>
>
>1. It’s understood obfuscation comes with some risk and will need
>to be bundled with operational and administrative controls to reach a
>reasonable confidence that data will not be reverse engineered. For
>example, data in the yellow state is not shared publicly and/or with
>parties that you don’t feel could protect the security of its
>composition. While we’ve agreed on transparency in this area – no one
>has requested external verifiability to date which I believe would be
>somewhat impossible as a starting point. Perhaps something to work on
>as a future goal (I believe the EFF would also be interested in
>innovating techniques in this area – is that fair Lee?).
>
>2. Aggregate scoring will result in a profile. The proposal does
>not attempt to remove this concept but instead to ensure the result
>doesn’t include a user’s historical cross-site activity. This should
>not be confused with de-identification and instead is simply another
>method to meet the goal of “not tracking”.
>
>- Shane
>
>From: Mike O'Neill [mailto:michael.oneill@baycloud.com]
>Sent: Wednesday, July 10, 2013 2:02 PM
>To: Shane Wiley
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>
>Subject: RE: issue-199
>
>Shane,
>
>As an example of why this “obfuscation” is pointless let it be a simple
>substitution cypher so my UID (which happens to be “123456”) is turned
>into “987654”. If I visit a website containing a reference to adco.com
>that server recognises me because the UID contains “123456” and builds
>up a profile about me. They apply the transform to the UID and always
>get the unique value “987654”, which is stored in the profiling
>dataset. When I visit other websites that also contain references to
>adco.com the same process is repeated and my web activity is appended
>to the dataset, again using “987654” as a key.
>
>It makes no difference how complex the UID transformation is, as long
>as it is 1to1.
>
>Under the “DAA proposal” rules there is absolutely no diminution of
>adco’s ability to profile me.
>
>If another party gets hold of the dataset they can also see my profile,
>though not my original UID. If further records are shared they can be
>connected to me by this other party because they have the same
>“987654” UID. They may not be able to connect records containing
>“123456” to me (unless they can crack the cypher or are given the key)
>but what would be the point? If they have access to those data records
>they can already profile me anyway.
>
>If activity data in the dataset, collected with my consent, contains
>other PII about me, such as my name, post code, website history etc.
>they should obfuscate that, perhaps using one way hash functions or
>aggregated scoring algorithms. Since these datasets are a valuable
>corporate asset you would expect them to be doing that anyway, but in
>any case that is legally required in the EU.
>
>As the Snowden revelations have highlighted “operational and
>administrative controls” need to be closely monitored. In the case of
>security services this can be (has to be) through impeccable judicial
>process under democratic oversight. This would not be appropriate for
>commercial companies in a competitive environment, so transparent
>technical procedures are necessary.
>
>The “yellow” state should be recognisable to users and others through
>inspection of user agent data or web logs.
>
>Mike
>
>
>From: Shane Wiley [mailto:wileys@yahoo-inc.com]
>Sent: 10 July 2013 12:14
>To: Mike O'Neill
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>
>Subject: RE: issue-199
>
>Mike,
>
>I respectfully disagree. Obfuscating the ID breaks the association
>with the actual user/device. That said, I agree this has the risk of
>being reversed so a blend of technical, operational, and administrative
>controls must be brought to bear to keep this from occurring.
>
>De-identification doesn’t allow for profiling in a manner that could
>affect a user’s experience (no way to get back to the user).
>
>Do Not Track can be achieved by breaking the link between a unique ID
>and cross-site activity (URLs) – and this could result in a profile of
>the user’s interest resulting from aggregate scoring – but this would
>not allow a user’s historical activity to be retrieved.
>
>- Shane
>
>From: Mike O'Neill [mailto:michael.oneill@baycloud.com]
>Sent: Wednesday, July 10, 2013 11:55 AM
>To: Shane Wiley
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>
>Subject: RE: issue-199
>
>Hi Shane,
>
>How can it be possible to remove the association between a device and a
>UID other than deleting it or ensuring it is deleted by the UA after a
>short duration? If the UID is there (and present in every transport
>level request if it is in a cookie) it uniquely points to the device
>where it is stored or derived. This identity is available to the
>receiving server as well as any actor with similar access to the data
>stream or the same document origin.
>
>If you transform the UID in retained data by setting it to another UID
>(say by using a hash function), this does not break the association
>because there is a 1to1 mapping. There is no practical point in doing
>it.
>
>De-identified data can only be classed as such if there is no linkage.
>The “yellow” state can be imagined as an intermediate stage before
>de-identification but is only relevant for permitted uses (such as the
>detection of unique visitors for analytics or frequency capping), and
>there is no need for it to exist for more than a few hours.
>
>If we end up defining de-identified as including the ability to link
>individuals to a profile it would be a travesty, and people will see
>through it. The arms race has already started with an explosion of
>blunt cookie and script blockers. If there is not a sensible response
>to people’s real privacy concerns the usefulness of the web (and
>consequently the profitability of many business models) will be
>severely diminished.
>
>Mike
>
>
>From: Shane Wiley [mailto:wileys@yahoo-inc.com]
>Sent: 09 July 2013 19:30
>To: Mike O'Neill; 'achapell'; npdoty@w3.org<mailto:npdoty@w3.org>;
>tlr@w3.org<mailto:tlr@w3.org>
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>;
>jeff@democraticmedia.org<mailto:jeff@democraticmedia.org>
>Subject: RE: issue-199
>
>Mike,
>
>Deidentification is about removing the association between a unique ID
>(any source: cookie, digital fingerprint, etc.) and the
>actual/specific user/device. In this context:
>
>Red: actual user/device
>Yellow: not actual user/device but events are linkable (and only
>usable for analytics/reporting)
>Green: not actual user/device and events are not linkable (outside the
>scope of DNT)
>
>- Shane
>
>From: Mike O'Neill [mailto:michael.oneill@baycloud.com]
>Sent: Sunday, June 30, 2013 3:01 PM
>To: 'achapell'; npdoty@w3.org<mailto:npdoty@w3.org>;
>tlr@w3.org<mailto:tlr@w3.org>
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>;
>jeff@democraticmedia.org<mailto:jeff@democraticmedia.org>
>Subject: RE: issue-199
>
>Alan,
>
>Persistent identifiers and their duration should be discussed as part
>of the red/yellow/green permitted use debate. Browser fingerprinting
>identifiers are qualitatively different from those stored in cookies or
>localStorage because they are effectively infinite in duration, so I
>thought it best to extend the defs. to make that clear.
>
>
>Mike
>
>
>From: achapell [mailto:achapell@chapellassociates.com]
>Sent: 30 June 2013 22:39
>To: michael.oneill@baycloud.com<mailto:michael.oneill@baycloud.com>;
>npdoty@w3.org<mailto:npdoty@w3.org>; tlr@w3.org<mailto:tlr@w3.org>
>Cc: public-tracking@w3.org<mailto:public-tracking@w3.org>;
>jeff@democraticmedia.org<mailto:jeff@democraticmedia.org>
>Subject: RE: issue-199
>
>Do we want to specify technologies here?
>
>
>Cheers,
>
>Alan Chapell
>917 318 8440
>
>
>
>-------- Original message --------
>From: Mike O'Neill
><michael.oneill@baycloud.com<mailto:michael.oneill@baycloud.com>>
>Date: 06/30/2013 3:33 PM (GMT-05:00)
>To: Nicholas Doty <npdoty@w3.org<mailto:npdoty@w3.org>>,tlr@w3.org
>Cc:
>public-tracking@w3.org,jeff@democraticmedia.org<mailto:public-tracking@w3.org,jeff@democraticmedia.org>
>Subject: issue-199
>
>Nick, Thomas
>
>Dr Dix’s letter reminded me that we need to have some reference to
>browser fingerprinting being ruled out when DNT is set. I have amended
>the definitions accordingly.
>
>Do you want me to modify the wiki?
>
>
>
>A persistent identifier is an arbitrary value held in, or derived from
>other data in, the user agent whose purpose is to identify the user
>agent in subsequent transactions to a particular web domain. It may be
>encoded for example as the name or value attribute of an HTTP cookie,
>as an item in localStorage or recorded in some way in the cache.
>
>The duration of a persistent identifier is the maximum period of time
>it will be retained in the user agent. This could be implemented for
>example using the Expires or Max-Age attributes of an HTTP cookie so
>that it is automatically deleted by the user agent after the specified
>time period is exceeded.
>
>Browser fingerprinting is a method of tracking based on creating a
>persistent identifier from other information either inherent in the
>content request or already stored in the user agent. Such an identifier
>may not need itself to be stored in the user-agent as it can be
>calculated again in subsequent transactions. It follows from this that
>its duration is effectively unlimited.
>
>Justification.
>
>With the duration definition, restrictions on permitted uses could then
>be made that limit the duration of persistent identifiers. Because
>browser fingerprinting cannot be given a finite duration this tracking
>method should not be used when DNT is set even if it is for a permitted
>use. In reality browser fingerprinting solely based on examining
>initial content requests is usually not an effective tracking method
>because the combination of IP addresses and other headers are not
>sufficiently user specific, but we should rule out at least the more
>complex form when DNT is set.
>Mike
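
Two technical claims in the quoted thread can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the function names, the choice of SHA-256 and HMAC, the sample header values, the key, and the example.com URLs are all hypothetical and do not come from any proposal discussed here. The first function shows why a fingerprint-style identifier has no natural duration (it is recomputed from the request, so nothing needs to be stored in the user agent). The second shows why a deterministic, effectively 1to1 transformation of a UID leaves records just as linkable as before.

```python
import hashlib
import hmac
from typing import Dict, Iterable, List, Tuple


def fingerprint(ip: str, headers: Dict[str, str]) -> str:
    """Derive an identifier from request data alone (hypothetical scheme).

    Nothing is stored in the user agent: the same inputs always give the
    same value, so the identifier can be recalculated on every request.
    """
    material = "|".join([ip] + [f"{name}={headers[name]}" for name in sorted(headers)])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]


def obfuscate(uid: str, key: bytes) -> str:
    """Keyed one-way transform of a UID (assumed here to stand in for 'hashing').

    It is deterministic, so each UID always maps to the same output value.
    """
    return hmac.new(key, uid.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def build_profiles(events: Iterable[Tuple[str, str]], key: bytes) -> Dict[str, List[str]]:
    """Group visited URLs under the obfuscated UID instead of the raw UID."""
    profiles: Dict[str, List[str]] = {}
    for uid, url in events:
        profiles.setdefault(obfuscate(uid, key), []).append(url)
    return profiles


if __name__ == "__main__":
    # Hypothetical request data; a real fingerprint would use many more signals.
    request_headers = {"User-Agent": "ExampleBrowser/1.0", "Accept-Language": "en-GB"}
    print(fingerprint("192.0.2.1", request_headers))

    # Hypothetical events: (raw UID, URL where the third-party request was seen).
    sample_events = [
        ("123456", "https://news.example.com/a"),
        ("123456", "https://shop.example.com/b"),
        ("777777", "https://news.example.com/c"),
    ]
    for pseudonym, urls in build_profiles(sample_events, b"demo-key").items():
        print(pseudonym, urls)
```

In this sketch the raw UID "123456" never appears in the stored profiles, yet both of its URLs still end up under one key, which is the linkage the thread is arguing about.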
Received on Wednesday, 10 July 2013 15:13:39 UTC