- From: Rob van Eijk <rob@blaeu.com>
- Date: Tue, 09 Jul 2013 21:20:38 +0200
- To: Shane Wiley <wileys@yahoo-inc.com>, David Singer <singer@apple.com>
- CC: "public-tracking@w3.org WG" <public-tracking@w3.org>
- Message-ID: <d19048ed-bca0-4aea-af01-02fb302bd307@email.android.com>
Shane,

If we stick to common definitions, we are not confusing the rest of the world. The NAI and the FTC have defined de-identified very clearly. Also, WP29 has defined pseudonymous data to be data about a person.

Your strategy looks similar to, if not the same as, the strategy outlined in:
https://github.com/lobbyplag/lobbyplag-data/raw/master/raw/lobby-documents/Yahoo%20on%20Pseudonymous%20Data.pdf

If we want a strong DNT, one that is meaningful, we should make the standard in line with common definitions, instead of confusing the rest of the world.

Rob

Shane Wiley <wileys@yahoo-inc.com> wrote:

>David,
>
>Small correction: Green is the "final" state - not Red.
>
>In the industry proposal: Red = raw, Yellow = de-identified but event linkable, Green = de-identified and un-linkable
>
>The term de-identified has been used for many different purposes, hence the issue we're having with some people falling back on uses they may have seen in other contexts and therefore having concerns. If we stick to our own definitions and how those are leveraged within this standard, I believe we'll have less issue here.
>
>- Shane
>
>-----Original Message-----
>From: David Singer [mailto:singer@apple.com]
>Sent: Tuesday, July 09, 2013 10:49 AM
>To: Shane Wiley
>Cc: Rob van Eijk; public-tracking@w3.org WG
>Subject: Re: Proposed friendly amendments to industry draft
>
>
>On Jul 9, 2013, at 18:18 , Shane Wiley <wileys@yahoo-inc.com> wrote:
>
>> I disagree with this naming change as much of the data in the "red" zone may also be considered to be "pseudonymized". What is critical to this conversation are definitions associated with the terms being used.
>>
>> If the definition of IDENTIFICATION is: an act of identifying : the state of being identified -OR- b : evidence of identity (Merriam-Webster), then deidentification would be the opposite of this. Or plainly - removing "evidence of identity".
>> While there are many ways to remove evidence of identity, I'll continue to argue the removal of operational "linkability" from identifiers meets this definition as well (as the "evidence" of the actual user/device identity has been removed).
>>
>> Red State: Data is fully identifiable (Limited Permitted Uses only - retention rates should be short)
>> Yellow State: Data is de-identified but linkable (Permitted Uses only - singular utility is analytics)
>> Green State: Data is de-identified and de-linked (any use)
>>
>> When you further layer these concepts into the definition of TRACKING, basically the pairing of a unique ID with non-affiliated site URLs, you create the foundation for the presentation I distributed to the group 2 weeks ago.
>>
>> We're disagreeing on the term "de-identification" I believe more because some are still attached to the notion the de-identified data in and of itself is outside the scope of DNT. This is incorrect in the new construct and only the combination of de-identification with de-linking reaches the bar of moving outside the scope of DNT.
>>
>> I hope this is clearer. For those that don't agree with this use of de-identification, could you please articulate what real-world use or loophole you feel this creates? If we've appropriately contained the collection and use of data in the standard, then I'm not seeing a way to game the system (which I believe you somehow see something here that I don't).
>>
>> Thank you,
>> Shane
>
>I think that the point of my remark is that I am mostly concerned with data that is truly not associated with a person (their UA or device). That's the only data that is out of scope in my mind.
>
>My perception is that the rest of the world uses "de-identified" to mean this. Maybe I am wrong.
>
>I am fine with a best practices document saying that data that is NOT this strongly de-identified should have its content reduced and its identifiability weakened as much as possible, which I think is your yellow state.
>
>What I don't want is to have a requirement in the document that data be de-identified to be out of scope, when we re-define de-identified to be merely your yellow state.
>
>So, in summary:
>
>term A, your yellow: data that has been minimized and pseudonymized so it's harder to re-identify
>term B, your red: data that truly no longer can be connected to anyone or their UA or device
>
>The spec must require B for data to be out of scope.
>
>I think I would prefer A: pseudonymized, B: de-identified
>
>I think you have A: de-identified, B: de-linked
>
>
>> From: Rob van Eijk [mailto:rob@blaeu.com]
>> Sent: Tuesday, July 09, 2013 9:51 AM
>> To: David Singer; public-tracking@w3.org WG
>> Subject: Re: Proposed friendly amendments to industry draft
>>
>> David,
>> I support the proposed change of wording.
>>
>> s/de-identified/pseudonymized/
>> AND
>> s/de-linked/de-identified/
>>
>> Rob
>>
>> David Singer <singer@apple.com> wrote:
>>
>> On Jul 9, 2013, at 17:18 , Rob van Eijk <rob@blaeu.com> wrote:
>>
>> I am considering to formally object to the term de-identified in the DAA proposal.
>>
>> The reasoning is that it has been used as synonym with 'the data it is not about a person anymore'. We need another word.
>>
>> or we need to use de-identified in the way that it is commonly used? do we need more than one term?
>>
>> If we do, I'd rather use a new term for data that is identifiable but that takes some work (or access to keys) to be so, such as pseudonymized.
>>
>> So, in the DAA text, I'd change:
>>
>> de-identified (where it is defined) to pseudonymized
>> de-linked (where it is defined) to de-identified
>>
>> and leave the requirement that data must be de-identified (in the strong sense) to be out of scope.
>>
>> I am proposing to simply use the term linkable.
>>
>> Rob
>>
>> "Israel, Susan" <Susan_Israel@Comcast.com> wrote:
>> his document and how they may be used elsewhere, it may help to introduce the definitions by saying, "For purposes of this specification, ...."
>>
>> Substantive: To clarify one of the differences between the de-identified and de-linked categories as I understand them, it may be helpful to add language that indicates that the de-identified category permits reliance on operational controls in addition to technical controls, which I believe is consistent with the ideas Thomas Schauf presented.
>>
>> Thus, the definition would read, "Data is de-identified when a party
>>
>> 1. has taken reasonable steps to ensure that the data cannot be reasonably re-associated or connected to a specific user, computer, or device without the use of additional data that is subject to separate and distinct technical and organizational controls to ensure such non-attribution, or when such attribution would require a disproportionate amount of time, expense and effort; ...."
>>
>> I also support adding the audience measurement language that has been discussed and revised with several participants and submitted by Esomar to the permitted uses section, 5.2.
>>
>> Susan Israel
>> Comcast Cable
>> 215.286.3239
>> 215.767.3926 mobile
>> 917.934.1044 NY
>> susan_israel@comcast.com
>>
>> This message and any attachments to it may contain PRIVILEGED AND CONFIDENTIAL ATTORNEY-CLIENT INFORMATION AND/OR ATTORNEY WORK PRODUCT exclusively for intended recipients. Please DO NOT FORWARD OR DISTRIBUTE to anyone else. If you are not an intended recipient, please contact the sender to report the error and then delete all copies of this message from your system.
>>
>> David Singer
>> Multimedia and Software Standards, Apple Inc.
>>
>
>David Singer
>Multimedia and Software Standards, Apple Inc.
Received on Tuesday, 9 July 2013 19:21:18 UTC