Re: Proposed friendly amendments to industry draft

On Jul 9, 2013, at 19:00 , Shane Wiley <wileys@yahoo-inc.com> wrote:

> David,
> 
> Small correction:  Green is the "final" state - not Red.

whoops, sorry, stupid mistake on my part.  term B should be green!!

> 
> In the industry proposal:  Red = raw, Yellow = de-identified but event linkable, Green = de-identified and un-linkable
> 
> The term de-identified has been used for many different purposes hence the issue we're having with some people falling back on uses they may have seen in other contexts and therefore having concerns.  If we stick to our own definitions and how those are leveraged within this standard, I believe we'll have less issue here.

OK, I am all for consistency and clarity.


> 
> - Shane
> 
> -----Original Message-----
> From: David Singer [mailto:singer@apple.com] 
> Sent: Tuesday, July 09, 2013 10:49 AM
> To: Shane Wiley
> Cc: Rob van Eijk; public-tracking@w3.org WG
> Subject: Re: Proposed friendly amendments to industry draft
> 
> 
> On Jul 9, 2013, at 18:18 , Shane Wiley <wileys@yahoo-inc.com> wrote:
> 
>> I disagree with this naming change as much of the data in the "red" zone may also be considered to be "pseudonymized".  What is critical to this conversation are definitions associated with the terms being used.
>> 
>> If the definition of IDENTIFICATION is: an act of identifying : the state of being identified -OR- b : evidence of identity (Merriam-Webster), then de-identification would be the opposite of this.  Or plainly - removing "evidence of identity".  While there are many ways to remove evidence of identity, I'll continue to argue the removal of operational "linkability" from identifiers meets this definition as well (as the "evidence" of the actual user/device identity has been removed).
>> 
>> Red State:  Data is fully identifiable (Limited Permitted Uses only - retention rates should be short)
>> Yellow State:  Data is de-identified but linkable (Permitted Uses only - singular utility is analytics)
>> Green State:  Data is de-identified and de-linked (any use)
>> 
>> When you further layer these concepts into the definition of TRACKING, basically the pairing of a unique ID with non-affiliated site URLs, you create the foundation for the presentation I distributed to the group 2 weeks ago.
>> 
>> We're disagreeing on the term "de-identification" I believe more because some are still attached to the notion that de-identified data in and of itself is outside the scope of DNT.  This is incorrect in the new construct and only the combination of de-identification with de-linking reaches the bar of moving outside the scope of DNT.
>> 
>> I hope this is clearer.  For those that don't agree with this use of de-identification, could you please articulate what real-world use or loophole you feel this creates?  If we've appropriately contained the collection and use of data in the standard, then I'm not seeing a way to game the system (though I believe you see something here that I don't).
>> 
>> Thank you,
>> Shane
> 
> I think that the point of my remark is that I am mostly concerned with data that is truly not associated with a person (their UA or device).  That's the only data that is out of scope in my mind.
> 
> My perception is that the rest of the world uses "de-identified" to mean this.  Maybe I am wrong.
> 
> I am fine with a best practices document saying that data that is NOT this strongly de-identified should have its content reduced and its identifiability weakened as much as possible, which I think is your yellow state.
> 
> What I don't want is to have a requirement in the document that data be de-identified to be out of scope, when we re-define de-identified to be merely your yellow state.
> 
> So, in summary:
> 
> term A, your yellow:  data that has been minimized and pseudonymized so it's harder to re-identify
> term B, your red: data that truly no longer can be connected to anyone or their UA or device
> 
> The spec must require B for data to be out of scope.
> 
> I think I would prefer A: pseudonymized, B: de-identified
> 
> I think you have A: de-identified, B: de-linked
> 
> 
> 
> 
> 
>> 
>> 
>> From: Rob van Eijk [mailto:rob@blaeu.com]
>> Sent: Tuesday, July 09, 2013 9:51 AM
>> To: David Singer; public-tracking@w3.org WG
>> Subject: Re: Proposed friendly amendments to industry draft
>> 
>> 
>> David,
>> I support the proposed change of wording.
>> 
>> s/de-identified/pseudonymized/
>> AND
>> s/de-linked/de-identified/
>> 
>> Rob
>> 
>> 
>> 
>> David Singer <singer@apple.com> wrote:
>> 
>> On Jul 9, 2013, at 17:18 , Rob van Eijk <rob@blaeu.com> wrote:
>> 
>> I am considering formally objecting to the term de-identified in the DAA proposal.
>> 
>> The reasoning is that it has been used as a synonym for 'the data is not about a person anymore'. We need another word. 
>> 
>> or we need to use de-identified in the way that it is commonly used?  do we need more than one term?
>> 
>> If we do, I'd rather use a new term for data that is identifiable but that takes some work (or access to keys) to be so, such as pseudonymized.
>> 
>> So, in the DAA text, I'd change:
>> 
>> de-identified (where it is defined) to pseudonymized
>> de-linked (where it is defined) to de-identified
>> 
>> and leave the requirement that data must be de-identified (in the strong sense) to be out of scope.
>> 
>> I am proposing to simply use the term linkable.
>> 
>> Rob
>> 
>> 
>> "Israel, Susan" <Susan_Israel@Comcast.com> wrote:
>> his document and how they may be used elsewhere, it may help to introduce the definitions by saying, "For purposes of this specification, ...." 
>> 
>> Substantive:  To clarify one of the differences between the de-identified and de-linked categories as I understand them, it may be helpful to add language that indicates that the de-identified category permits reliance on operational controls in addition to technical controls, which I believe is consistent with the ideas Thomas Schauf presented.  
>> 
>> Thus, the definition would read, "Data is de-identified when a party
>> 
>> 1. has taken reasonable steps to ensure that the data cannot be reasonably re-associated or connected to a specific user, computer, or device without the use of additional data that is subject to separate and distinct technical and organizational controls to ensure such non-attribution, or when such attribution would require a disproportionate amount of time, expense and effort; ...." 
>> 
>> 
>> I also support adding the audience measurement language that has been discussed and revised with  several participants and submitted by Esomar to the permitted uses section, 5.2. 
>> 
>> 
>> 
>> 
>> Susan Israel
>> Comcast Cable
>> 215.286.3239
>> 215.767.3926 mobile
>> 917.934.1044 NY
>> susan_israel@comcast.com
>> 
>> This message and any attachments to it may contain PRIVILEGED AND CONFIDENTIAL ATTORNEY-CLIENT INFORMATION AND/OR ATTORNEY WORK PRODUCT exclusively for intended recipients. Please DO NOT FORWARD OR DISTRIBUTE to anyone else. If you are not an intended recipient, please contact the sender to report the error and then delete all copies of this message from your system.
>> 
>> 
>> 
>> 
>> 
>> David Singer
>> Multimedia and Software Standards, Apple Inc.
>> 
> 
> David Singer
> Multimedia and Software Standards, Apple Inc.
> 

David Singer
Multimedia and Software Standards, Apple Inc.

Received on Tuesday, 9 July 2013 18:42:35 UTC