RE: Proposed friendly amendments to industry draft

Rob,

We can look at a considerable amount of HIPAA text and other privacy process text that offers differing definitions of de-identified.  And while the A29WP has proposed a definition of pseudonymous, there are many that predate it (an MSFT white paper on this topic in 2008, I believe).  Yahoo!'s definition is more aligned with theirs and would fall in the “Red” data category in the current R-Y-G conceptual framework.

I again ask that we not try to select a single historical definition as being the definitive definition for all time – and instead focus on definitions that make sense for this standard and use them rigorously in that context.  If a previous definition works for us – great!  But let’s not be blindly bound to definitions that were developed for a different context and didn’t envision all that we are discussing in this working group.

- Shane

From: Rob van Eijk [mailto:rob@blaeu.com]
Sent: Tuesday, July 09, 2013 8:21 PM
To: Shane Wiley; David Singer
Cc: public-tracking@w3.org WG
Subject: RE: Proposed friendly amendments to industry draft


Shane,

If we stick to common definitions, we are not confusing the rest of the world. The NAI and the FTC have defined de-identified very clearly. Also, WP29 has defined pseudonymous data to be data about a person.

Your strategy looks similar to, if not the same as, the strategy outlined in: https://github.com/lobbyplag/lobbyplag-data/raw/master/raw/lobby-documents/Yahoo%20on%20Pseudonymous%20Data.pdf


If we want a strong DNT, one that is meaningful, we should make the standard in line with common definitions, instead of confusing the rest of the world.

Rob

Shane Wiley <wileys@yahoo-inc.com> wrote:

David,

Small correction:  Green is the "final" state - not Red.

In the industry proposal:  Red = raw, Yellow = de-identified but event linkable, Green = de-identified and un-linkable

The term de-identified has been used for many different purposes, hence the issue we're having: some people fall back on uses they may have seen in other contexts and therefore have concerns.  If we stick to our own definitions and how those are leveraged within this standard, I believe we'll have fewer issues here.

- Shane

-----Original Message-----
From: David Singer [mailto:singer@apple.com]
Sent: Tuesday, July 09, 2013 10:49 AM
To: Shane Wiley
Cc: Rob van Eijk; public-tracking@w3.org WG
Subject: Re: Proposed friendly amendments to industry draft


On Jul 9, 2013, at 18:18, Shane Wiley <wileys@yahoo-inc.com> wrote:

I disagree with this naming change as much of the data in the "red" zone may also be considered to be "pseudonymized".  What is critical to this conversation are definitions associated with the terms being used.

If the definition of IDENTIFICATION is: a : an act of identifying : the state of being identified -OR- b : evidence of identity (Merriam-Webster), then de-identification would be the opposite of this.  Or plainly - removing "evidence of identity".  While there are many ways to remove evidence of identity, I'll continue to argue that the removal of operational "linkability" from identifiers meets this definition as well (as the "evidence" of the actual user/device identity has been removed).

Red State:  Data is fully identifiable (Limited Permitted Uses only - retention rates should be short)
Yellow State:  Data is de-identified but linkable (Permitted Uses only - singular utility is analytics)
Green State:  Data is de-identified and de-linked (any use)

When you further layer these concepts into the definition of TRACKING, basically the pairing of a unique ID with non-affiliated site URLs, you create the foundation for the presentation I distributed to the group 2 weeks ago.

We're disagreeing on the term "de-identification", I believe, more because some are still attached to the notion that de-identified data in and of itself is outside the scope of DNT.  This is incorrect in the new construct: only the combination of de-identification with de-linking reaches the bar of moving outside the scope of DNT.

I hope this is clearer.  For those that don't agree with this use of de-identification, could you please articulate what real-world use or loophole you feel this creates?  If we've appropriately contained the collection and use of data in the standard, then I'm not seeing a way to game the system (though I believe you somehow see something here that I don't).

Thank you,
Shane

I think that the point of my remark is that I am mostly concerned with data that is truly not associated with a person (their UA or device).  That's the only data that is out of scope in my mind.

My perception is that the rest of the world uses "de-identified" to mean this.  Maybe I am wrong.

I am fine with a best practices document saying that data that is NOT this strongly de-identified should have its content reduced and its identifiability weakened as much as possible, which I think is your yellow state.

What I don't want is to have a requirement in the document that data be de-identified to be out of scope, when we re-define de-identified to be merely your yellow state.

So, in summary:

term A, your yellow:  data that has been minimized and pseudonymized so it's harder to re-identify
term B, your red: data that truly no longer can be connected to anyone or their UA or device

The spec must require B for data to be out of scope.

I think I would prefer A: pseudonymized, B: de-identified

I think you have A: de-identified, B: de-linked







From: Rob van Eijk [mailto:rob@blaeu.com]
Sent: Tuesday, July 09, 2013 9:51 AM
To: David Singer; public-tracking@w3.org WG
Subject: Re: Proposed friendly amendments to industry draft


David,
I support the proposed change of wording.

s/de-identified/pseudonymized/
AND
s/de-linked/de-identified/

Rob



David Singer <singer@apple.com> wrote:

On Jul 9, 2013, at 17:18, Rob van Eijk <rob@blaeu.com> wrote:

I am considering formally objecting to the term de-identified in the DAA proposal.

The reasoning is that it has been used as a synonym for 'the data is not about a person anymore'. We need another word.

or we need to use de-identified in the way that it is commonly used?  do we need more than one term?

If we do, I'd rather use a new term for data that is identifiable but that takes some work (or access to keys) to be so, such as pseudonymized.

So, in the DAA text, I'd change:

de-identified (where it is defined) to pseudonymized
de-linked (where it is defined) to de-identified

and leave the requirement that data must be de-identified (in the strong sense) to be out of scope.

I am proposing to simply use the term linkable.

Rob


"Israel, Susan" <Susan_Israel@Comcast.com> wrote:
...this document and how they may be used elsewhere, it may help to introduce the definitions by saying, "For purposes of this specification, ...."

Substantive:  To clarify one of the differences between the de-identified and de-linked categories as I understand them, it may be helpful to add language that indicates that the de-identified category permits reliance on operational controls in addition to technical controls, which I believe is consistent with the ideas Thomas Schauf presented.

Thus, the definition would read, "Data is de-identified when a party

1. has taken reasonable steps to ensure that the data cannot be reasonably re-associated or connected to a specific user, computer, or device without the use of additional data that is subject to separate and distinct technical and organizational controls to ensure such non-attribution, or when such attribution would require a disproportionate amount of time, expense and effort; ...."

I also support adding the audience measurement language that has been discussed and revised with several participants and submitted by Esomar to the permitted uses section, 5.2.




Susan Israel
Comcast Cable
215.286.3239
215.767.3926 mobile
917.934.1044 NY
susan_israel@comcast.com

This message and any attachments to it may contain PRIVILEGED AND CONFIDENTIAL ATTORNEY-CLIENT INFORMATION AND/OR ATTORNEY WORK PRODUCT exclusively for intended recipients. Please DO NOT FORWARD OR DISTRIBUTE to anyone else. If you are not an intended recipient, please contact the sender to report the error and then delete all copies of this message from your system.





David Singer
Multimedia and Software Standards, Apple Inc.



Received on Wednesday, 10 July 2013 07:38:51 UTC