- From: Edward W. Felten <felten@CS.Princeton.EDU>
- Date: Wed, 13 Mar 2013 13:21:41 -0400
- To: Justin Brookman <justin@cdt.org>
- Cc: "<public-tracking@w3.org>" <public-tracking@w3.org>
- Message-ID: <CANZBoGh_G4tc-Uc4dxZ4gKVVbng91G79eRjbhVu-PXrjde3yPw@mail.gmail.com>
But we should be equally clear that "de-identify" means more than just removing the most obvious identifiers from the data.

On Wed, Mar 13, 2013 at 1:07 PM, Justin Brookman <justin@cdt.org> wrote:

> Shane is right that we did choose to use "deidentified" instead of
> "unlinkable" at the Cambridge meeting. So I agree we probably should not
> use "unlinkable" to define "deidentified" in the standard. However, I
> don't see why we need to define "unlinkable" at all, as it has no
> operational meaning, and was rejected because it implied a technological
> impossibility of relinking, which is not a standard that can be reasonably
> achieved.
>
> Justin Brookman
> Director, Consumer Privacy
> Center for Democracy & Technology
> tel 202.407.8812
> justin@cdt.org
> http://www.cdt.org
> @JustinBrookman
> @CenDemTech
>
>
> On 3/13/2013 11:35 AM, Shane Wiley wrote:
>
>> Rob,
>>
>> So we're agreed unlinkability requires more processing than de-identified
>> - good. I would recommend we define de-identified (nearly done) and
>> unlinkability separately to clearly demonstrate they are different points
>> within a continuum. We can then focus on the discussion of retention of
>> data in its de-identified state prior to moving to the ultimate unlinkable
>> state.
>>
>> - Shane
>>
>> -----Original Message-----
>> From: Rob van Eijk [mailto:rob@blaeu.com]
>> Sent: Wednesday, March 13, 2013 8:28 AM
>> To: Shane Wiley
>> Cc: public-tracking@w3.org
>> Subject: RE: ACTION-371: text defining de-identified data
>>
>> Hi Shane,
>>
>> I hear you and understand your position. But unlinkable and de-identified
>> are not mutually exclusive. Unlinkable data is a subset of de-identified
>> data; they just go through another step of scrubbing.
>> Adding it to the list is not hurting your position.
>>
>> The key towards the middle ground remains data retention, which has to be
>> proportionate to the purpose.
>>
>> Rob
>>
>> Shane Wiley schreef op 2013-03-13 16:13:
>>
>>> Rob,
>>>
>>> I thought we had agreed to not mix the "unlinkable" term with
>>> "de-identified" here. In our discussions in Boston it appeared there
>>> was general agreement that unlinkability is a step beyond
>>> de-identified. Once a record has been rendered de-identified, it can
>>> later further be made unlinkable (using your definition of unlinkable
>>> vs. the one I proposed). This is a significant sticking point for
>>> those of us attempting to find middle-ground here so hopefully we can
>>> document the details in non-normative text but I'd ask that we remove
>>> mention of unlinkable in the definition of de-identified at this time
>>> (or else we've not really moved forward in this discussion in my
>>> opinion).
>>>
>>> - Shane
>>>
>>> -----Original Message-----
>>> From: Rob van Eijk [mailto:rob@blaeu.com]
>>> Sent: Wednesday, March 13, 2013 5:57 AM
>>> To: public-tracking@w3.org
>>> Subject: RE: ACTION-371: text defining de-identified data
>>>
>>> Dan, Kevin,
>>>
>>> I would really want the unlinkability in there as well. I propose to
>>> add the text: made unlinkable
>>>
>>> Normative text: Data can be considered sufficiently de-identified to
>>> the extent that it has been deleted, made unlinkable, modified,
>>> aggregated, anonymized or otherwise manipulated in order to achieve a
>>> reasonable level of justified confidence that the data cannot
>>> reasonably be used to infer information about, or otherwise be linked
>>> to, a particular user, user agent, computer or device.
>>>
>>>
>>> In terms of privacy by design, de-identification through unlinkability
>>> is the strongest form of de-identification IMHO.
>>>
>>> Rob
>>>
>>> Kevin Kiley schreef op 2013-03-12 19:03:
>>>
>>>> Dan,
>>>>
>>>> In case I wasn't being clear in my last post, I (personally) believe
>>>> that User-agent should *NOT* be removed from the proposed text.
>>>>
>>>> I actually don't think it would do any harm to *ADD* the word
>>>> 'Computer' as well ( which is present in the current FTC definition )
>>>> so it reads like this…
>>>>
>>>> Normative text:
>>>>
>>>> Data can be considered sufficiently de-identified to the extent that
>>>> it has been deleted, modified, aggregated, anonymized or otherwise
>>>> manipulated in order to achieve a reasonable level of justified
>>>> confidence that the data cannot reasonably be used to infer
>>>> information about, or otherwise be linked to, a particular user,
>>>> user agent, computer or device.
>>>>
>>>> I think that covers it pretty well, and *NO* 'clarifying text' is
>>>> necessary.
>>>>
>>>> Just my 2 cents.
>>>>
>>>> Kevin Kiley
>>>>
>>>> Previous message(s)…
>>>>
>>>> Dan,
>>>>
>>>> Perhaps you can add text clarifying this perspective or, much like
>>>> the FTC, suffice with "device" which I believe more than covers what
>>>> you're looking for here.
>>>>
>>>> - Shane
>>>>
>>>> From: Dan Auerbach [mailto:dan@eff.org]
>>>> Sent: Tuesday, March 12, 2013 8:57 AM
>>>> To: public-tracking@w3.org
>>>> Subject: Re: ACTION-371: text defining de-identified data
>>>>
>>>> Shane and Kevin -- The phrase "user agent" in the text is intended to
>>>> refer to a particular user agent (not "Chrome 26" but rather "the
>>>> browser running on Dan's laptop"). I hoped that would be clear from
>>>> context, but if it's not we can clarify. I may not be able to
>>>> identify your device per se, but can identify that this is the same
>>>> browser as I saw before. I think this is the case with using cookies,
>>>> for example. It seems more accurate to me than lumping it all under
>>>> "device", and appropriate since the text of our document is elsewhere
>>>> focused on user agents, unlike the FTC text.
>>>>
>>>> Best,
>>>>
>>>> Dan
>>>>
>>>> On 03/12/2013 12:19 AM, Kevin Kiley wrote:
>>>>
>>>> Shane Wiley wrote...
>>>>
>>>>>> I had removed "user agent" in the suggested edit as this could be
>>>>>> something as generic as "Chrome 26".
>>>>
>>>> It can also be something VERY specific... and tell you a LOT about
>>>> the Computer/OS/Device being used.
>>>>
>>>> In the case of Mobile... it will pretty much tell you EXACTLY what
>>>> 'Device' is being used.
>>>>
>>>>>> The FTC likewise does not use "user agent" in their definition.
>>>>
>>>> That's true... but BOTH definitions (W3C and FTC) currently mention
>>>> 'Device'... and the FTC reports go to great lengths about how
>>>> important it is to exclude any knowledge of 'the Device' from the
>>>> de-identified data ( especially in the case of 'Mobile Devices' ).
>>>>
>>>> Kevin Kiley

--
Edward W. Felten
Professor of Computer Science and Public Affairs
Director, Center for Information Technology Policy
Princeton University
609-258-5906 http://www.cs.princeton.edu/~felten
Received on Wednesday, 13 March 2013 17:22:34 UTC