
Re: ACTION-371: text defining de-identified data

From: Justin Brookman <justin@cdt.org>
Date: Wed, 13 Mar 2013 13:07:56 -0400
Message-ID: <5140B26C.2040909@cdt.org>
To: public-tracking@w3.org
Shane is right that we did choose to use "deidentified" instead of 
"unlinkable" at the Cambridge meeting.  So I agree we probably should 
not use "unlinkable" to define "deidentified" in the standard.  However, 
I don't see why we need to define "unlinkable" at all, as it has no 
operational meaning, and was rejected because it implied a technological 
impossibility of relinking, which is not a standard that can be 
reasonably achieved.

Justin Brookman
Director, Consumer Privacy
Center for Democracy & Technology
tel 202.407.8812
justin@cdt.org
http://www.cdt.org
@JustinBrookman
@CenDemTech

On 3/13/2013 11:35 AM, Shane Wiley wrote:
> Rob,
>
> So we're agreed unlinkability requires more processing than de-identified - good.  I would recommend we define de-identified (nearly done) and unlinkability separately to clearly demonstrate they are different points within a continuum.  We can then focus on the discussion of retention of data in its de-identified state prior to moving to the ultimate unlinkable state.
>
> - Shane
>
> -----Original Message-----
> From: Rob van Eijk [mailto:rob@blaeu.com]
> Sent: Wednesday, March 13, 2013 8:28 AM
> To: Shane Wiley
> Cc: public-tracking@w3.org
> Subject: RE: ACTION-371: text defining de-identified data
>
> Hi Shane,
>
> I hear you and understand your position. But unlinkable and de-identified are not mutually exclusive. Unlinkable data is a subset of de-identified data; it just goes through another step of scrubbing.
> Adding it to the list is not hurting your position.
>
> The key towards the middle ground remains data retention, which has to be proportionate to the purpose.
>
> Rob
>
> Shane Wiley schreef op 2013-03-13 16:13:
>> Rob,
>>
>> I thought we had agreed to not mix the "unlinkable" term with
>> "de-identified" here.  In our discussions in Boston it appeared there
>> was general agreement that unlinkability is a step beyond
>> de-identified.  Once a record has been rendered de-identified, it can
>> later further be made unlinkable (using your definition of unlinkable
>> vs. the one I proposed).  This is a significant sticking point for
>> those of us attempting to find middle-ground here so hopefully we can
>> document the details in non-normative text but I'd ask that we remove
>> mention of unlinkable in the definition of de-identified at this time
>> (or else we've not really moved forward in this discussion in my
>> opinion).
>>
>> - Shane
>>
>> -----Original Message-----
>> From: Rob van Eijk [mailto:rob@blaeu.com]
>> Sent: Wednesday, March 13, 2013 5:57 AM
>> To: public-tracking@w3.org
>> Subject: RE: ACTION-371: text defining de-identified data
>>
>> Dan, Kevin,
>>
>> I would really want the unlinkability in there as well. I propose to
>> add the text:  made unlinkable
>>
>> Normative text: Data can be considered sufficiently de-identified to
>> the extent that it has been deleted, made unlinkable, modified,
>> aggregated, anonymized or otherwise manipulated in order to achieve a
>> reasonable level of justified confidence that the data cannot
>> reasonably be used to infer information about, or otherwise be linked
>> to, a particular user, user agent, computer or device.
>>
>>
>> In terms of privacy by design, de-identification through unlinkability
>> is the strongest form of de-identification IMHO.
>>
>> Rob
>>
>> Kevin Kiley schreef op 2013-03-12 19:03:
>>> Dan,
>>>
>>> In case I wasn't being clear in my last post, I (personally) believe
>>> that User-agent should *NOT* be removed from the proposed text.
>>>
>>> I actually don't think it would do any harm to *ADD* the word
>>> 'Computer' as well ( which is present in the current FTC definition )
>>> so it reads like this…
>>>
>>> Normative text:
>>>
>>> Data can be considered sufficiently de-identified to the extent that
>>> it has been deleted, modified, aggregated, anonymized or otherwise
>>> manipulated in order to achieve a reasonable level of justified
>>> confidence that the data cannot reasonably be used to infer
>>> information about, or otherwise be linked to, a particular user,
>>> user agent, computer or device.
>>>
>>> I think that covers it pretty well, and *NO* 'clarifying text' is
>>> necessary.
>>>
>>> Just my 2 cents.
>>>
>>> Kevin Kiley
>>>
>>> Previous message(s)…
>>>
>>> Dan,
>>>
>>> Perhaps you can add text clarifying this perspective or, much like
>>> the FTC, suffice with "device" which I believe more than covers what
>>> you're looking for here.
>>>
>>> - Shane
>>>
>>> From: Dan Auerbach [mailto:dan@eff.org]
>>> Sent: Tuesday, March 12, 2013 8:57 AM
>>> To: public-tracking@w3.org
>>> Subject: Re: ACTION-371: text defining de-identified data
>>>
>>> Shane and Kevin -- The phrase "user agent" in the text is intended to
>>> refer to a particular user agent (not "Chrome 26" but rather "the
>>> browser running on Dan's laptop"). I hoped that would be clear from
>>> context, but if it's not we can clarify. I may not be able to
>>> identify your device per se, but can identify that this is the same
>>> browser as I saw before. I think this is the case with using cookies,
>>> for example. It seems more accurate to me than lumping it all under
>>> "device", and appropriate since the text of our document is elsewhere
>>> focused on user agents, unlike the FTC text.
>>>
>>> Best,
>>>
>>> Dan
>>>
>>> On 03/12/2013 12:19 AM, Kevin Kiley wrote:
>>>
>>>>> Shane Wiley wrote...
>>>>> I had removed "user agent" in the suggested edit as this could be
>>>>> something as generic as "Chrome 26".
>>> It can also be something VERY specific... and tell you a LOT about
>>> the Computer/OS/Device being used.
>>>
>>> In the case of Mobile... it will pretty much tell you EXACTLY what
>>> 'Device' is being used.
>>>
>>>>> The FTC likewise does not use "user agent" in their definition.
>>> That's true... but BOTH definitions (W3C and FTC) currently mention
>>> 'Device'... and the FTC
>>>
>>> reports go to great lengths about how important it is to exclude any
>>> knowledge of 'the Device'
>>>
>>> from the de-identified data ( especially in the case of 'Mobile
>>> Devices' ).
>>>
>>> Kevin Kiley
Received on Wednesday, 13 March 2013 17:08:42 UTC
