Re: Deidentification (ISSUE-188)

> On Aug 28, 2014, at 2:55 AM, "Mike O'Neill" <michael.oneill@baycloud.com> wrote:
> 
>>> Data is permanently de-identified when there exists a high level of confidence that no human subject of the data can be identified, directly or indirectly, by that data alone or in combination with other retained or available information.
> 
> Roy, I think this is a good definition. Can we add the non-normative specific example about UIDs below? The clause about enabling communication and a requested service is there to cover publisher logins and session state (and IP addresses).
> 
> Non-normative example:
> 
> In the interests of transparency this implies that any data used or stored in a user agent or device for the purpose of identifying it in subsequent requests, unless solely used to enable communication or to supply a service requested by the user, will have been deleted or, if this is unfeasible, otherwise made ineffective.

I don't think it implies anything of the sort, so this would be a normative addition. It isn't a good idea to suggest that servers delete client-side storage, since the only way to do that is pervasive callbacks with no privacy at all, and there might well be identifiers that are still effective for other (still identified) data sets. What matters is that they not be traceable to the de-identified data.

A better suggestion would be to take steps to ensure the de-identified data does not contain any client-side identifier, nor data sufficient to generate a client-side identifier, since that would likely remain an indirect identifier for the user. 

....Roy

Received on Thursday, 28 August 2014 19:32:04 UTC