Result for CfO on ISSUE-188 (de-identification) from Matthias Schunter (Intel Corporation) on 2014-10-22 (public-tracking@w3.org from October 2014)

From: Matthias Schunter (Intel Corporation) <mts-std@schunter.org>
Date: Wed, 22 Oct 2014 14:57:47 +0200
To: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-ID: <5447A9CB.6060603@schunter.org>

Hi TPWG Team,


The Chairs have resolved the Call for Objections on ISSUE-188 by
unanimous consensus.
We determined that the group's consensus on de-identification (ISSUE-188)
was Option A among the two choices presented in the recent Call for
Objections.  Option A received no objections from any working group
member.  On the other hand, Option B received several objections from
working group members
[https://www.w3.org/2002/09/wbs/49311/tpwg-deidentification-188/results#xq2].

The consensus text is reproduced below, as documented at the URL above.

Regards,
matthias


-----------------------------------------
*Option A: Permanently Deidentified*

/Replace existing text with the following definition and non-normative
explanation section./

Data is *permanently de-identified* when there exists a high level of
confidence that no human subject of the data can be identified, directly
or indirectly (e.g., via association with an identifier, user agent, or
device), by that data alone or in combination with other retained or
available information.


*Separate, non-normative section*

In this specification the term 'permanently de-identified' is used for
data that has passed out of the scope of this specification and cannot,
and will never, come back into scope. The organization that performs the
de-identification needs to be confident that the data can never again
identify the human subjects whose activity contributed to the data. That
confidence may result from ensuring or demonstrating that it is no
longer possible to:

  * isolate some or all records which correspond to a device or user;
  * link two or more records (either from the same database or different
    databases), concerning the same device or user;
  * deduce, with significant probability, information about a device or
    user.
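
As an illustration only (not part of the consensus text), the first
criterion above could be spot-checked with a k-anonymity-style count: group
records by a combination of attributes and flag any combination shared by
fewer than k records, since such records are easy to isolate. The function
name, the field names, and the threshold k are assumptions made for this
sketch; passing such a check does not by itself demonstrate permanent
de-identification.

```python
from collections import Counter


def small_groups(records, fields, k=5):
    """Return attribute combinations that occur in fewer than k records.

    records: list of dicts (hypothetical de-identified log entries)
    fields:  attribute names to combine (hypothetical quasi-identifiers)
    k:       minimum group size considered safe (an assumed threshold)
    """
    counts = Counter(tuple(r.get(f) for f in fields) for r in records)
    # Combinations seen fewer than k times could single out a device or user.
    return {combo: n for combo, n in counts.items() if n < k}


if __name__ == "__main__":
    sample = [
        {"browser": "A", "region": "X"},
        {"browser": "A", "region": "X"},
        {"browser": "A", "region": "X"},
        {"browser": "B", "region": "X"},
    ]
    # With k=2, only the single ("B", "X") record is flagged.
    print(small_groups(sample, ["browser", "region"], k=2))
```

An empty result means no combination of the chosen attributes falls below
the threshold; it says nothing about attributes that were not checked.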

Regardless of the de-identification approach, unique keys can be used to
correlate records within the de-identified dataset, provided the keys do
not exist and cannot be derived outside the de-identified dataset and
have no meaning outside the de-identified dataset (i.e., no mapping table
can exist that links the original identifiers to the keys in the
de-identified dataset).
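
A minimal sketch of the key requirement above, offered as an illustration
rather than as part of the consensus text: each original identifier is
replaced by a freshly generated random key, and the mapping between the two
exists only while the function runs. The function name and the field names
"user_id" and "key" are hypothetical. Random keys are used instead of a hash
of the identifier because a hash could be recomputed outside the dataset.
Re-keying alone does not make data permanently de-identified, since the
remaining fields may still identify a user.

```python
import secrets


def rekey_records(records, id_field="user_id"):
    """Replace each original identifier with a random per-dataset key.

    Records that shared an identifier share a key, so they can still be
    correlated inside the output. The mapping is never returned or stored.
    """
    mapping = {}  # original identifier -> random key, discarded on return
    rekeyed = []
    for record in records:
        original = record[id_field]
        if original not in mapping:
            # 16 random bytes, hex-encoded; not derived from the identifier.
            mapping[original] = secrets.token_hex(16)
        # Copy every field except the original identifier.
        new_record = {k: v for k, v in record.items() if k != id_field}
        new_record["key"] = mapping[original]
        rekeyed.append(new_record)
    return rekeyed


if __name__ == "__main__":
    logs = [
        {"user_id": "alice", "url": "/a"},
        {"user_id": "bob", "url": "/b"},
        {"user_id": "alice", "url": "/c"},
    ]
    for row in rekey_records(logs):
        print(row)
```

Running the function twice on the same input yields different keys, which
is the intended property: the keys have no meaning outside one output
dataset.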

In the case of records in such data that relate to a single user or a
small number of users, usage and/or distribution restrictions are
advisable; experience has shown that such records can, in fact,
sometimes be used to identify the user(s) despite the technical measures
that were taken to prevent re-identification. It is also good practice
to disclose (e.g., in the privacy policy) the process by which
de-identification of these records is done, as this can both raise the
level of confidence in the process and allow for feedback on the
process. The restrictions might include, for example:

  * Technical safeguards that prohibit re-identification of
    de-identified data and/or merging of the original tracking data and
    de-identified data;
  * Business processes that specifically prohibit re-identification of
    de-identified data and/or merging of the original tracking data and
    de-identified data;
  * Business processes that prevent inadvertent release of either the
    original tracking data or de-identified data;
  * Administrative controls that limit access to both the original
    tracking data and de-identified data.

Received on Wednesday, 22 October 2014 12:58:25 UTC