RE: Deidentification (ISSUE-188)

From: Mike O'Neill <michael.oneill@baycloud.com>
Date: Thu, 28 Aug 2014 10:55:12 +0100
To: "'Roy T. Fielding'" <fielding@gbiv.com>, "'David Singer'" <singer@apple.com>, <vtoubiana@cnil.fr>, <rob@blaeu.com>
Cc: <public-tracking@w3.org>
Message-ID: <0adc01cfc2a6$2c11d6c0$84358440$@baycloud.com>
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

>>Data is permanently de-identified when there exists a high level of confidence that no human subject of the data can be identified, directly or indirectly, by that data alone or in combination with other retained or available information.

Roy, I think this is a good definition. Can we add the specific non-normative example about UIDs below? The clause about enabling communication and supplying a requested service is there to cover publisher logins and session state (and IP addresses).

Non-normative example:

In the interests of transparency, this implies that any data used or stored in a user agent or device for the purpose of identifying that user agent or device in subsequent requests, unless used solely to enable communication or to supply a service requested by the user, will have been deleted or, if this is infeasible, otherwise made ineffective.
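
As an illustration only: one common place such an identifier lives is an HTTP cookie, and a server can ask the user agent to delete it by resending the cookie with an empty value and an expiry in the past. The sketch below assumes that case. The cookie name `uid` and the helper `expire_identifier_cookie` are hypothetical, and other storage (localStorage, ETags, device identifiers) would need different handling.

```python
from http.cookies import SimpleCookie


def expire_identifier_cookie(name: str = "uid", path: str = "/") -> str:
    """Build a Set-Cookie header value asking the user agent to drop `name`.

    Assumption: the identifier is an ordinary HTTP cookie that was set with
    the same name and path. The name "uid" is a placeholder.
    """
    cookie = SimpleCookie()
    cookie[name] = ""                    # overwrite the stored identifier
    cookie[name]["path"] = path          # must match the original cookie's path
    cookie[name]["max-age"] = 0          # expire immediately
    cookie[name]["expires"] = "Thu, 01 Jan 1970 00:00:00 GMT"  # for older agents
    return cookie[name].OutputString()


if __name__ == "__main__":
    # Value to send in a Set-Cookie response header
    print(expire_identifier_cookie())
```

Whether the user agent actually honours the header cannot be observed from the server, which is where the "otherwise made ineffective" fallback would apply, for example by refusing to recognise the old value on later requests.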

Mike


> -----Original Message-----
> From: Roy T. Fielding [mailto:fielding@gbiv.com]
> Sent: 26 August 2014 19:57
> To: David Singer
> Cc: public-tracking@w3.org WG
> Subject: Re: Deidentification (ISSUE-188)
> 
> I am still in favor of a short definition that makes it very clear what
> we want to achieve in terms of limiting the data.  If folks want to place
> additional requirements on a party, separate from the definition of the
> state we want the data to be in, then I think that should be discussed
> and agreed on separately.
> 
> To that end, I have replaced my proposal with the following:
> 
>    Data is permanently de-identified when there exists a high level
>    of confidence that no human subject of the data can be identified,
>    directly or indirectly, by that data alone or in combination with
>    other retained or available information.
> 
> If adopted, we would replace all occurrences of "de-identif(y|ied|ying)"
> in TCS and TPE with "permanently de-identified".
> 
> Rationale:
> 
> I adopted David's "permanently de-identified" to avoid the association
> with re-identifiable data and added "combination with other retained ...
> information" to exclude holding onto a key for re-identification.
> 
> I replaced "user" with "human subject of the data", since we also want
> to remove data provided by the user that (inadvertently) is about
> others (what most statistic-based data trimming does automatically).
> However, we don't want to remove data which might be about a human
> who is not the subject (e.g., recording the number of distinct visitors
> to my blog is data about the visitors, not about me).
> 
> I use "directly or indirectly" to indicate that this includes anything
> that might end up identifying a human subject, no matter how.
> If someone thinks we should have specific text about identifiers on
> user agents or devices, that can be a non-normative example without
> weakening this definition.
> 
> 
> Cheers,
> 
> Roy T. Fielding                     <http://roy.gbiv.com/>
> Senior Principal Scientist, Adobe   <http://www.adobe.com/>
> 
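
The substitution proposed in the quoted message, replacing every match of "de-identif(y|ied|ying)" with "permanently de-identified", could be sketched roughly as follows. This is a literal reading of the proposal, not an editing tool used by the group. The function name `apply_rename` is hypothetical, and the lookbehind that skips text already carrying the "permanently" prefix is an added assumption so that the pass can be run twice safely.

```python
import re

# Pattern taken from the proposal: de-identif(y|ied|ying).
# Assumption: skip matches already preceded by "permanently " so that
# running the substitution again does not double the prefix.
PATTERN = re.compile(r"(?<!permanently )de-identif(?:y|ied|ying)\b")

REPLACEMENT = "permanently de-identified"


def apply_rename(text: str) -> str:
    """Replace each match of the proposal's pattern with the new term."""
    return PATTERN.sub(REPLACEMENT, text)


if __name__ == "__main__":
    sample = "Data is de-identified when the party stops de-identifying it."
    print(apply_rename(sample))
```

A literal replacement turns "de-identifying" into "permanently de-identified" as well, so sentences using the verb forms would still need a manual grammatical pass.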

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.13 (MingW32)
Comment: Using gpg4o v3.3.26.5094 - http://www.gpg4o.com/
Charset: utf-8

iQEcBAEBAgAGBQJT/vx/AAoJEHMxUy4uXm2Jb3IIAMkNjjpo4qVQCjSSi/WtObul
lAyd/FTnFuL8/UHIYfGJeXpIFHHOMjBaYgEKF33bZm/99FV1JKZrnqG6hqJW0a+O
qp3hSLIanqo1NWPAXndBZV2w7f32qcEJdhOom4sL1wkEScrtN137B8hynQ2E3olM
OxDrX4TS8iN8BC/UGtCEOZgIfzJH/ZSIPfwG4PfBTuLdEqF0y4HhmwfkhBgui2//
rprlQeq6SEPcS+6HcL5kYZ9ohbJF4BDLLqrj3gmHFPxBiwD6Iu90jcjOsO+58B6f
GjuEb7mnGJQgrzOl9ZpnM5VN+9wnUOmEi8UyD4vHWunbDgw5XSyCyBQMJIhNbO8=
=CrHy
-----END PGP SIGNATURE-----
Received on Thursday, 28 August 2014 09:55:56 UTC
