- From: CVS User rfieldin <cvsmail@w3.org>
- Date: Wed, 08 Oct 2014 18:10:43 +0000
- To: public-tracking-commit@w3.org
Update of /w3ccvs/WWW/2011/tracking-protection/drafts
In directory gil:/tmp/cvs-serv2237

Modified Files:
	tracking-compliance-i203b.html
Log Message:
update my proposal with definition of permanently de-identified

--- /w3ccvs/WWW/2011/tracking-protection/drafts/tracking-compliance-i203b.html	2014/08/09 00:08:15	1.1
+++ /w3ccvs/WWW/2011/tracking-protection/drafts/tracking-compliance-i203b.html	2014/10/08 18:10:43	1.2
@@ -105,7 +105,7 @@
 recipients of the data collected as a result of accessing that resource
 (during the period in which the tracking status representation is fresh)
 intend to conform to this specification with regard to that data for as
- long as that data has not been de-identified.
+ long as that data has not been <a>permanently de-identified</a>.
 </p>
 <p>
 The remainder of this specification assumes that the origin server has
@@ -119,8 +119,8 @@
 <p>
 Data collection, retention, use, or sharing that does not amount to
 tracking is outside the scope of this specification.
- Likewise, data that has been de-identified is outside the scope of this
- specification.
+ Likewise, data that has been <a>permanently de-identified</a> is outside
+ the scope of this specification.
 </p>
 <p>
 Short-term, transient collection and use of data is also outside
@@ -190,35 +190,81 @@
 <li>ensures that the data is only retained, accessed, and used as
 directed by the contractee;</li>
 <li>has no independent right to use the data other than in a
- <a>de-identified</a> form (e.g., for monitoring service integrity,
- load balancing, capacity planning, or billing); and,</li>
+ <a>permanently de-identified</a> form (e.g., for monitoring
+ service integrity, load balancing, capacity planning, or billing);
+ and,</li>
 <li>has a contract in place with the contractee which is consistent
 with the above limitations.</li>
 </ol>
 </section>
 <section id="de-identified">
- <h3>De-identified</h3>
+ <h3>Permanently De-identified</h3>
 <p>
- Data is <dfn>de-identified</dfn> when a party:
+ Data is <dfn>permanently de-identified</dfn> when there exists a high
+ level of confidence that no human subject of the data can be
+ identified, directly or indirectly (e.g., via association with an
+ identifier, user agent, or device), by that data alone or in
+ combination with other retained or available information.
 </p>
- <ol>
- <li>has achieved a reasonable level of justified confidence that the
- data cannot be used to infer information about, or otherwise be
- linked to, a particular consumer, computer, or other device;</li>
- <li>commits to make no attempt to re-identify the data; and</li>
- <li>contractually prohibits downstream recipients from attempting to
- re-identify the data.</li>
- </ol>
- <p class="issue" data-number="188" title="Definition of de-identified (or previously, unlinkable) data">
- <strong>OPEN</strong> This definition is being actively discussed and
- may soon be replaced by a term with less baggage.
- </p>
- <p class="note">
- Note that geolocation data (of a certain precision or over a period of
- time) may itself identify otherwise de-identified data.
- </p>
- <p class="issue" data-number="202" title="Limitations on geolocation by third parties"></p>
+
+ <section id="de-identification-considerations" class="informative">
+ <h4>De-identification Considerations</h4>
+ <p>
+ The term <a>permanently de-identified</a> is
+ used for data that has passed out of the scope of this specification
+ and cannot, and will never, come back into scope. The organization
+ that performs the de-identification needs to be confident that the
+ data can never again identify the human subjects whose activity
+ contributed to the data. That confidence might result from ensuring
+ or demonstrating that it is no longer possible to:
+ </p>
+ <ul>
+ <li>isolate some or all records which correspond to a device or
+ user;</li>
+ <li>link two or more records (either from the same database or
+ different databases), concerning the same device or user;</li>
+ <li>deduce, with significant probability, information about a device
+ or user.</li>
+ </ul>
+ <p>
+ Regardless of the de-identification approach, unique keys can be
+ used to correlate records within the de-identified dataset, provided
+ the keys do not exist and cannot be derived outside the de-identified
+ dataset and have no meaning outside the de-identified dataset (i.e.
+ no mapping table can exist that links the original identifiers to
+ the keys in the de-identified dataset).
+ </p>
+ <p>
+ In the case of records in such data that relate to a single user or
+ a small number of users, usage and/or distribution restrictions are
+ advisable; experience has shown that such records can, in fact,
+ sometimes be used to identify the user or users despite technical
+ measures taken to prevent re-identification. It is also a good
+ practice to disclose (e.g. in the privacy policy) the process by
+ which de-identification of these records is done, as this can both
+ raise the level of confidence in the process, and allow for
+ feedback on the process. The restrictions might include, for
+ example:
+ </p>
+ <ul>
+ <li>technical safeguards that prohibit re-identification of
+ de-identified data and/or merging of the original tracking
+ data and de-identified data;</li>
+ <li>business processes that specifically prohibit
+ re-identification of de-identified data and/or merging of the
+ original tracking data and de-identified data;</li>
+ <li>business processes that prevent inadvertent release of either
+ the original tracking data or de-identified data;</li>
+ <li>administrative controls that limit access to both the original
+ tracking data and de-identified data.</li>
+ </ul>
+ <p>
+ Geolocation data (of a certain precision or over a period of time)
+ might cause otherwise de-identified data to become re-identified.
+ </p>
+ <p class="issue" data-number="202" title="Limitations on geolocation by third parties"></p>
+ </section>
 </section>
 </section>
 <!-- end Terminology -->
@@ -360,8 +406,9 @@
 MAY engage tracking for requests made to the designated resource, but
 MUST NOT use or share any data to which DNT:1 applies until it can be
 determined that it has received prior consent to do so. If not, the
- origin server MUST delete or de-identify the collected data within
- forty-eight hours.
+ origin server MUST delete or
+ <a href="#dfn-permanently-de-identified">permanently de-identify</a>
+ the collected data within forty-eight hours.
 </p>
 <p>
 An origin server MAY send a tracking status value of
@@ -409,7 +456,8 @@
 that permitted use has expired. When all such data retention periods
 have expired for the permitted uses for which a given data set has
 been retained, the third party MUST delete or
- <a title="de-identified">de-identify</a> that data.
+ <a href="#dfn-permanently-de-identified">permanently de-identify</a>
+ that data.
 </p>
 <p>
 Aside from what is reasonably necessary for each permitted use,
@@ -617,7 +665,8 @@
 <p>
 If a party learns that it possesses data in violation of this
 recommendation, it MUST, where reasonably feasible, delete or
- de-identify that data at the earliest practical opportunity, even if
+ <a href="#dfn-permanently-de-identified">permanently de-identify</a>
+ that data at the earliest practical opportunity, even if
 it was previously unaware of such information practices despite
 reasonable efforts to understand its information practices.
 </p>
Received on Wednesday, 8 October 2014 18:10:45 UTC