- From: CVS User rfieldin <cvsmail@w3.org>
- Date: Wed, 08 Oct 2014 18:10:43 +0000
- To: public-tracking-commit@w3.org
Update of /w3ccvs/WWW/2011/tracking-protection/drafts
In directory gil:/tmp/cvs-serv2237
Modified Files:
tracking-compliance-i203b.html
Log Message:
update my proposal with definition of permanently de-identified
--- /w3ccvs/WWW/2011/tracking-protection/drafts/tracking-compliance-i203b.html 2014/08/09 00:08:15 1.1
+++ /w3ccvs/WWW/2011/tracking-protection/drafts/tracking-compliance-i203b.html 2014/10/08 18:10:43 1.2
@@ -105,7 +105,7 @@
recipients of the data collected as a result of accessing that resource
(during the period in which the tracking status representation is fresh)
intend to conform to this specification with regard to that data for as
- long as that data has not been de-identified.
+ long as that data has not been <a>permanently de-identified</a>.
</p>
<p>
The remainder of this specification assumes that the origin server has
@@ -119,8 +119,8 @@
<p>
Data collection, retention, use, or sharing that does not amount to
tracking is outside the scope of this specification.
- Likewise, data that has been de-identified is outside the scope of this
- specification.
+ Likewise, data that has been <a>permanently de-identified</a> is outside
+ the scope of this specification.
</p>
<p>
Short-term, transient collection and use of data is also outside
@@ -190,35 +190,81 @@
<li>ensures that the data is only retained, accessed, and used as
directed by the contractee;</li>
<li>has no independent right to use the data other than in a
- <a>de-identified</a> form (e.g., for monitoring service integrity,
- load balancing, capacity planning, or billing); and,</li>
+ <a>permanently de-identified</a> form (e.g., for monitoring
+ service integrity, load balancing, capacity planning, or billing);
+ and,</li>
<li>has a contract in place with the contractee which is consistent
with the above limitations.</li>
</ol>
</section>
<section id="de-identified">
- <h3>De-identified</h3>
+ <h3>Permanently De-identified</h3>
<p>
- Data is <dfn>de-identified</dfn> when a party:
+ Data is <dfn>permanently de-identified</dfn> when there exists a high
+ level of confidence that no human subject of the data can be
+ identified, directly or indirectly (e.g., via association with an
+ identifier, user agent, or device), by that data alone or in
+ combination with other retained or available information.
</p>
- <ol>
- <li>has achieved a reasonable level of justified confidence that the
- data cannot be used to infer information about, or otherwise be
- linked to, a particular consumer, computer, or other device;</li>
- <li>commits to make no attempt to re-identify the data; and</li>
- <li>contractually prohibits downstream recipients from attempting to
- re-identify the data.</li>
- </ol>
- <p class="issue" data-number="188" title="Definition of de-identified (or previously, unlinkable) data">
- <strong>OPEN</strong> This definition is being actively discussed and
- may soon be replaced by a term with less baggage.
- </p>
- <p class="note">
- Note that geolocation data (of a certain precision or over a period of
- time) may itself identify otherwise de-identified data.
- </p>
- <p class="issue" data-number="202" title="Limitations on geolocation by third parties"></p>
+
+ <section id="de-identification-considerations" class="informative">
+ <h4>De-identification Considerations</h4>
+ <p>
+ The term <a>permanently de-identified</a> is
+ used for data that has passed out of the scope of this specification
+ and cannot, and will never, come back into scope. The organization
+ that performs the de-identification needs to be confident that the
+ data can never again identify the human subjects whose activity
+ contributed to the data. That confidence might result from ensuring
+ or demonstrating that it is no longer possible to:
+ </p>
+ <ul>
+ <li>isolate some or all records which correspond to a device or
+ user;</li>
+ <li>link two or more records (either from the same database or
+ different databases) concerning the same device or user;</li>
+ <li>deduce, with significant probability, information about a device
+ or user.</li>
+ </ul>
+ <p>
+ Regardless of the de-identification approach, unique keys can be
+ used to correlate records within the de-identified dataset, provided
+ the keys do not exist and cannot be derived outside the de-identified
+ dataset and have no meaning outside the de-identified dataset (i.e.,
+ no mapping table can exist that links the original identifiers to
+ the keys in the de-identified dataset).
+ </p>
+ <p>
+ In the case of records in such data that relate to a single user or
+ a small number of users, usage and/or distribution restrictions are
+ advisable; experience has shown that such records can, in fact,
+ sometimes be used to identify the user or users despite technical
+ measures taken to prevent re-identification. It is also a good
+ practice to disclose (e.g., in the privacy policy) the process by
+ which de-identification of these records is done, as this can both
+ raise the level of confidence in the process, and allow for
+ feedback on the process. The restrictions might include, for
+ example:
+ </p>
+ <ul>
+ <li>technical safeguards that prohibit re-identification of
+ de-identified data and/or merging of the original tracking
+ data and de-identified data;</li>
+ <li>business processes that specifically prohibit
+ re-identification of de-identified data and/or merging of the
+ original tracking data and de-identified data;</li>
+ <li>business processes that prevent inadvertent release of either
+ the original tracking data or de-identified data;</li>
+ <li>administrative controls that limit access to both the original
+ tracking data and de-identified data.</li>
+ </ul>
+ <p>
+ Geolocation data (of a certain precision or over a period of time)
+ might cause otherwise de-identified data to become re-identified.
+ </p>
+ <p class="issue" data-number="202" title="Limitations on geolocation by third parties"></p>
+ </section>
</section>
</section> <!-- end Terminology -->
@@ -360,8 +406,9 @@
MAY engage tracking for requests made to the designated resource, but
MUST NOT use or share any data to which DNT:1 applies until it can be
determined that it has received prior consent to do so. If not, the
- origin server MUST delete or de-identify the collected data within
- forty-eight hours.
+ origin server MUST delete or
+ <a href="#dfn-permanently-de-identified">permanently de-identify</a>
+ the collected data within forty-eight hours.
</p>
<p>
An origin server MAY send a tracking status value of
@@ -409,7 +456,8 @@
that permitted use has expired. When all such data retention
periods have expired for the permitted uses for which a given data
set has been retained, the third party MUST delete or
- <a title="de-identified">de-identify</a> that data.
+ <a href="#dfn-permanently-de-identified">permanently de-identify</a>
+ that data.
</p>
<p>
Aside from what is reasonably necessary for each permitted use,
@@ -617,7 +665,8 @@
<p>
If a party learns that it possesses data in violation of this
recommendation, it MUST, where reasonably feasible, delete or
- de-identify that data at the earliest practical opportunity, even if
+ <a href="#dfn-permanently-de-identified">permanently de-identify</a>
+ that data at the earliest practical opportunity, even if
it was previously unaware of such information practices despite
reasonable efforts to understand its information practices.
</p>
Received on Wednesday, 8 October 2014 18:10:45 UTC
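
Illustrative note (not part of the commit): the added paragraph on unique keys says records may be correlated within the de-identified dataset as long as the keys cannot be derived outside it and no mapping table survives. A minimal sketch of that idea follows, assuming records are plain dictionaries and that the original identifier lives in a hypothetical `user_id` field; the function name `rekey_records` and the output field `key` are likewise made up for illustration. Replacing identifiers this way is only one step and does not by itself make data permanently de-identified, since the remaining fields may still identify a user.

```python
import secrets


def rekey_records(records, id_field="user_id"):
    """Return copies of records with the original identifier replaced by a
    random key that is meaningful only inside the returned dataset."""
    # The mapping lives only for the duration of this call. It is never
    # returned or stored, so no table links original identifiers to keys.
    mapping = {}
    rekeyed = []
    for record in records:
        original = record[id_field]
        if original not in mapping:
            # Random key, not a hash of the identifier, so it cannot be
            # recomputed from the original value outside this dataset.
            mapping[original] = secrets.token_hex(16)
        # Copy every field except the original identifier.
        new_record = {k: v for k, v in record.items() if k != id_field}
        new_record["key"] = mapping[original]
        rekeyed.append(new_record)
    return rekeyed
```

Records that shared an identifier in the input share a key in the output, which preserves correlation within the dataset. A random key is used rather than a hash because a hash could be recomputed by anyone who holds the original identifier.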