recent Compliance edits (was Fwd: CVS WWW/2011/tracking-protection/drafts)

Summary of some recent revisions to the Compliance draft (including the diff below).

Thanks,
Nick


## deidentification

Added the definition agreed via the Call for Objections (issue-188) and changed all instances of the term to refer to "permanently deidentified" (or similar). The non-normative section currently follows the definition immediately (and includes the geolocation note from before), but we could move it elsewhere if it reads as too verbose.
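
To illustrate the non-normative paragraph on unique keys (in the diff below), here is a minimal Python sketch. It assumes records are plain dicts carrying an original identifier in a field named `user_id`; that field name, the `key` output field, and the function name `rekey_records` are hypothetical. Each original identifier is replaced with a random key that still correlates records inside the output, and the identifier-to-key mapping lives only inside the function, so no mapping table survives. This step alone would not make data permanently deidentified, since the remaining fields could still identify a user.

```python
import secrets


def rekey_records(records, id_field="user_id"):
    """Replace each original identifier with a random key local to the output.

    The mapping from original identifiers to keys exists only for the
    duration of this call; it is never returned or stored.
    """
    key_for = {}  # temporary mapping, discarded when the function returns
    rekeyed = []
    for record in records:
        original = record[id_field]
        if original not in key_for:
            # Random, so the key cannot be derived from the original identifier.
            key_for[original] = secrets.token_hex(16)
        # Copy every field except the original identifier.
        new_record = {k: v for k, v in record.items() if k != id_field}
        new_record["key"] = key_for[original]
        rekeyed.append(new_record)
    return rekeyed
```

Random keys are used here rather than a hash of the identifier because a hash could be recomputed outside the dataset by anyone holding the original identifiers.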

## editorial issues to cross off the list

I'm crossing each of the items below off my wiki list:
	https://www.w3.org/wiki/Privacy/TPWG/Editorial_corrections
but if anyone thinks I didn't actually fix a problem, please let me know.

http://lists.w3.org/Archives/Public/public-tracking-comments/2014May/0000.html

* one of these had been fixed previously; I corrected the other typo today

http://lists.w3.org/Archives/Public/public-tracking/2013Jun/0461.html

* resolved, since the User Agent compliance requirements were moved over to TPE (separating the requirements, as requested)

http://lists.w3.org/Archives/Public/public-tracking/2013Jun/0339.html

* I believe this is resolved by a MUST requirement on using the 'compliance' field to indicate compliance with the document:
http://www.w3.org/2011/tracking-protection/drafts/tracking-compliance.html#indicating-compliance
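
As a rough illustration of that requirement, the sketch below builds a tracking status representation in Python. It assumes the `tracking` and `compliance` property names from the TPE draft, with `compliance` holding an array of URI references; the helper name `tracking_status` and the URI shown in the usage note are placeholders, not values taken from either specification.

```python
import json


def tracking_status(tracking_value, compliance_uris):
    """Serialize a tracking status object as JSON text.

    Property names follow the TPE draft as understood here; treat them
    as assumptions and check them against the current TPE text.
    """
    status = {
        "tracking": tracking_value,           # e.g. "N" for not tracking
        "compliance": list(compliance_uris),  # regimes the server claims to comply with
    }
    return json.dumps(status, indent=2)
```

For example, `tracking_status("N", ["http://example.org/compliance-regime"])` would yield a document that a server could return from its site-wide tracking status resource, with the placeholder replaced by the URI of the Compliance document.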

http://www.w3.org/2011/tracking-protection/track/issues/218

* This may now be resolved. The short-term out-of-scope note appears at the very beginning of the server compliance section, and deidentified data is noted as allowed in a bullet point in the third-party compliance section.

Begin forwarded message:

> Resent-From: public-tracking-commit@w3.org
> From: "CVS User npdoty" <cvsmail@w3.org>
> Subject: CVS WWW/2011/tracking-protection/drafts
> Date: October 7, 2014 at 5:26:02 PM PDT
> To: public-tracking-commit@w3.org
> Archived-At: <http://www.w3.org/mid/E1Xbf4s-0006mu-Nh@gil.w3.org>
> 
> Update of /w3ccvs/WWW/2011/tracking-protection/drafts
> In directory gil:/tmp/cvs-serv26092
> 
> Modified Files:
> 	tracking-compliance.html
> Log Message:
> implement issue-188 change; fix some editorial issues
> 
> --- /w3ccvs/WWW/2011/tracking-protection/drafts/tracking-compliance.html	2014/10/01 07:02:11	1.126
> +++ /w3ccvs/WWW/2011/tracking-protection/drafts/tracking-compliance.html	2014/10/08 00:26:02	1.127
> @@ -150,7 +150,7 @@
>       <ol>
>         <li>processes the data on behalf of the contractee;</li>
>         <li>ensures that the data is only retained, accessed, and used as directed by the contractee;</li>
> -        <li>has no independent right to use the data other than in a <a>deidentified</a> form (e.g., for monitoring service integrity, load balancing, capacity planning, or billing); and,</li>
> +        <li>has no independent right to use the data other than in a <a>permanently deidentified</a> form (e.g., for monitoring service integrity, load balancing, capacity planning, or billing); and,</li>
>         <li>has a contract in place with the contractee which is consistent with the above limitations.</li>
>       </ol>
>       </section>
> @@ -187,29 +187,36 @@
> 			</p></section>
> 
> 			<section id="deidentified">
> -			<h3>Deidentified</h3>
> -			<p>
> -				Data is <dfn>deidentified</dfn> when a party:
> -			</p>
> -			<ol>
> -				<li>
> -					has achieved a reasonable level of justified confidence that the
> -				       data cannot be used to infer information about, or otherwise be
> -				       linked to, a particular consumer, computer, or other device;
> -				</li>
> -				<li>
> -					commits to make no attempt to re-identify the data; and
> -				</li>
> -				<li>
> -					contractually prohibits downstream recipients from attempting to
> -          re-identify the data.
> -				</li>
> -			</ol>
> -			<p class="note">
> -			  Note that geolocation data (of a certain precision or over a period of time) may itself identify otherwise deidentified data.
> -			</p>
> -			<p class="issue" data-number="188" title="Definition of de-identified (or previously, unlinkable) data"></p>
> -			<p class="issue" data-number="202" title="Limitations on geolocation by third parties"></p>
> +			<h3>Deidentification</h3>
> +      <p>
> +        Data is <dfn>permanently deidentified</dfn> when there exists a high level of confidence that no human subject of the data can be identified, directly or indirectly (e.g., via association with an identifier, user agent, or device), by that data alone or in combination with other retained or available information.
> +      </p>
> +			<section id="deidentified-considerations" class="informative">
> +        <h4>Deidentification Considerations</h4>
> +        <p>
> +          In this specification the term <a>permanently deidentified</a> is used for data that has passed out of the scope of this specification and can not, and will never, come back into scope. The organization that performs the deidentification needs to be confident that the data can never again identify the human subjects whose activity contributed to the data. That confidence may result from ensuring or demonstrating that it is no longer possible to:
> +        </p>
> +        <ul>
> +            <li>isolate some or all records which correspond to a device or user;</li>
> +            <li>link two or more records (either from the same database or different databases), concerning the same device or user;</li>
> +            <li>deduce, with significant probability, information about a device or user.</li>
> +        </ul>
> +        <p>
> +          Regardless of the deidentification approach, unique keys can be used to correlate records within the deidentified dataset, provided the keys do not exist and cannot be derived outside the deidentified dataset and have no meaning outside the deidentified dataset (i.e. no mapping table can exist that links the original identifiers to the keys in the deidentified dataset).
> +        </p>
> +        <p>
> +          In the case of records in such data that relate to a single user or a small number of users, usage and/or distribution restrictions are advisable; experience has shown that such records can, in fact, sometimes be used to identify the user or users despite technical measures taken to prevent reidentification. It is also a good practice to disclose (e.g. in the privacy policy) the process by which deidentification of these records is done, as this can both raise the level of confidence in the process, and allow for for feedback on the process. The restrictions might include, for example:
> +        </p>
> +        <ul>
> +            <li>technical safeguards that prohibit reidentification of deidentified data and/or merging of the original tracking data and deidentified data;</li>
> +            <li>business processes that specifically prohibit reidentification of deidentified data and/or merging of the original tracking data and deidentified data;</li>
> +            <li>business processes that prevent inadvertent release of either the original tracking data or deidentified data;</li>
> +            <li>administrative controls that limit access to both the original tracking data and deidentified data.</li>
> +        </ul>
> +        <p>
> +          Geolocation data (of a certain precision or over a period of time) may itself identify otherwise deidentified data.
> +        </p>
> +      </section>
> 			</section>
> 			<section id="tracking">
> 				<h3>Tracking</h3>
> @@ -244,7 +251,7 @@
> 			<section id="graduated-response">
> 				<h3>Graduated Response</h3>
> 				<p>
> -					A <dfn>graduated response</dfn> a methodology where the action taken is proportional to the size of the problem or risk that is trying to be mitigated. In the context of this document, the term is used to describe an increase in the collection of data about a user or interaction in response to a specific problem that a party has become aware of, such as an increase in fraudulent activity originating from a particular network or IP address range resulting in increased logging of data relating to interactions from that specific range of IP addresses as opposed to increased logging for all users in general.
> +					A <dfn>graduated response</dfn> is a methodology where the action taken is proportional to the size of the problem or risk that is trying to be mitigated. In this specification, the term is used to describe an increase in the collection of data about a user or interaction in response to a specific problem that a party has become aware of, such as an increase in fraudulent activity originating from a particular network or IP address range resulting in increased logging of data relating to interactions from that specific range of IP addresses, as opposed to increased logging for all users in general.
> 				</p>
> 				<p class="note">
>   				  Only used in security, below, and may overlap with the explanation
> @@ -324,11 +331,11 @@
>     <ol start="1">
>       <li>a user has explicitly-granted an exception, as described below;</li>
>       <li>data is collected for the set of permitted uses described below;</li>
> -      <li>or, the data is de-identified as defined in this recommendation.</li>
> +      <li>or, the data is <a>permanently deidentified</a> as defined in this specification.</li>
> 		</ol>
>     <aside class="example">
>       <p>
> -        An embedded widget provider (a third party to users' interactions with various sites) counts visitors' country of origin and device type but removes identifiers in order to <a title="deidentified">deidentify</a> collected data. For the purposes of this recommendation, the party is not <a>tracking</a> the user and can create a static site-wide tracking status resource with a tracking status value of <code>N</code> to indicate that status.
> +        An embedded widget provider (a third party to users' interactions with various sites) counts visitors' country of origin and device type but removes identifiers in order to <a title="permanently deidentified">permanently deidentify</a> collected data. For the purposes of this recommendation, the party is not <a>tracking</a> the user and can create a static site-wide tracking status resource with a tracking status value of <code>N</code> to indicate that status.
>       </p>
>     </aside>
> 		<p>
> @@ -380,7 +387,7 @@
>             different permitted uses. Data MUST NOT be used for a permitted
>             use once the data retention period for that permitted use has
>             expired. After there are no remaining permitted uses for given
> -            data, the data MUST be deleted or <a>deidentified</a>.
> +            data, the data MUST be deleted or <a>permanently deidentified</a>.
> 					</p>
> 					<p class="issue" data-number="199" title="Limitations on the use of unique identifiers"></p>
>         </section>
> 
> 

Received on Wednesday, 8 October 2014 00:32:54 UTC