Re: issues 23 and 34, happy new year's initial text for all...

On Jan 3, 2012, at 3:18 PM, David Singer wrote:

> Issue number: 23
> 
> 
> 
> Issue name: Possible exemption for analytics
> Suggested retitle: Possible exemption for outsourcing
> 
> Issue URL:
>   http://www.w3.org/2011/tracking-protection/track/issues/23

I am confused.  ISSUE-23 is not about outsourcing.
ISSUE-49 and ISSUE-73 are about outsourcing.

  http://www.w3.org/2011/tracking-protection/track/issues/49
  http://www.w3.org/2011/tracking-protection/track/issues/73

The exemption for analytics is about the right of a first party to track
and retain information about where a user came from, how the user makes
use of the first party site (time spent, clickstream, etc.), and whether
the visit leads to some form of revenue conversion at the site.  In other
words, this issue has more to do with what, if any, limitations are placed
on first party data collection when DNT is enabled.

> Section number in the FPWD: 3.4 Types of Tracking
> Contributors to this text: (Draft) David Singer, (Edit) Jonathan Mayer
> 
> Specification:
> A third-party site may operate as a first-party site if all the following conditions hold:
> 	• the data collection, retention, and use, complies with at least the requirements for first-parties;
> 	• the data collected is available only to the first party, and the third party has no independent right to use the data;
> 	• the third party makes commitments to adhere to this standard in a form that is legally enforceable (directly or indirectly) by the first party, individual users, and regulators; data retention by the third party must not survive the end of this legal enforceability;
> 	• the third party undertakes reasonable technical precautions to prevent collecting data that could be correlated across first parties.
> 
> Non-normative Discussion:
> The rationale for rule (2) is that we allow the third party to stand in the first party’s shoes – but go no further.  The third party may not use the data it collects for “product improvement,” “aggregate analytics,” or any other purpose except to fulfill a request by a first party, where the results are shared only with the first party.
> 
> Rule (3) allows for the possibility of more than one level of outsourcing.
> 
> In rule (4), one component of reasonable technical precautions will often be using the same-origin policy to segregate information for each first-party customer.

That's all good text for ISSUE-73.

> Note that any data collected by the third party that is used, or may be used, in any way by any party other than the first party, is subject to the requirements for third parties.

I don't understand that sentence.

> Example:
> ExampleAnalytics collects analytic data for ExampleProducts Inc.  It operates a site under the DNS name analytics.exampleproducts.com.  It collects and analyzes data on visits to ExampleProducts, provides that data solely to ExampleProducts, and does not access or use it itself.
> 
> Text that possibly belongs in other sections:
> When the third party sends a response header, that header must indicate that they are a third party and that they are operating under this exception.

Eh?  They are operating as the first party.

> Note that a third party that operates under a domain name or other arrangement that makes it appear to the user as if they are the first party, or a part or affiliate of the first party, is nonetheless a third party and is subject to the requirements of this clause ("DNS masquerading").

That is confusing.  I think what you mean is that a third party that does not
conform to the above conditions cannot operate as a first party even if the
data is collected through a shared domain or subdomain of the first party.

> 
> 
> 
> Issue number: 34
> Issue name: Possible exemption for aggregate analytics
> Suggested retitle: Possible exemption for unidentifiable data
> 
> Issue URL:
>   http://www.w3.org/2011/tracking-protection/track/issues/34
> 
> Section number in the FPWD: 3.4 Types of Tracking
> Contributors to this text: (Draft) David Singer, (Edit) Jonathan Mayer
> 
> Specification:
> A third party may collect, retain, and use any information from a user or user agent that, with high probability, could not be used to:
> 1) identify or nearly identify a user or user agent; or
> 2) correlate the activities of a user or user agent across multiple network interactions.

Again, totally confused.  Analytics are defined as correlating the activities of
a user across multiple network interactions.  Do you mean tracking the user across
multiple sites?  ISSUE-34 is about sharing analytics data in aggregate form.
For example, a manufacturer might want to obtain aggregate information about
both the types of users that purchase their products and the types of
users that spend some threshold of time looking at the products but
do not make a purchase.  Is it okay for the shop to share that data
with the manufacturer if the data shared is in an aggregate form that cannot
be used to identify individual users?

Also, keep in mind that fraud prevention requires at least some data
collection and retention by third parties.

> Examples:
> 1. A third-party advertising network records the fact that it displayed an ad. 
> 2. A third-party analytics service counts the number of times a popular page was loaded.
> 
> Non-Normative Discussion:
> This exception (like all exceptions) may not be combined with other exceptions unless specifically allowed.  A third party acting within the outsourcing exception, for example, may not make independent use of the data it has collected even though the use involves unidentifiable data.  A rule to the contrary would provide a perverse incentive for third parties to press all exceptions to the limit and then use the collected data within this exception.
> A potential ‘safe harbor’ under this clause could be to retain only aggregate counts, not per-transaction records.

I don't understand why we care.  Aggregate counts that do not identify users
are not a privacy concern and do not amount to tracking in any sense that
the user would intend to disable by DNT.

> Text that possibly belongs elsewhere:
> Possible advances in de-anonymization that make previously non-identifiable data identifiable should be considered.
> [Maybe need an issue: whose problem is it when data from disparate sources, all but one of which are anonymous, is combined to achieve de-anonymization?]

AFAIK, aggregate data cannot be combined to de-anonymize users.  Anonymized
data can often be combined if it remains in non-aggregated form.
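
To illustrate the distinction, here is a minimal sketch and not anything
taken from the draft text.  The record fields, values, and helper names
(link_records, aggregate) are all hypothetical, chosen only to show why
per-record data can be joined across sources while aggregate counts
leave nothing to join on.

```python
from collections import Counter

# Hypothetical "anonymized" per-record logs from two sources.  No names
# are stored, but both keep the same quasi-identifiers (region, birth_year).
site_a = [
    {"region": "R1", "birth_year": 1965, "page": "/products/widget"},
    {"region": "R2", "birth_year": 1980, "page": "/products/gadget"},
]
site_b = [
    {"region": "R1", "birth_year": 1965, "purchase": "widget"},
    {"region": "R3", "birth_year": 1990, "purchase": "gizmo"},
]


def link_records(left, right, keys):
    """Join two per-record datasets on shared quasi-identifier fields."""
    index = {}
    for rec in right:
        index.setdefault(tuple(rec[k] for k in keys), []).append(rec)
    linked = []
    for rec in left:
        for match in index.get(tuple(rec[k] for k in keys), []):
            linked.append({**rec, **match})
    return linked


def aggregate(records, field):
    """Reduce per-record data to counts per value of a single field."""
    return Counter(rec[field] for rec in records)


# Non-aggregated data: the two sources can be correlated record by record.
linked = link_records(site_a, site_b, ("region", "birth_year"))

# Aggregate form: only totals remain, with no per-user key to join on.
page_counts = aggregate(site_a, "page")
purchase_counts = aggregate(site_b, "purchase")
```

In this sketch, linked ends up holding one combined record that ties a
page view at one source to a purchase at the other, whereas page_counts
and purchase_counts carry only totals.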

Cheers,

....Roy

Received on Tuesday, 10 January 2012 11:23:11 UTC