Re: Fw: New text Issue 25: Aggregated data: collection and use for audience measurement research

From: Rigo Wenning <rigo@w3.org>
Date: Thu, 07 Mar 2013 18:15:03 +0100
To: public-tracking@w3.org
Cc: Kathy Joe <kathy@esomar.org>, peter@peterswire.net, justin@cdt.org
Message-ID: <1516828.XmkxSUueIC@hegel.sophia.w3.org>
I think the key points are:

1/ Panels are not an issue as they are based on consent anyway. The 
question is rather how to leverage DNT:0 to make obtaining consent for 
panels easier. Everything that requires "out-of-band" mechanisms on the 
Web diminishes the utility of DNT and DNT:0. 

2/ There is the calibration part. The devil is in the detail here. 
What would be the smallest percentage of DNT:0 users in a given 
clickstream so that calibration could still happen? Because I think if 
calibration is done with DNT:0 users, there is no issue. (Get a web-wide 

Instead of keeping the data, what about aggregating on the fly and 
having the software certified by somebody? This would avoid retaining 
the data for 53 weeks. There is a very contentious discussion about data 
retention in Europe. Wanting 53 weeks of data retention for DNT:1 while 
law enforcement will only get 24 weeks is a recipe for more contention. 
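
To make "aggregate on the fly" a bit more concrete, here is a minimal 
sketch in Python. It assumes the counting software sees one call per hit 
on tagged content and keeps only running totals per content item and DNT 
state. The class name, method names and bucket labels are hypothetical 
and not taken from any existing measurement product.

```python
from collections import Counter


class OnTheFlyAggregator:
    """Hypothetical counter: stores hit totals only, never per-request records."""

    def __init__(self):
        # Key: (content_id, dnt_bucket) -> number of hits seen so far.
        self.counts = Counter()

    def record_hit(self, content_id, dnt_header):
        # dnt_header is the raw DNT request header value: "1", "0",
        # or None when the browser sent no DNT header at all.
        bucket = {"1": "dnt1", "0": "dnt0"}.get(dnt_header, "unset")
        self.counts[(content_id, bucket)] += 1
        # Nothing else about the request (IP, cookie, user agent) is kept.

    def totals(self, content_id):
        # Aggregated view for one content item, split by DNT state.
        return {b: self.counts[(content_id, b)] for b in ("dnt0", "dnt1", "unset")}


agg = OnTheFlyAggregator()
agg.record_hit("article-42", "0")
agg.record_hit("article-42", "1")
agg.record_hit("article-42", None)
agg.record_hit("article-42", "0")
print(agg.totals("article-42"))
```

With something of this shape, what would need certifying is that 
record_hit() really discards everything except the counter increment. 
Whether such totals are enough for re-running month-over-month reports 
is a separate question that this sketch does not answer.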

=> Calibration is our central issue. How can we do calibration either 
with sufficient DNT:0 users or without data collection that foils the 
DNT:1 goals? This isn't easy.


On Wednesday 06 March 2013 13:34:22 Kathy Joe wrote:
> The panel output is calibrated by counting actual hits on tagged
> content and re-adjusting the results in order to ensure data produced
> from the panel accurately represents the whole audience. The counts
> must be pseudonymised. Counts are retained for sample, quality
> control, and auditing purposes during which time contractual measures
> must be in place to limit access to, and protect the data from other
> uses. A 53 week retention period is necessary so that month over
> month reports for a one year period may be re-run for quality
> checking purposes, after which the data must be de-identified. The
> counted data is largely collected on a first party basis, but to
> ensure complete representation, some will be third party placement.
> This collection tracks the content rather than involving the
> collection of a user's browser history.
Received on Thursday, 7 March 2013 17:15:33 UTC
