- From: Mike O'Neill <michael.oneill@baycloud.com>
- Date: Sun, 10 Mar 2013 23:31:21 -0000
- To: "'Kathy Joe'" <kathy@esomar.org>
- Cc: <public-tracking@w3.org>
- Message-ID: <0bee01ce1de7$5ff42610$1fdc7230$@baycloud.com>
Hi Kathy,

If the market research data set is being retained for tabulation and aggregation reruns then there is no reason to keep the UIDs. The easiest way to do that, and the most transparent, is to give the encoding cookies a very short lifetime. The data records would still be addressable with a key (the UID), but they will not be linkable back to the user-agent from which they had been collected. In my opinion this is the only way that an exemption for market research data (in the absence of DNT:0 aka Tracking Consent) could be acceptable.

Processing the retained data to remove identifying data in the URLs (people's names, email addresses etc. that may be in there), which is the only definition of pseudonymisation that makes sense, is a good idea.

I do not see much point in retaining truncated IP addresses; they might as well be deleted.

Mike

From: Kathy Joe [mailto:kathy@esomar.org]
Sent: 09 March 2013 11:33
To: rigo@w3.org; public-tracking@w3.org
Cc: peter@peterswire.net; justin@cdt.org
Subject: New text Issue 25: Aggregated data: collection and use for audience measurement research

Hi Rigo,

Thanks for your comments. Reducing the calibration to a small percentage of a specific group would create unreliable statistics because of bias, whilst the objective of audience measurement research is to provide confidence in the metrics.

The pseudonymised data is retained for that specific period so it can be re-run if month-by-month checks are needed, as required by the audience measurement standards defined by the joint industry bodies overseeing media measurement around the world, which also manage the auditing in their particular market.

http://www.abc.org.uk/PageFiles/50/Web%20Traffic%20Audit%20Rules%20and%20Guidance%20Notes%20version2%20March%202013%20master.pdf
http://www.i-jic.org/member.php?id=7&PHPSESSID=55143f172846ed39c7958cbeb837a85a
http://www.i-jic.org/index.php?PHPSESSID=55143f172846ed39c7958cbeb837a85a

Kathy

From: Rigo Wenning [mailto:rigo@w3.org]
To: public-tracking@w3.org
Cc: Kathy Joe [mailto:kathy@esomar.org], peter@peterswire.net, justin@cdt.org
Sent: Thu, 07 Mar 2013 18:15:03 +0100
Subject: Re: Fw: New text Issue 25: Aggregated data: collection and use for audience measurement research

I think key points are:

1/ Panels are not an issue as they are based on consent anyway. The question is rather how to leverage DNT:0 to better get to consent for panels. Everything that requires "out-of-band" stuff on the Web diminishes the utility of DNT and DNT:0.

2/ There is the calibration part. The devil is in the detail here. What would be the smallest percentage of DNT:0 users in a given clickstream so that calibration could still happen? Because I think if calibration is done with DNT:0 users, there is no issue. (Get a web-wide exception.)

Instead of keeping the data, what about aggregating on the fly and having the software certified by somebody? This would avoid the retention of data over 53 weeks. There is a very contentious discussion about data retention in Europe. Wanting 53 weeks of data retention for DNT:1 while law enforcement will only get 24 weeks is a recipe for more contention.

=> Calibration is our central issue. How can we do calibration either with sufficient DNT:0 users or without data collection that foils the DNT:1 goals? This isn't easy.

--Rigo

On Wednesday 06 March 2013 13:34:22 Kathy Joe wrote:
> The panel output is calibrated by counting actual hits on tagged
> content and re-adjusting the results in order to ensure data produced
> from the panel accurately represents the whole audience. The counts
> must be pseudonymised. Counts are retained for sample, quality
> control, and auditing purposes during which time contractual measures
> must be in place to limit access to, and protect the data from other
> uses. A 53 week retention period is necessary so that month over
> month reports for a one year period may be re-run for quality
> checking purposes, after which the data must be de-identified. The
> counted data is largely collected on a first party basis, but to
> ensure complete representation, some will be third party placement.
> This collection tracks the content rather than involving the
> collection of a user's browser history.

Received on Sunday, 10 March 2013 23:32:01 UTC