Re: ISSUE-25: Audience measurement research : answers to questions from Jonathan Mayer on 2013-07-24 (public-tracking@w3.org from July 2013)

From: Jonathan Mayer <jmayer@stanford.edu>
Date: Tue, 23 Jul 2013 17:55:07 -0700
To: Kathy Joe <kathy@esomar.org>
Cc: "public-tracking@w3.org" <public-tracking@w3.org>, Peter Swire <peter@peterswire.net>
Message-ID: <50AF787BF2A9435CB49775ED25E57ADD@gmail.com>
Kathy,  

Thanks, this helps.  If I read correctly…

"Calibrate" means correcting total panel bias with non-demographic and non-behavioral census data.

"Calculate" means correcting per-item panel bias with non-demographic and non-behavioral census data.

"Validate" means auditing of both panel and census data, with potentially > 1 year retention.

If those interpretations are correct, I would strongly urge an edit that clarifies the scope of the three terms.  They are, in the present draft, quite vague and broad.

Best,
Jonathan


On Tuesday, July 23, 2013 at 9:12 AM, Kathy Joe wrote:

> During and after last Wednesday’s discussion, some questions and comments came up about Issue 25 and audience measurement research.   
>   
> Because there is still some confusion about the rationale underlying the language used in Issue 25, and it is difficult to answer these questions piecemeal without referring to the distinct purpose and process of AMR, we have described both in the attached document.
>   
> For the questions below, please refer to pages 1 to 3:
>   
> Aleecia, Jonathan, Justin, Jeff, Ninja, John, Mike: Why should AMR have a permitted use to override users who have turned on DNT? Pages 1 & 2
>   
> Jonathan: What do "validate" and "calculate" mean?  How do they differ from the "calibrate" use case that we've discussed at length?  What use cases are we missing if we merely say "calibrate"? Pages 1 & 2
>   
> For the questions asked by Ed Felten and others about the Issue 25 text:
>   
> “The data collected by the third party:
> Must be pseudonymized before statistical analysis begins, such that unique key-coded data are used to distinguish one individual from another without identifying them”, and about the independent certification process, please refer to pages 2 & 3.
>  
> Hoping this is helpful  
>   
> Best regards
>   
>   
> Kathy Joe,
> Director, International Standards and Public Affairs
>  
> From: Dan Auerbach <dan@eff.org>
> Date: Wednesday, July 17, 2013 6:41 PM
> To: <public-tracking@w3.org>
> Subject: Re: ISSUE-25: text to be discussed in today's call
> Resent-From: <public-tracking@w3.org>
> Resent-Date: Wed, 17 Jul 2013 16:42:31 +0000
>  
> Hi Kathy,
>  
> I came late and so am hesitant to ask a question that has already been answered on the call. But I'm fundamentally confused about the need for audience measurement as a separate permitted use. The presentation in Sunnyvale on this topic (or at least I believe it is this topic) clearly demonstrated to me that audience measurement: (1) is able to account for users who do not send cookies via basic statistical techniques in order to accurately measure uniques, and (2) is interested only in broad aggregate data and reach statistics, which I understand to be covered by de-identified data (or green).
>  
> Suppose audience measurement is NOT adopted as a permitted use. What collection and retention activities would be prohibited that are necessary for audience measurement purposes?
>  
> Thanks,
> Dan
>  
> From: "Edward W. Felten" <felten@CS.Princeton.EDU>
> Date: Wednesday, July 17, 2013 5:42 PM
> To: Peter Swire <peter@peterswire.net>
> Cc: Kathy <kathy@esomar.org>, "public-tracking@w3.org" <public-tracking@w3.org>
> Subject: Re: ISSUE-25: text to be discussed in today's call
> Resent-From: <public-tracking@w3.org>
> Resent-Date: Wed, 17 Jul 2013 15:43:10 +0000
>  
> I plan to ask some questions about this text, time permitting.
>  
> First, on the requirement that data "Must be pseudonymized before statistical analysis begins, such that unique key-coded data are used to distinguish one individual from another without identifying them".   Questions about this:
>  
> (1) What does "identifying" mean in this text?   (One might read "without identifying" as requiring that data be "de-identified" according to the definition that appears elsewhere in the spec.   But if the data qualifies as de-identified then no permitted use is required here because the general safe harbor for de-identified data already applies.   Alternatively, if "identifying" means something different here, then that should be spelled out.)
>  
> (2) What does "unique key-coded data" mean?  Is the text about "unique key-coded data ..." meant to serve as a definition of "pseudonymized"?   If so, it seems overly prescriptive, requiring one particular method that (purportedly) qualifies as pseudonymized.    Alternatively, this text might be read as requiring a particular (purported) pseudonymization method.   If so, why require this particular method?
>  
> (3) Why allow pseudonymization to be delayed until "statistical analysis begins"?  Why not require pseudonymization to be done promptly when data is initially collected?
>  
>  
> Second, regarding the "independent certification process under the oversight of a generally-accepted market research industry organization that maintains a web platform providing user information about audience measurement research.   This web platform lists the parties eligible to collect information under DNT standards and the audience measurement research permitted use  ..."    
>  
> (1) The authors appear to have a specific organization in mind.  Which organization is that, and who runs it?    
>  
> (2) What is the rationale for giving a particular organization control over the certification process and the ability to declare who is eligible to exercise this permitted use?
>  
>  
>  
> On Wed, Jul 17, 2013 at 9:46 AM, Peter Swire <peter@peterswire.net> wrote:
> >
> > Good morning:
> >
> > So that we're sure to be on the same page for this, here is the normative and non-normative text on audience measurement for today's call.  Edits in red in light of recent discussions that Kathy Joe and her group have had with a number of WG members.
> >
> > Thanks,
> >
> > Peter
> >
> > ==
> >
> > Issue 25: Aggregated data collection and use for audience measurement research 4 July 2013
> >
> > Normative:
> >
> > Information may be collected, retained and used by a third party for audience measurement research
> > where the information is used to calibrate, validate or calculate through data collected from opted-in
> > panels, which in part contains information collected across sites and over time from user agents.
> >
> > A third party eligible for an audience measurement research permitted use MUST adhere to the
> > following restrictions. The data collected by the third party:
> >
> > • Must be pseudonymized before statistical analysis begins, such that unique key-coded data are
> > used to distinguish one individual from another without identifying them, and
> > • Must not be shared with any other party unless the data are de-identified prior to sharing, and
> > • Must be deleted or de-identified as early as possible after the purpose of collection is met and in
> > no case shall such retention, prior to de-identification, exceed 53 weeks and
> > • Must not be used for any other independent purpose including changing an individual’s user
> > experience or building a profile for ad targeting purposes.
> > • In addition, the third party must be subject to an independent certification process under the
> > oversight of a generally-accepted market research industry organization that maintains a web
> > platform providing user information about audience measurement research. This web platform lists
> > the parties eligible to collect information under DNT standards and the audience measurement
> > research permitted use and it provides users with an opportunity to exclude their data contribution.
> >
> > Non-normative: collection and use for audience measurement research
> >
> > Audience measurement research creates statistical measures of the reach in relation to the total
> > online population, and frequency of exposure of the content to the online audience, including paid
> > components of web pages.
> >
> > Audience measurement research for DNT purposes originates with opt-in panel output that is
> > calibrated by counting actual hits on tagged content on websites. The panel output is re-adjusted
> > using data collected from a broader online audience in order to ensure data produced from the panel
> > accurately represents the whole online audience.
> >
> > This online data is collected on a first party and third party basis. This collection tracks the content
> > accessed by a device rather than involving the collection of a user’s browser history. Audience
> > measurement is centered around specific content, not around a user.
> >
> > The collected data is retained for a given period for purposes of sample quality control, and
> > auditing. During this retention period contractual measures must be in place to limit access to, and
> > protect the data, as well as restrict the data from other uses. This retention period is set by auditing
> > bodies, after which the data must be de-identified.
> >
> > The purposes of audience measurement research must be limited to:
> >
> > · Facilitating online media valuation, planning and buying via accurate and reliable audience
> > measurement.
> > · Optimizing content and placement on an individual site.
> >
> > The term “audience measurement research” does not include sales, promotional, or marketing
> > activities directed at a specific computer or device. Audience measurement data must be reported as
> > aggregated information such that no recipient is able to build commercial profiles about particular
> > individuals or devices.
> >
> > Prof. Peter P. Swire
> > C. William O'Neill Professor of Law
> > Ohio State University
> > 240.994.4142
> > www.peterswire.net
> >  
> > Beginning August 2013:
> > Nancy J. and Lawrence P. Huang Professor
> > Law and Ethics Program
> > Scheller College of Business
> > Georgia Institute of Technology
> >  
> >  
> > From: Kathy Joe <kathy@esomar.org>
> > Date: Wednesday, July 17, 2013 9:24 AM
> > To: "public-tracking@w3.org" <public-tracking@w3.org>
> > Subject: ISSUE-25 re 5.2 Audience measurement: ACTION 415 June change proposal:
> > Resent-From: <public-tracking@w3.org>
> > Resent-Date: Wednesday, July 17, 2013 9:25 AM
> >  
> > Dear All,
> >  
> > Over the last few weeks, as agreed by the group, we have had several calls including Susan Israel, Richard Weaver, Adam Philips as well as Peter Swire with Rigo, Justin and Jeff. The wiki http://www.w3.org/wiki/Privacy/TPWG/Change_Proposal_Audience_Measurement is not yet updated to cover these exchanges, including:
> > - The mail from Rigo withdrawing his suggestion following my note to the group with a clarification
> > - The most recent submission following our call with Justin, with additional wording on 'pseudonymized'. It also includes clarification on the purpose of AMR (see attached; text in red is new text not in the wiki version)
> > - A note to Jeff Chester clarifying part of the non-normative text
> >  
> > I attach the email string below in advance of our call later today
> >  
> > Best regards
> >  
> > Kathy Joe
>  
>  
>  
> --  
> Edward W. Felten
> Professor of Computer Science and Public Affairs
> Director, Center for Information Technology Policy
> Princeton University
> 609-258-5906
> http://www.cs.princeton.edu/~felten
>  
>  
> Attachments:  
> - 23 July 2013 W3C answers.docx
>
Received on Wednesday, 24 July 2013 00:55:30 UTC