Re: ISSUE-25 on the agenda for the October 02 call from Nicholas Doty on 2013-10-16 (public-tracking@w3.org from October 2013)

From: Nicholas Doty <npdoty@w3.org>
Date: Tue, 15 Oct 2013 20:28:04 -0700
To: Rob Sherman <robsherman@fb.com>, Kathy Joe <kathy@esomar.org>
Cc: "Matthias Schunter (Intel Corporation)" <mts-std@schunter.org>, "(public-tracking@w3.org) Working Group" <public-tracking@w3.org>, "Edward W. Felten" <felten@CS.Princeton.EDU>, David Stark <david.stark@gfk.com>, Richard Weaver <rweaver@comscore.com>, Ronan Heffernan <ronan.heffernan@nielsen.com>, Elise Berkower <elise.berkower@nielsen.com>, George.Pappachen@kantar.com, Adam Phillips <adam.phillips@realresearch.co.uk>, Susan Israel <Susan_Israel@Comcast.com>
Message-Id: <2D3FB9C4-DBEB-4FCB-B5F6-22A6E92103E8@w3.org>
Based on Kathy's comments here and on last week's call, I've updated the change proposal to move the independent certification process text into the non-normative subsection. As non-normative sections don't introduce normative requirements (which we note with the terms "MUST", "SHOULD", "MAY" in all caps), I've modified the text to remove the "MUST":

> Parties conducting audience measurement might be subject to an independent certification process under the oversight of a generally-accepted market research industry organization that maintains a web platform providing user information about audience measurement research. This web platform lists the parties eligible to collect information under DNT standards and the audience measurement research permitted use and it provides users with an opportunity to exclude their data contribution.

I would also suggest that we do that for all of the non-normative section (if that section is indeed not intended to introduce additional requirements on implementers).

Thanks,
Nick

Reminder of the wiki link for this change proposal: http://www.w3.org/wiki/Privacy/TPWG/Change_Proposal_Audience_Measurement#Audience_Measurement_Permitted_Use

On October 6, 2013, at 8:55 PM, Rob Sherman <robsherman@fb.com> wrote:

> Kathy, thanks so much for your flexibility on this.  It does seem a bit unusual to call out a specific compliance regime in a W3C spec — particularly as we're not doing so anywhere else in our draft — but I take your point that the permitted use you are proposing is narrow and should be taken as such.
> 
> Perhaps the easiest way to approach this would be consistent with what we are doing with the other issues that are on the table:  remove the last bullet of your normative text as you suggest, and focus right now on normative text, then consider what if any non-normative text is needed once we've resolved normative.  When we get to non-normative text later, I'd certainly be open to describing a certification model as one example — but given where we are in the procedure I don't think we need to reach this question yet.
> 
> Rob Sherman
> Facebook | Manager, Privacy and Public Policy
> 1299 Pennsylvania Avenue, NW | Suite 800 | Washington, DC 20004
> office 202.370.5147 | mobile 202.257.3901
> 
> Subject: ISSUE-25 on the agenda for the October 02 call
> 
> Hi Rob,
>  
> Many thanks for your note.  
>  
> Whilst there might be a range of audience measurement techniques, Issue 25 is specifically in connection with calibrating data obtained via opted-in panels.
>  
> The key point is that, since Issue 25 requires that only aggregated data be provided to clients and that PII collected for AMR not be released for other purposes, we believe independent oversight is needed to check that companies claiming the AMR exemption are complying, with consistent application worldwide. That oversight would also supply consumer information, giving users an additional level of transparency and education.
>  
> We would be willing to move the paragraph on the ‘independent certification process’ to the non-normative section, especially as it was pointed out that the W3C standards do not include other compliance requirements. We also remain open to further discussion as the standard evolves in practice.
> Kathy Joe,
> Director, International Standards and Public Affairs
> 
> Subject: Re: ISSUE-25 on the agenda for the October 02 call
> 
> Kathy,
> 
> I apologize that I missed the call today and wasn't able to participate in the discussion, but I do have a question about the last point that Ed raised below:  I understand that AMR members have a particular framework in mind, but it seems most sensible to develop a permitted use for audience measurement that would apply to any party that wanted to engage in that practice, regardless of whether it was a member of a particular association or had a particular auditor.  Would you consider modifications to the proposal that would make an association membership/auditing component optional but that would enable other parties to comply even if they were not eligible to or chose not to join the association?  
> 
> I think this comes up most significantly in the last bullet of your normative text, but there may be aspects of the non-normative text that are helpful for explanation within this group as we decide on what is the right path forward but that so specifically describe particular companies' business models that they're less helpful in a specification.
> 
> Thanks.
> 
> Rob
> 
> Rob Sherman
> Facebook | Manager, Privacy and Public Policy
> 1299 Pennsylvania Avenue, NW | Suite 800 | Washington, DC 20004
> office 202.370.5147 | mobile 202.257.3901
> 
> 
> Hi there
>  
> We note that ISSUE-25 is on the agenda for today and wanted to provide the group with answers to Ed Felten’s questions:
>  
> http://www.w3.org/wiki/Privacy/TPWG/Change_Proposal_Audience_Measurement/Open_Questions
>  
> Regards
>  
> Kathy Joe
>  
> Ed Felten’s questions:
> 1.   What does "identifying" mean in this text? (One might read "without identifying" as requiring that data be "de-identified" according to the definition that appears elsewhere in the spec. But if the data qualifies as de-identified then no permitted use is required here because the general safe harbor for de-identified data already applies. Alternatively, if "identifying" means something different here, then that should be spelled out.)
> 2.   What does "unique key-coded data" mean? Is the text about "unique key-coded data ..." meant to serve as a definition of "pseudonymized"? If so, it seems overly prescriptive, requiring one particular method that (purportedly) qualifies as pseudonymized. Alternatively, this text might be read as requiring a particular (purported) pseudonymization method. If so, why require this particular method?
>  
> Answer: The controls regarding the census data include assigning a random number to the record and obfuscating the last three digits of the IP address. These are the current minimum requirements. Different companies may adopt further pseudonymization practices for technical reasons, and these may change with technology or with national law; e.g., in Germany the IP address must be hashed as well.
> If there is future international agreement on pseudonymization standards or definitions, we will adhere to them as they become available, provided they are stricter than our own. The census data is held securely, as is all audience research data, and deleted within the maximum time period for validation and auditing.
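
A minimal sketch of the two minimum controls named in the answer above (a random number per record, and an obfuscated IP address). It assumes that "the last three digits" means the final octet of an IPv4 address and that obfuscation means zeroing it; neither is stated in the thread. The function names and record fields are hypothetical and do not describe any AMR provider's actual pipeline.

```python
import ipaddress
import secrets


def mask_last_octet(ip: str) -> str:
    """Zero the final octet of an IPv4 address.

    Assumption: this is one reading of "obfuscating the last three digits".
    """
    addr = ipaddress.IPv4Address(ip)
    masked = int(addr) & 0xFFFFFF00  # keep the first three octets only
    return str(ipaddress.IPv4Address(masked))


def pseudonymize(record: dict) -> dict:
    """Return a copy of the record with a random ID and a masked IP.

    The random ID is not derived from the IP or any other field, so it
    distinguishes records without pointing back to a device by itself.
    """
    out = dict(record)
    out["record_id"] = secrets.token_hex(16)  # 128-bit random identifier
    out["ip"] = mask_last_octet(record["ip"])
    return out
```

For example, `pseudonymize({"ip": "192.0.2.123", "url": "/a"})` keeps the `url` field, replaces the IP with `192.0.2.0`, and adds a fresh `record_id`. Hashing the address instead, as mentioned for Germany, would be a different technique and is not shown here.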
>  
> We note that the wording is open to misinterpretation because the data is pseudonymized during processing, and then aggregated (i.e., de-identified) data is provided to clients as statistical reports. Therefore, without specifying the method used for pseudonymization, alternative wording could describe a testable outcome:
>  
> CURRENT TEXT: The data collected by the third party:
> Must be pseudonymized before statistical analysis begins, such that unique key-coded data are used to distinguish one individual from another without identifying them.  
>  
> NEW PROPOSAL: The data collected by the third party:
> Must be pseudonymized before statistical analysis begins, such that it is possible to distinguish one individual from another but the data, by itself, cannot be attributed to a specific device.  
>  
> Ed Felten’s question:
> 3.   Why allow pseudonymization to be delayed until "statistical analysis begins"? Why not require pseudonymization to be done promptly when data is initially collected?
> Answer: This data first needs to be filtered on a continuous basis to detect fraudulent activity such as web bots. As the campaign progresses, additional doubtful elements may be detected, and the data then needs to be re-processed to check that they are removed. Once it is certain that the data is clean, it is pseudonymized before analysis.
>  
> Ed Felten’s questions:
> The "independent certification process under the oversight of a generally-accepted market research industry organization that maintains a web platform providing user information about audience measurement research. This web platform lists the parties eligible to collect information under DNT standards and the audience measurement research permitted use ..."
>  
> 4.   The authors appear to have a specific organization in mind. Which organization is that, and who runs it?
> 5.   What is the rationale for giving a particular organization control over the certification process and the ability to declare who is eligible to exercise this permitted use?
>  
> Answer: The proposal for Issue 25 has been developed by the major global providers of AMR. This paper is intended to clarify why the proposal has been written in the way it has and to help people who are not familiar with this kind of market research understand how our industry works to protect consumers’ personal information, ensure that advertising money is spent efficiently, and encourage effective competition and good innovation by media publishers. We have tried to incorporate sufficient protections in the specification to reassure the members of W3C that this is in fact the case, but we remain willing to discuss further clarifications or amendments that would provide additional clarity and reassurance.
>  
> As noted, explanations and opt-outs are currently offered by AMR providers separately, and there are various self-regulatory mechanisms already in place. The intention in Issue 25 is to provide an additional level of transparency and education for users, noting that this use case is not immediately apparent even for experts in this W3C group. We think that a common AMR explanation and opt-out will help users understand the purpose, and ensure that this permitted use remains within the boundaries specified by the W3C standard. The body would be set up with the participating research companies as founder members with expert oversight, and all companies operating in this field are welcome to join. We remain open to moving this into the non-normative section of Issue 25 and to further discussion as the standard evolves in practice.
>  
> Kathy Joe,
> Director, International Standards and Public Affairs
>
Received on Wednesday, 16 October 2013 03:28:18 UTC