Re: Frequency Capping from Tamir Israel on 2012-07-13 (public-tracking@w3.org from July 2012)

From: Tamir Israel <tisrael@cippic.ca>
Date: Fri, 13 Jul 2012 12:03:46 -0400
To: Peter Cranstone <peter.cranstone@3pmobile.com>
CC: Chris Mejia <chris.mejia@iab.net>, Peter Eckersley <peter.eckersley@gmail.com>, Jonathan Mayer <jmayer@stanford.edu>, "Grimmelmann, James" <James.Grimmelmann@nyls.edu>, W3C DNT Working Group Mailing List <public-tracking@w3.org>, Mike Zaneis <mike@iab.net>, Brendan Riordan-Butterworth <Brendan@iab.net>
Message-ID: <500046E2.8090503@cippic.ca>

The definition I am working with here (conceptually; yes, there is no
consensus yet, I know) is: tracking = third-party collection of a unique
identifier across more than one site.

On 7/12/2012 4:33 PM, Peter Cranstone wrote:
> Chris,
>
> What is your definition of tracking? And how does that align or not 
> with the current DNT spec. Once we have a definition we can all agree 
> on then these discussions e.g. f-capping and exceptions can be solved.
>
> Without an agreed upon definition we just end up in endless debates.
>
>
>
> Peter
> _________________________
> Peter J. Cranstone
> CEO.  3PMobile
> Boulder, CO  USA
>
>
> /Improving the Mobile Web Experience/
> Cell: 720.663.1752
> Skype: cranstone
> www.3pmobile.com <http://www.3pmobile.com/>
>
>
> From: Chris Mejia <chris.mejia@iab.net <mailto:chris.mejia@iab.net>>
> Date: Thursday, July 12, 2012 12:35 PM
> To: Tamir Israel <tisrael@cippic.ca <mailto:tisrael@cippic.ca>>
> Cc: Peter Eckersley <peter.eckersley@gmail.com 
> <mailto:peter.eckersley@gmail.com>>, Jonathan Mayer 
> <jmayer@stanford.edu <mailto:jmayer@stanford.edu>>, "Grimmelmann, 
> James" <James.Grimmelmann@nyls.edu 
> <mailto:James.Grimmelmann@nyls.edu>>, W3C DNT Working Group Mailing 
> List <public-tracking@w3.org <mailto:public-tracking@w3.org>>, Mike 
> Zaneis <mike@iab.net <mailto:mike@iab.net>>, Brendan 
> Riordan-Butterworth <Brendan@iab.net <mailto:Brendan@iab.net>>
> Subject: Re: Frequency Capping
> Resent-From: <public-tracking@w3.org <mailto:public-tracking@w3.org>>
> Resent-Date: Thursday, July 12, 2012 12:36 PM
>
>     CM:  To add useful context on why the alternative f-capping
>     methods Prof. Felten proposed in his FTC blog post are not
>     technically feasible, my colleague in the IAB's Advertising
>     Technology group has also weighed in
>     (http://techatftc.wordpress.com/2012/07/03/privacy-by-design-frequency-capping/)
>     and I am sharing his technical comments below.
>
>     Chris Mejia | Digital Supply Chain Solutions | Ad Technology Group
>     | Interactive Advertising Bureau - IAB
>
>         Brendan Riordan-Butterworth <http://www.iab.net/>
>         July 11, 2012 at 4:26 pm
>         <http://techatftc.wordpress.com/2012/07/03/privacy-by-design-frequency-capping/#comment-160>
>
>         Last week I shared my thoughts about your post with my peers
>         here at the IAB, and was today urged to share with this larger
>         audience.
>
>         Your initial thoughts on moving information storage to the
>         client cookie jar are incorrect. Client-side frequency capping
>         doesn’t fail “because ad placement decisions are normally made
>         on the ad network’s servers but the frequency information will
>         now be stored elsewhere,” it fails because client-side storage
>         of the frequency state of multiple campaigns has the potential
>         of overloading the cookie header, and because updating client
>         state can be blocked in 3rd party scenarios. Having the
>         frequency state on the client (and therefore on the inbound
>         HTTP request) can actually make frequency capping easier on
>         the server side, because you don’t have to propagate this
>         state across all the physical ad servers.
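
A minimal sketch of the client-side approach described in the paragraph above, assuming a single cookie that carries per-campaign impression counts. The cookie format, the function names, and the 4096-byte budget are illustrative assumptions, not anything taken from an actual ad server. It shows both points made above: the server can decide from the inbound request alone, and the state stops fitting once enough campaigns accumulate.

```python
from typing import Dict, Tuple

# Assumed size budget for one cookie value; real limits vary by browser.
COOKIE_BUDGET_BYTES = 4096


def parse_caps(cookie_value: str) -> Dict[str, int]:
    """Decode 'campaignA:2|campaignB:1' into {campaign_id: count}."""
    counts: Dict[str, int] = {}
    for item in cookie_value.split("|"):
        if not item:
            continue
        campaign_id, _, count = item.partition(":")
        counts[campaign_id] = int(count)
    return counts


def serialize_caps(counts: Dict[str, int]) -> str:
    """Encode {campaign_id: count} back into the cookie value format."""
    return "|".join(f"{cid}:{n}" for cid, n in sorted(counts.items()))


def serve(cookie_value: str, campaign_id: str, cap: int) -> Tuple[bool, str]:
    """Decide from the inbound cookie alone; return (show_ad, new_cookie_value)."""
    counts = parse_caps(cookie_value)
    if counts.get(campaign_id, 0) >= cap:
        return False, cookie_value  # cap reached, state unchanged
    counts[campaign_id] = counts.get(campaign_id, 0) + 1
    new_value = serialize_caps(counts)
    if len(new_value.encode("utf-8")) > COOKIE_BUDGET_BYTES:
        # The overload case: too many campaigns to carry in one cookie.
        raise ValueError("frequency state no longer fits in the cookie")
    return True, new_value
```

No server-side lookup is involved, so nothing has to be propagated between ad servers. The sketch also relies on the browser accepting the updated cookie, which is the step that can be blocked in third-party scenarios.
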
>
>         With regard to the “second way”, tech may have changed, but
>         in my experience profitable ad servers record the minimum
>         required information in order to bill – recording something
>         like the HTTP REFERER (i.e., what page the ad was delivered
>         onto) can increase log record size several times over,
>         significantly increasing hardware costs and data processing
>         latency. That said, recording the publisher ID or campaign ID
>         (i.e., what site/network is supposed to be delivering the ad) is
>         standard practice, since you need to know who to pay. As Brian
>         O’Kelley has indicated, minimizing the set of data stored is
>         still sensible design for performance.
>
>         Implied in Brian’s comment is that there’s a “frequency”
>         record for every userID – the data structures for targeting in
>         current systems use user pseudonyms as primary keys for
>         targeting data, including frequency capping. Moving to a hash
>         of userID/CampaignID for storing frequency capping multiplies
>         the number of records in the data structure storing
>         information by the number of campaigns (and possibly the
>         number of advertisers), thereby incurring additional cost in
>         storage and look up time, in addition to the (minor) hashing
>         cost of inbound userID/CampaignID.
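
A small sketch of the record-count argument in the paragraph above, under the assumption that the "hash of userID/CampaignID" is a digest over the concatenated pair. The helper `composite_key`, the choice of SHA-256, and the sample impressions are hypothetical. Keying by user gives one record per user; keying by the hashed pair gives one record per (user, campaign) combination actually seen, which is at most users times campaigns.

```python
import hashlib
from collections import defaultdict
from typing import Dict


def composite_key(user_id: str, campaign_id: str) -> str:
    """Hash the (user, campaign) pair so the raw user ID is not the key."""
    data = f"{user_id}\x00{campaign_id}".encode("utf-8")
    return hashlib.sha256(data).hexdigest()


# Hypothetical impression log: (user pseudonym, campaign ID).
impressions = [
    ("u1", "c1"), ("u1", "c2"), ("u1", "c1"),
    ("u2", "c1"), ("u2", "c3"),
]

# Layout 1: user pseudonym as primary key, counts nested per campaign.
by_user: Dict[str, Dict[str, int]] = defaultdict(dict)
# Layout 2: one flat record per hashed (user, campaign) pair.
by_pair: Dict[str, int] = {}

for user_id, campaign_id in impressions:
    by_user[user_id][campaign_id] = by_user[user_id].get(campaign_id, 0) + 1
    key = composite_key(user_id, campaign_id)
    by_pair[key] = by_pair.get(key, 0) + 1

print(len(by_user), "records keyed by user")
print(len(by_pair), "records keyed by hashed pair")
```

The second layout cannot answer "what else has this user seen" from a single key, which is its privacy appeal, but every cap check now costs a hash plus a lookup in the larger table.
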
>
>         I think the suggestion of using bloom filters for storing
>         frequency capping assumes that there is an absolute maximum
>         number of times a specific userID can be shown an ad – a
>         strictly additive situation. However, as Brian pointed out,
>         the capping happens at intervals less than the campaign
>         duration – and the intervals are user specific. You could
>         implement by recording whether UserID interacted with
>         CampaignID during arbitrary intervals, but doing so would
>         generate either a significant increase of items to store, or
>         the additional complexity of maintaining time sequenced Bloom
>         filters, and either implementation loses out on some
>         granularity of timing.
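
A rough sketch of the "time sequenced Bloom filters" option mentioned in the paragraph above, assuming one filter per fixed time bucket. The class names, filter size, and bucket length are illustrative assumptions. Unlike the user-specific intervals described above, the buckets here are global, and the result is a count of buckets in which the pair was seen, not a count of impressions.

```python
import hashlib
from typing import Dict, Iterator


class BloomFilter:
    """Tiny Bloom filter; membership tests can give false positives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3) -> None:
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # Python int used as a bit array

    def _positions(self, item: str) -> Iterator[int]:
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item: str) -> None:
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(item))


class WindowedCap:
    """One Bloom filter per time bucket, keeping the newest num_buckets."""

    def __init__(self, bucket_seconds: int, num_buckets: int) -> None:
        self.bucket_seconds = bucket_seconds
        self.num_buckets = num_buckets
        self.buckets: Dict[int, BloomFilter] = {}

    def _bucket(self, now: float) -> int:
        return int(now // self.bucket_seconds)

    def record(self, user_id: str, campaign_id: str, now: float) -> None:
        current = self._bucket(now)
        bloom = self.buckets.setdefault(current, BloomFilter())
        bloom.add(f"{user_id}|{campaign_id}")
        # Drop filters that have aged out of the window.
        for old in [b for b in self.buckets if b <= current - self.num_buckets]:
            del self.buckets[old]

    def buckets_seen(self, user_id: str, campaign_id: str, now: float) -> int:
        """Number of in-window buckets where the pair was (probably) seen."""
        current = self._bucket(now)
        key = f"{user_id}|{campaign_id}"
        return sum(
            1
            for b, bloom in self.buckets.items()
            if current - self.num_buckets < b <= current and key in bloom
        )
```

Two impressions inside the same bucket count once, which is the loss of timing granularity noted above, and the number of filters to maintain grows with the number of buckets in the window.
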
>
>
>     Tamir Israel wrote:
>
>     /"Now -- if you're saying there is a good reason to collect here
>     because the costs of doing otherwise are exponential and the
>     benefits minimal, that is a discussion we can engage in
>     meaningfully. But we seem to be unable to get to that step."/
>
>
>     CM:  We have always maintained there is a good reason to do
>     frequency capping.  There are critical technical issues with the
>     alternatives proposed—I believe we have addressed those now in
>     this forum/thread.
>
>     /"This is not an unusual definition of 'privacy harm'. In fact,
>     the basis of most privacy protective regimes since the OECD
>     guidelines and CoE Convention 108 has been to minimize collection
>     to what is necessary."/
>
>     CM:  I take some exception to your rather loose definition
>     (through inference) of the word "harm".  When I look up the word
>     harm in the Merriam-Webster dictionary (provided free of charge
>     online now, and advertising supported:
>     http://www.merriam-webster.com/dictionary/harm), I found the
>     following definition consistent with a common understanding of the
>     term:
>
>
>         Definition of /HARM/
>
>     1: physical or mental damage: injury
>        <http://www.merriam-webster.com/dictionary/injury>
>     2: mischief <http://www.merriam-webster.com/dictionary/mischief>,
>        hurt <http://www.merriam-webster.com/dictionary/hurt>
>
>     I fail to see where the collection of information for the purpose
>     of frequency capping the delivery of a single ad creative to a
>     user causes users "physical or mental damage" or "injury" (context
>     for examination should be the reasonable interpretation of the
>     term based on its definition).  I also fail to see where the
>     companies that engage in this user-friendly practice are engaging
>     in "mischief" or causing "hurt".  So where is the actual harm?  I
>     don't believe there is any harm being done to a user or their
>     privacy through this practice.  On the contrary, it doesn't seem
>     unreasonable to claim that some harm can be done when users are
>     delivered the same ads over and over again, indiscriminately.  And
>     when users lose access to content altogether, or have to pay to
>     access content that was previously available free of charge based
>     on a poorly implemented DNT mechanism, because publishers can no
>     longer afford to provide advertising-supported content or must
>     charge for content (i.e. pay walls), clearly harm will have been
>     done— but this harm will have been caused by an irresponsible DNT
>     mechanism.
>
>     Regarding your comment about data minimization, with respect to
>     f-capping, I believe the practice of data minimization for
>     f-capping is already the industry practice (and I'd like to see
>     any actual data to the contrary), based simply on costs
>     (latency/storage/monetary) associated with the practice (as
>     outlined above by Brendan).
>
>     From: Tamir Israel <tisrael@cippic.ca <mailto:tisrael@cippic.ca>>
>     Date: Thu, 12 Jul 2012 12:36:38 -0400
>     To: Chris Mejia - IAB <chris.mejia@iab.net
>     <mailto:chris.mejia@iab.net>>
>     Cc: Peter Eckersley <peter.eckersley@gmail.com
>     <mailto:peter.eckersley@gmail.com>>, Jonathan Mayer
>     <jmayer@stanford.edu <mailto:jmayer@stanford.edu>>, "Grimmelmann,
>     James" <James.Grimmelmann@nyls.edu
>     <mailto:James.Grimmelmann@nyls.edu>>, W3C DNT Working Group
>     Mailing List <public-tracking@w3.org
>     <mailto:public-tracking@w3.org>>, Mike Zaneis - IAB <mike@iab.net
>     <mailto:mike@iab.net>>, Brendan Riordan-Butterworth - IAB
>     <brendan@iab.net <mailto:brendan@iab.net>>
>     Subject: Re: Frequency Capping
>
>     On 7/12/2012 12:12 PM, Chris Mejia wrote:
>
>>     I think what Peter is referring to is that some users might view
>>     the very fact that they are being tracked in order to facilitate
>>     the advertising activities of many random third parties they have
>>     never interacted with to be a 'privacy harm'. As I noted
>>     previously, we can start debating the relative costs/benefits of
>>     an F-cap approach that is more privacy protective, if only
>>     someone from industry were willing to provide a sense of how
>>     privacy-friendly F-capping can be done.
>>
>>             CM:  Please see Brian O'Kelley's description of f-capping
>>             pasted below.  In my experience, this description closely
>>             describes the most common practice for f-capping.
>>
>>
>>     So far, I have not seen this, nor have I seen any direct
>>     substantive responses to why the alternative F-capping proposals
>>     suggested by some are not workable. A good faith attempt to
>>     resolve a problem would entail these very engineers that you are
>>     referring to engaging, in good faith, in attempts to solve what
>>     is, at first instance, a technical problem.
>>
>>             CM:  In fact this thread started as a result of Prof. Ed
>>             Felten's FTC blog post
>>             (http://techatftc.wordpress.com/2012/07/03/privacy-by-design-frequency-capping/).
>>             David Wainberg called our attention to Brian O'Kelley's
>>             comments posted to Prof. Felten's blog.  Brian O'Kelley
>>             is the founder and CEO of AppNexus (he was also a founder
>>             at RightMedia) and is one of the foremost advertising
>>             technology engineers in the industry.  Brian's comments
>>             directly refuted the methods outlined by Prof. Felten,
>>             based in large part on severe performance issues
>>             (unacceptable ad serving performance that would
>>             increase page load times) and scale issues.
>>             Jonathan Mayer then challenged Brian's critique of Prof.
>>             Felten's blog post, but here on the W3C forum (2nd post in
>>             this thread I believe).  Since I realize that Brian's
>>             comments were never brought directly into this forum, I'm
>>             repasting them here now:
>>
>     OK, thanks Chris, I understand better where you're coming from
>     now. I'd say, to start, that I don't think Brian O'Kelley's method
>     is the standard. I took it as something AppNexus does that is
>     somewhat unusual (someone please correct me if I am
>     wrong). But regardless, the problem with your question (provide
>     some evidence that servers are using the unique ID from F-capping
>     in order to track user browsing) is that, of course, there
>     is absolutely no way to do this since it happens invisibly on the
>     server.
>
>     The majority of users might trust many online advertisers not to
>     do this kind of thing, but there are now /so many/ advertisers out
>     there accessing unique IDs at each and every site a user visits,
>     that the best way to monitor 'no tracking' is to prevent
>     collection of unique identifiers by untrusted third parties (as
>     opposed to trying to prevent server-side correlation once
>     collection has occurred).
>
>     This is not an unusual definition of 'privacy harm'. In fact, the
>     basis of most privacy protective regimes since the OECD guidelines
>     and CoE Convention 108 has been to minimize collection to what is
>     necessary.
>
>     Now -- if you're saying there is a good reason to collect here
>     because the costs of doing otherwise are exponential and the
>     benefits minimal, that is a discussion we can engage in
>     meaningfully. But we seem to be unable to get to that step.
>>
>>
>>
>>>     Finally, please pardon my ignorance (as I don't know you); what
>>>     organization and constituency do you represent?  You haven't
>>>     provided a signature line indicating your affiliation and you
>>>     are writing to this forum from a gmail address, so I was not
>>>     able to ascertain your affiliation, if any, from this
>>>     information.  In the interest of full disclosure, I represent
>>>     the membership of the Interactive Advertising Bureau (IAB –
>>>     www.iab.net) where I work in the Advertising Technology Group
>>>     with industry engineers and operations professionals on
>>>     technical specifications, technical protocols and technical
>>>     guidance.
>>     With respect, Chris, I don't think this is productive. If it
>>     really is helpful to start throwing around credentials, I will
>>     say that CIPPIC (the public interest NGO I represent) is
>>     supportive on this point of the Stanford (Jonathan)/EFF
>>     (Peter)/Mozilla (Tom) compromise proposal which was presented
>>     to the group here a few weeks back and which did not include any
>>     explicit exception for tracking users for the purpose of F-caps.
>>
>>             CM:  Tamir, when making my request to understand Peter's
>>             affiliation, I did ask that he "/please pardon my
>>             ignorance/"; this was sincere.  I don't know Peter and I
>>             honestly did not understand his affiliation— as you might
>>             appreciate, operating in this political circle is not my
>>             usual job (I'm a technologist, not a politician, so
>>             again, please pardon my ignorance with respect to your
>>             world).  I was asking as a point of clarification, so I
>>             could further appreciate his POV.  Understanding where
>>             someone comes from allows me to better understand the
>>             context of their position.  I also take some offense with
>>             the notion that I was "/throwing around credentials/"; I
>>             was simply stating my own affiliation, out of respect, as
>>             I had requested the same of Peter.  Please don't turn
>>             this into a political you guys vs. us guys thing— I don't
>>             find that productive at all.  I'd rather focus our debate
>>             on the merits of all particular arguments being
>>             presented. (BTW- your affiliation was clear from your
>>             email address)
>>
>     OK. My bad. We can all blame gmail now : )
>>
>>
>>     Best,
>>     Tamir
>>
Received on Friday, 13 July 2012 16:04:44 UTC