Re: Frequency Capping from Jonathan Mayer on 2012-07-13 (public-tracking@w3.org from July 2012)

From: Jonathan Mayer <jmayer@stanford.edu>
Date: Fri, 13 Jul 2012 13:39:35 -0700
To: Brian O'Kelley <bokelley@appnexus.com>
Cc: Chris Mejia <chris.mejia@iab.net>, David Wainberg - NAI <david@networkadvertising.org>, W3C DNT Working Group Mailing List <public-tracking@w3.org>, Brendan Riordan-Butterworth <Brendan@iab.net>, Mike Zaneis <mike@iab.net>
Message-ID: <8652E473B61342198FCC6D1FB4ADF7F4@gmail.com>
On Friday, July 13, 2012 at 4:50 AM, Brian O'Kelley wrote:

> Jonathan,
>  
> Could you say a bit more about the potential costs of implementing a different approach to frequency capping?
>  
> BOK: Right now we hit our userdata store once per impression and get all the historical user data. Imagine we have to evaluate the top 10 campaigns after we sort by price - that's 30 queries (campaign cap + creative cap + advertiser cap) x 10. That would imply 30x the servers in the userdata cluster - right now we have ~30 servers, so you're talking about adding almost 1000 servers with no functionality gain. Overall, that takes us from ~2500 servers to ~3500 servers. It's material.

There's no doubt that the hashing approach Ed detailed imposes greater database load.  The client-side approach I favor, on the other hand, would reduce database size and load by distributing storage and lookups to web browsers.  (I would add that I don't think the hashing approach provides that much privacy.  It only addresses the (uncertain) risks associated with a user's impression history.  It doesn't mitigate the risks associated with the presence of an ID cookie.  Furthermore, it doesn't do much once a user ID or set of user IDs is known since hash computations have become so fast and cheap.)

As for the requirements imposed by the hashing approach: they'll vary by implementation.  If a database has to handle n times more queries per unit time, that certainly doesn't mean it needs n times more servers or will cost n times as much.  Choices about processing and storage hardware, the database solution, and the database schema will have a tremendous impact.  Some (relatively) recent stress testing by Netflix suggests Apache Cassandra NoSQL could be a promising direction (http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html).
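
To make the linear-scaling assumption behind the estimate quoted above explicit, here is a minimal sketch of that arithmetic. The function and parameter names are my own labels, and the inputs are the approximate figures quoted above (3 cap lookups for each of 10 campaigns, ~30 userdata servers, ~2500 servers overall). It assumes server count grows in direct proportion to query count, which is a worst case rather than a prediction.

```python
def linear_scaling_estimate(caps_per_campaign, candidate_campaigns,
                            userdata_servers, total_servers):
    """Worst-case estimate: server count grows in proportion to query count."""
    # Baseline is one userdata lookup per impression.
    queries_per_impression = caps_per_campaign * candidate_campaigns
    # Assumption: n times the queries needs n times the userdata servers.
    scaled_userdata = userdata_servers * queries_per_impression
    added = scaled_userdata - userdata_servers
    return queries_per_impression, scaled_userdata, total_servers + added


queries, userdata, total = linear_scaling_estimate(3, 10, 30, 2500)
print(queries, userdata, total)  # 30 900 3370
```

Under those inputs the linear estimate is 900 userdata servers and about 3,370 overall, in line with the rounded "almost 1000" and "~3500" figures.  Any sub-linear scaling from hardware, database, or schema choices would bring the number down.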
> > However, one of the core tenets of performance advertising is the use of frequency data to predict response. It's part of the holy trinity (creative, placement, frequency) that dictates the success of direct response advertising. For performance advertising to work we need to look up frequency before we figure out the price, and that can't happen with the reordered algorithm you suggest.
> >  
> >  
> >  
>  
> Ok, so how about this algorithm:
>  
> 1) Begin with the set of all campaigns.
>  
> 2) Filter by targeting criteria.
>  
> 3) Assign an expected revenue to each campaign, without knowing actual impression frequency counts.  An implementation might use a trivial heuristic (e.g. assume no impressions or average impression frequency) or something more sophisticated (e.g. use a probability distribution of impression frequency).
>  
> 4) Select the n campaigns with greatest expected revenue.
>  
> 5) Filter by frequency capping.
>  
> 6) Assign a new expected revenue to each campaign using actual impression frequency counts.
>  
> 7) Select the campaign with greatest expected revenue.
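
The seven steps above can be sketched roughly as follows.  This is a minimal illustration, not anyone's production ad server: the `Campaign` fields, the `select_campaign` signature, and the 1/(1+k) multiplier in the usage lines are all hypothetical, and step 3 uses the trivial "assume no impressions" heuristic.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Campaign:
    name: str
    base_revenue: float  # expected revenue before any frequency adjustment
    cap: int             # maximum impressions per user in the cap window


def select_campaign(
    campaigns: List[Campaign],
    matches_targeting: Callable[[Campaign], bool],
    lookup_count: Callable[[Campaign], int],  # the expensive per-user lookup
    multiplier: Callable[[int], float],       # impression count -> revenue factor
    n: int,
    assumed_count: int = 0,                   # step 3 heuristic: assume no impressions
) -> Optional[Campaign]:
    # Steps 1-2: start from all campaigns and filter by targeting criteria.
    eligible = [c for c in campaigns if matches_targeting(c)]
    # Steps 3-4: rank on a frequency-blind estimate and keep the top n.
    blind = lambda c: c.base_revenue * multiplier(assumed_count)
    shortlist = sorted(eligible, key=blind, reverse=True)[:n]
    # Step 5: look up actual counts for the shortlist only; drop capped campaigns.
    counts = {c.name: lookup_count(c) for c in shortlist}
    uncapped = [c for c in shortlist if counts[c.name] < c.cap]
    # Steps 6-7: re-score with actual counts and select the best.
    actual = lambda c: c.base_revenue * multiplier(counts[c.name])
    return max(uncapped, key=actual, default=None)


campaigns = [Campaign("a", 5.0, 1), Campaign("b", 4.0, 3),
             Campaign("c", 3.0, 3), Campaign("d", 1.0, 3)]
history = {"a": 1, "b": 2, "c": 0, "d": 0}
lookups = []


def lookup(c):
    lookups.append(c.name)  # record which campaigns needed a lookup
    return history[c.name]


winner = select_campaign(campaigns, lambda c: True, lookup,
                         lambda k: 1.0 / (1 + k), n=3)
print(winner.name, lookups)  # c ['a', 'b', 'c']
```

Only the n shortlisted campaigns trigger a lookup, so campaign "d" is never queried in this run.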
>  
> Or, even better, maximize expected revenue across the n candidate campaigns:
>  
> 1) Begin with the set of all campaigns.
>  
> 2) Filter by targeting criteria.
>  
> 3) Construct a portfolio of n campaigns that maximizes portfolio-wide expected revenue, that is, the expected value of the greatest individual campaign revenue in the portfolio.  An implementation might use trivial heuristics (e.g. choose n/3 ads assuming no impressions, n/3 ads assuming average impressions, and n/3 ads assuming loads of impressions) or something more sophisticated (e.g. use a probability distribution of impression frequencies).
>  
> 4) Filter by frequency capping.
>  
> 5) Assign a new expected revenue to each campaign using actual impression frequency counts.
>  
> 6) Select the campaign with greatest expected revenue.
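
A minimal sketch of the trivial n/3 heuristic from step 3 of this second version follows.  Everything specific here is an assumption made for illustration: the geometric decay model for how revenue falls with repeated impressions, the stand-in counts (0, 3, 10) for "no", "average", and "loads of" impressions, and the campaign names and numbers.

```python
from typing import Dict, List, Sequence, Tuple

# name -> (base revenue, decay); assumed model: revenue after k impressions
# is base * decay ** k.
Campaigns = Dict[str, Tuple[float, float]]


def revenue(campaigns: Campaigns, name: str, k: int) -> float:
    base, decay = campaigns[name]
    return base * decay ** k


def build_portfolio(campaigns: Campaigns, n: int,
                    assumed_counts: Sequence[int] = (0, 3, 10)) -> List[str]:
    """Pick n // len(assumed_counts) campaigns under each assumed count."""
    per_scenario = n // len(assumed_counts)  # any remainder is simply left unused
    chosen: List[str] = []
    for k in assumed_counts:
        ranked = sorted(campaigns, key=lambda name: revenue(campaigns, name, k),
                        reverse=True)
        # Skip campaigns already picked under an earlier assumption.
        fresh = [name for name in ranked if name not in chosen]
        chosen.extend(fresh[:per_scenario])
    return chosen


campaigns = {
    "burst": (10.0, 0.5),    # pays well at first, wears out quickly
    "mid": (6.0, 0.8),
    "steady": (4.0, 0.95),   # pays less, wears out slowly
    "low": (1.0, 1.0),       # insensitive to frequency
}
print(build_portfolio(campaigns, n=3))  # ['burst', 'steady', 'low']
```

The portfolio then goes through the cap filter and re-scoring in steps 4 through 6, with actual counts looked up only for its n members.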
>  
> First, I think it's easy to figure out statistically whether your algorithm produces an optimal outcome. Given 10,000 eligible campaigns and a bell curve distribution of the revenue each pays, sample n campaigns. I'm not a stats guy but I'm sure you could figure out what n would need to be to give you a decent chance that the best campaign is in there. I'm assuming you take the n highest prices pre-modifier as you suggest. The multiplier on frequency/recency has high variance - probably ranges from 0.1 to 3.0 - so we'd have to factor that in. As with my answer above, the higher n is, the higher the cost - I think in this case it's probably prohibitive if n is much more than 10.

The algorithm isn't guaranteed to select the campaign with greatest expected revenue.  But that seems just fine, so long as the marginal impact on total revenue isn't significant.  In other words, the algorithm should be acceptable if it usually picks a top or very-close-to-top campaign.
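
One way to put numbers on "usually picks a top or very-close-to-top campaign" is a small Monte Carlo run along the lines suggested above.  The sketch below rests on assumptions that are mine, not measured data: prices drawn from a normal distribution (mean 1.0, standard deviation 0.3), a multiplier drawn uniformly from 0.1 to 3.0 and independent of price, and no cap filter.  It reports how often the true best campaign lands in the blind top n, and what fraction of the best available revenue the shortlist captures on average.

```python
import heapq
import random


def simulate(num_campaigns=10_000, n=10, trials=100, seed=0):
    """Return (hit rate, mean revenue ratio) for a frequency-blind top-n shortlist."""
    rng = random.Random(seed)
    hits = 0
    ratio_sum = 0.0
    for _ in range(trials):
        # Bell-curve prices (assumed mean 1.0, sd 0.3), floored to stay positive.
        base = [max(0.01, rng.gauss(1.0, 0.3)) for _ in range(num_campaigns)]
        # Actual revenue = price x frequency/recency multiplier in [0.1, 3.0].
        actual = [b * rng.uniform(0.1, 3.0) for b in base]
        # Shortlist the n highest prices without looking at the multiplier.
        shortlist = heapq.nlargest(n, range(num_campaigns), key=base.__getitem__)
        best_overall = max(actual)
        best_shortlisted = max(actual[i] for i in shortlist)
        hits += best_shortlisted == best_overall
        ratio_sum += best_shortlisted / best_overall
    return hits / trials, ratio_sum / trials


rate, ratio = simulate(num_campaigns=2_000, n=10, trials=50)
print(f"best campaign shortlisted in {rate:.0%} of trials; "
      f"mean revenue ratio {ratio:.2f}")
```

The second number is the one that matters for the argument here: the hit rate can be low while the revenue ratio stays high.  Real price and multiplier distributions would have to come from an ad company's own data.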
> > Note that from a privacy perspective, you should be very pleased that our optimization algorithm doesn't use any historical user data or behavioral segmentation (though our clients could explicitly target their own behavioral data outside the algorithm).
> >  
> >  
> >  
>  
> With the caveat that other participants assuredly disagree: I have no objection to behavioral targeting.  I and other researchers have, in fact, worked to develop practical, privacy-preserving approaches (see http://33bits.org/2012/06/11/tracking-not-required-behavioral-targeting/).  My concern is the collection of a user's browsing history by an organization he or she has no relationship with.  It seems totally backwards to me to get rid of a practice that might have some marginal economic value (behavioral targeting) and keep the practice that imposes serious privacy risks (ID cookies and equivalents).
> > I'd be happy to answer any other questions - this is a personal passion of mine!
> >  
> > Brian
> >  
> >  
> > From: Chris Mejia <chris.mejia@iab.net>
> > To: Jonathan Mayer <jmayer@stanford.edu>, David Wainberg - NAI <david@networkadvertising.org>
> > Cc: W3C DNT Working Group Mailing List <public-tracking@w3.org>, Brendan Riordan-Butterworth <Brendan@iab.net>, Mike Zaneis <mike@iab.net>, Brian O'Kelley <bokelley@appnexus.com>
> > Subject: Re: Frequency Capping
> >  
> > Jonathan,
> >  
> > Frequency capping (f-capping) is usually a contractual obligation for the party responsible for delivering the ad (an ad network, a publisher, an exchange, etc.) and is almost always required by the advertiser in insertion orders (the insertion order or "IO" is the contract between the parties).  It looks like your assumption below is that f-capping is (only) a 'tactic' to increase ROI for performance campaigns.  While this is sometimes true (yet mostly not), it's actually rarely the real motivation for f-capping.  The requirement to f-cap the delivery of a campaign to users is generally imposed contractually by the advertiser, for several good reasons, but most importantly to avoid annoying the user with multiple servings of the same ad creative, over and over again in one time frame (e.g. in a 24-hour time period).
> >  
> > As f-capping is generally contractually obligated, it's not up to the deliverer of the ad to CHOOSE which campaigns to f-cap; it's a REQUIREMENT to f-cap all campaigns where contractually obligated to do so.  F-capping has happened in television advertising for many years.  Imagine how annoying it is when the same TV ad spot plays over and over again (in fact this happens, and I'm sure we all find it annoying).
> >  
> > To sum up, while f-capping can sometimes increase ROI for advertisers (it's not necessarily always true), it is most often contractually obligated (per the Insertion Order).  The primary motivation for f-capping is to not annoy the user with repeated serving of the same ad creative during a time period.  In my experience, the vast majority of f-capping is set at 1:24 or 2:24, etc. (restricting the showing of a particular ad creative to 1 time in 24 hours, or 2 times in 24 hours).
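
For concreteness, a 1:24 or 2:24 cap as described above can be sketched as a per-user, per-creative check.  This is only an illustration under stated assumptions: the `FrequencyCap` class and its method names are hypothetical, the 24 hours are treated as a rolling window rather than a calendar day, and state is kept in memory rather than in a cookie or userdata store.

```python
from collections import deque
from typing import Deque, Dict, Tuple


class FrequencyCap:
    """Allow at most `limit` impressions of a creative per user per window."""

    def __init__(self, limit: int, window_seconds: float = 24 * 3600):
        self.limit = limit
        self.window = window_seconds
        # (user, creative) -> timestamps of impressions still inside the window
        self._seen: Dict[Tuple[str, str], Deque[float]] = {}

    def try_serve(self, user: str, creative: str, now: float) -> bool:
        times = self._seen.setdefault((user, creative), deque())
        # Forget impressions that have aged out of the rolling window.
        while times and now - times[0] >= self.window:
            times.popleft()
        if len(times) >= self.limit:
            return False  # capped: do not serve this creative
        times.append(now)
        return True


cap = FrequencyCap(limit=2)  # a "2:24" cap
hour = 3600
results = [cap.try_serve("user-1", "creative-1", t * hour) for t in (0, 1, 2, 25)]
print(results)  # [True, True, False, True]
```

The third request is refused because two impressions already fall within the preceding 24 hours; by hour 25 both have aged out.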
> >  
> > I hope this helps clarify the motivation for f-capping and leads to mutual appreciation for the need.
> >  
> > Kind Regards,
> >  
> > Chris
> >  
> > Chris Mejia | Digital Supply Chain Solutions | Ad Technology Group | Interactive Advertising Bureau - IAB
> >  
> >  
> > From: Jonathan Mayer <jmayer@stanford.edu>
> > Date: Tue, 10 Jul 2012 14:26:12 -0700
> > To: David Wainberg - NAI <david@networkadvertising.org>
> > Cc: W3C DNT Working Group Mailing List <public-tracking@w3.org>
> > Subject: Re: Frequency Capping
> > Resent-From: W3C DNT Working Group Mailing List <public-tracking@w3.org>
> > Resent-Date: Tue, 10 Jul 2012 21:26:46 +0000
> >  
> > I'd sure like to hear more from advertising industry participants about how frequency capping integrates into advertisement selection.  The AppNexus approach, if I read correctly, goes roughly as follows:
> >  
> > 1) Begin with the set of all campaigns.
> >  
> > 2) Filter by targeting criteria.
> >  
> > 3) Filter by frequency capping.
> >  
> > 4) Assign an expected revenue to each campaign.
> >  
> > 5) Select the campaign with greatest expected revenue.
> >  
> > The approach includes testing the frequency cap of every campaign that matches targeting criteria.  What about, instead, only testing the cap for a subset of those campaigns:
> >  
> > 1) Begin with the set of all campaigns.
> >  
> > 2) Filter by targeting criteria.
> >  
> > 3) Assign an expected revenue to each campaign.
> >  
> > 4) Select the n campaigns with greatest expected revenue.
> >  
> > 5) Filter by frequency capping.
> >  
> > 6) Select the campaign with greatest expected revenue.
> >  
> > Some relevant empirical questions include: How often are the highest revenue campaigns frequency capped?  How well can an ad company predict which high-revenue campaigns will and won't be frequency capped?
> >  
> > Jonathan
> >  
> >  
> > On Monday, July 9, 2012 at 11:34 AM, David Wainberg wrote:
> >  
> > > Hi All,
> > >  
> > > In case you haven't seen it already, I recommend Prof. Felten's excellent blog on "Privacy by Design: Frequency Capping." Please also read Brian O'Kelley's post in the comment section explaining what he sees as the technical hurdles for these alternative frequency capping methods. (I may be wrong, but I think Brian is a former student of Prof. Felten.) This kind of detailed technical discussion of these proposals seems very helpful. First, it helps us set reasonable expectations on all sides. Second, and more interesting to me, is that maybe we can have more discussion and collaboration on bringing these sorts of things to production.  
> > >  
> > > http://techatftc.wordpress.com/2012/07/03/privacy-by-design-frequency-capping/
> > >  
> > > -David
> >  
>
Received on Friday, 13 July 2012 20:40:03 UTC