Re: discussion on FLOC performance?

Arnaud,

Thank you for your interest in our FLoC experiment. I understand that you
have a lot of questions and comments regarding our work; I am happy to
answer them in our next W3C meeting. Below are responses to some of your
questions.

The reported FLoC experiments were conducted with the algorithms described
in the whitepaper linked below:

https://github.com/google/ads-privacy/raw/master/proposals/FLoC/FLOC-Whitepaper-Google.pdf

As stated in the whitepaper, the FLoC algorithms were designed with the
following principles in mind:

“
   1. The cohort id should prevent individual cross-site tracking.

   2. A cohort should consist of users with similar browsing behavior.

   3. Cohort assignments should be unsupervised algorithms, since each
      provider has their own optimization function.

   4. A cohort assignment algorithm should limit the use of “magic numbers”.
      That is, its parameter choice should be clearly and easily explained.

   5. Computing an individual’s cohort should be simple. This allows for it
      to be implemented in a browser with low system requirements.”


In other words, the algorithms are designed to be as generalized and widely
applicable as possible.
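
To make principles 3 and 5 concrete, below is a minimal sketch of
SimHash-style cohort assignment over a user's set of visited domains, one
of the approaches evaluated in the whitepaper. The feature encoding, the
8-bit id length, and the function names are illustrative assumptions, not
the implementation used in our experiments:

    import hashlib

    NUM_BITS = 8  # illustrative cohort-id length, not the experiment's value

    def simhash_cohort(domains, num_bits=NUM_BITS):
        # Each visited domain casts a pseudo-random +1/-1 vote for every
        # output bit; the sign of each bit's total becomes that bit of the
        # cohort id, so similar histories tend to share an id.
        totals = [0] * num_bits
        for domain in domains:
            digest = hashlib.sha256(domain.encode("utf-8")).digest()
            for bit in range(num_bits):
                vote = 1 if (digest[bit // 8] >> (bit % 8)) & 1 else -1
                totals[bit] += vote
        return sum(1 << bit for bit in range(num_bits) if totals[bit] > 0)

    # Hypothetical histories; overlapping ones often land in the same cohort:
    print(simhash_cohort({"news.example", "weather.example"}))
    print(simhash_cohort({"news.example", "weather.example", "mail.example"}))

Note that nothing above is supervised or tuned to any vendor's labels, and
the per-user computation is a handful of hashes, which is what allows it to
run inside the browser.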

The algorithms were also tested on two publicly available data sets and
one proprietary Ads data set:

   1. Million Song Dataset
   2. MovieLens dataset
   3. Google proprietary Ads data


The algorithms described and tested in the above paper were used to run an
A/B test. As per the explainer
<https://github.com/google/ads-privacy/blob/master/proposals/FLoC/Floc-live-experiments.md>,
the 95% number was tested on Google’s audience taxonomy. While this 95%
number may be unique to the Google Ads production system (as would be the
case with any A/B experiment), it’s worth re-emphasizing that the FLoC
algorithms are general and unsupervised, and hence are not expected to
perform significantly better or worse for any specific vendor’s taxonomy.
The k-anonymity threshold enforced in the experiment required a minimum of
500 users per cohort. It should be noted that this number was based on
constructing/simulating FLoCs from Google Ads' 3P-cookie-based system.
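
As a rough illustration of how such a threshold can be enforced, the sketch
below simply withholds any cohort id shared by fewer than 500 simulated
users; whether small cohorts were suppressed, merged, or re-clustered in
the actual experiment is a detail this sketch glosses over:

    from collections import Counter

    K_MIN = 500  # minimum users per cohort enforced in the experiment

    def enforce_k_anonymity(user_cohorts, k=K_MIN):
        # user_cohorts maps user id -> cohort id. Count cohort sizes and
        # withhold (None) any cohort id shared by fewer than k users.
        sizes = Counter(user_cohorts.values())
        return {user: cohort if sizes[cohort] >= k else None
                for user, cohort in user_cohorts.items()}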

While some media reports omitted additional context, the original blog
post stated: "Our tests of FLoC to reach in-market and affinity Google
Audiences show that advertisers can expect to see at least 95% of the
conversions per dollar spent when compared to [3P] cookie-based advertising."
In other words, we were testing the effectiveness of FLoC for in-market and
affinity ad targeting within Google's ad network, as we expect other APIs
to be available to address other use cases.
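
For concreteness, the headline metric is simply the ratio of conversions
per dollar between the two experiment arms; the figures below are made up
for illustration and are not results from the experiment:

    def conversions_per_dollar(conversions, spend_usd):
        return conversions / spend_usd

    # Hypothetical arm totals, not measured data:
    floc_cpd = conversions_per_dollar(conversions=950, spend_usd=10_000.0)
    cookie_cpd = conversions_per_dollar(conversions=1_000, spend_usd=10_000.0)
    print(f"FLoC retained {floc_cpd / cookie_cpd:.0%} of cookie-based "
          "conversions per dollar")  # -> 95%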



We encourage more discussion regarding evaluation methodology. In addition,
Google Ads is planning to work with DSP partners to test a number of
promising FLoC algorithms via an experimental framework
<https://ads-developers.googleblog.com/2021/01/announcing-new-real-time-bidding.html>.



Deepak

Google Ads


On Mon, Feb 15, 2021 at 2:01 AM Arnaud Blanchard <
arnaud.blanchard.87@gmail.com> wrote:

> Hi Wendy and group,
> We would like to put the discussion of FLoC's performance level on the agenda, ideally to get:
> - More details about the test objectives (metrics, dimensions, etc.)
> - More technical details about the creation of FLoCs (FLoC size; detailed FLoC assignment methodology)
> - Clarification of the external communication
> Two sessions ago, many interesting points were raised around Google's recent communication pieces about FLoC's performance compared to 3rd-party cookies. In particular, the idea that FLoC would retain 95% of the performance brought by third-party cookies seemed to draw particular attention from the community. Someone from Google Ads then said that they would share, with some details, the analysis this figure was drawn from.
> If I am not mistaken, the only publication so far consists of this brief explainer:  https://github.com/google/ads-privacy/blob/master/proposals/FLoC/Floc-live-experiments.md. It does bring some clarifications, such as the fact that the experiment concerned a very narrow use case ('audience targeting' based on Google's taxonomy), with some others explicitly out of scope (remarketing, other vendors' taxonomies).
> However, despite Google's reassurance that the A/B test was conducted in the best conditions and with sound analytical methodology - which we have no reason to doubt - we are still missing many details that would allow everyone to understand how FLoC would impact their products (FLoC sizes, proprietary taxonomy impact, etc.).
> Stating that 95% of the performance is preserved, without stating the particular use case it was measured against, implies that Google Chrome's FLoC will be considered good as long as it allows emulation of Google Ads' proprietary taxonomy. I hope this is not the case, and I assume this was not the intention of the analysis, but that is what it looks like.
> All in all, this 95% number is an overstatement that conveys a misleading idea to the public. In my opinion, the whole group, and the FLoC project itself, would seriously benefit from a broader, more detailed clarification.
> Thank you very much in advance,
> Arnaud Blanchard
>
>

Received on Thursday, 18 February 2021 20:35:13 UTC