Re: issue-25 from Ronan Heffernan on 2013-07-22 (public-tracking@w3.org from July 2013)

From: Ronan Heffernan <ronansan@gmail.com>
Date: Mon, 22 Jul 2013 07:18:13 -0400
To: "Mike O'Neill" <michael.oneill@baycloud.com>
Cc: Tracking Protection Working Group WG <public-tracking@w3.org>
Message-ID: <CAHyiW9L0L_PqcLAp6HZaU-2X4iAE6QHY_sK==zLbWoP2u8+V6g@mail.gmail.com>

Mike,
   There can't be a Last-Modified value to send back (as
If-Modified-Since) except for objects that the browser has cached.  If we
disable the mechanisms that prevent caching (e.g. cache-buster params), then
proxies (corporate, ISP, etc.) will be able to serve the pixels out of
their own caches.  You are also depending on the browsers to still have
that object in their cache; otherwise we will be counting the user multiple
times.  This also has to work with mobile browsers and embedded user
agents, which may have tiny caches (or no cache), and which often have
unusual behavior around Last-Modified, ETag, and/or caches surviving a
power-cycle.
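
   To make the over-counting concern concrete, here is a minimal Python
sketch.  It is not the code behind Mike's demo; the PixelServer class, the
one-day UNIQUE_WINDOW, and the simulate() helper are all assumptions made
up for illustration.  The toy server counts a visitor as new whenever the
request carries no usable previous-visit time, which is all it can do
without an identifier.

```python
from datetime import datetime, timedelta

# Hypothetical window: a repeat request inside this period is not a new visit.
UNIQUE_WINDOW = timedelta(days=1)


class PixelServer:
    """Toy server that counts unique visitors from conditional requests only."""

    def __init__(self):
        self.unique_visitors = 0

    def handle(self, if_modified_since, now):
        """Return (status, last_modified) for one pixel request.

        if_modified_since is None when the browser has no cached copy.
        """
        if if_modified_since is not None and now - if_modified_since < UNIQUE_WINDOW:
            return 304, if_modified_since  # recent visitor, not counted again
        self.unique_visitors += 1          # no cached copy (or a stale one): counted as new
        return 200, now


def simulate(cache_survives):
    """Same device requests the pixel three times, one hour apart."""
    server = PixelServer()
    cached = None                          # the browser's stored Last-Modified value
    start = datetime(2013, 7, 22, 9, 0)
    for hour in range(3):
        _status, last_modified = server.handle(cached, start + timedelta(hours=hour))
        cached = last_modified if cache_survives else None
    return server.unique_visitors


with_cache = simulate(cache_survives=True)      # one device, counted once
without_cache = simulate(cache_survives=False)  # one device, counted on every request
```

   Under these assumptions a device whose cache is evicted (or never
existed) between requests is indistinguishable from a new visitor each
time.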

   This mechanism also only works if the pixel has the exact same URL.
That means that you can't point to the same pixel from multiple ad networks
or ad servers with different macro values (yes, this is done today and will
be required going forward), and you can't use both HTTP and HTTPS.  You
also can't provide unique counts for a campaign and a creative, since you
will only be able to pass one of those IDs or the other without getting
multiple over-counts.
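
   The exact-URL dependency can be sketched the same way.  The snippet
below assumes, as a simplification, that a browser cache is keyed on the
full URL including scheme and query string; the pixel.example host and the
campaign/creative parameter names are hypothetical.

```python
from urllib.parse import urlsplit


def cache_key(url):
    """Assumed cache key: scheme, host, path and query all participate."""
    parts = urlsplit(url)
    return (parts.scheme, parts.netloc, parts.path, parts.query)


# Four requests from ONE device for what is logically the same pixel.
requests_seen = [
    "http://pixel.example/p.gif?campaign=42",
    "https://pixel.example/p.gif?campaign=42",            # same pixel over HTTPS
    "http://pixel.example/p.gif?campaign=42&creative=7",  # different macro values
    "http://pixel.example/p.gif?campaign=42",             # exact repeat of the first
]

cache = {}
counted_as_new = 0
for url in requests_seen:
    key = cache_key(url)
    if key not in cache:
        # No cached entry, so no validator is sent and the server sees a "new" visitor.
        cache[key] = "cached-validator"
        counted_as_new += 1
```

   Only the exact repeat hits the cache here, so the single device is
counted as three visitors.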

   One of the services that this kind of measurement provides is a
double-check that frequency-capping rules are working.  This service
requires knowing how many times each device was exposed to an ad and/or
campaign.  If the ad contract calls for a 7-times frequency cap on each
creative and a 14-times frequency cap on the whole (6 creative) campaign,
on a 6-month campaign, then the advertiser wants independent verification
that that clause was honored.  That requires having counts per device, not
just "Hey, I think you're new."
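
   As a rough illustration of what that verification needs, the Python
sketch below checks the 7-per-creative and 14-per-campaign caps from the
example above.  The exposure log format, the device and creative labels,
and the cap_violations() function are assumptions for illustration only;
the point is that the check cannot be written without some per-device key.

```python
from collections import Counter

CREATIVE_CAP = 7    # max exposures per device per creative (from the example contract)
CAMPAIGN_CAP = 14   # max exposures per device across the whole campaign


def cap_violations(exposures):
    """exposures: iterable of (device_id, creative_id) pairs for one campaign.

    Returns two dicts: creatives over cap per device, and devices over the campaign cap.
    """
    exposures = list(exposures)
    per_creative = Counter(exposures)
    per_campaign = Counter(device for device, _creative in exposures)
    creative_over = {k: n for k, n in per_creative.items() if n > CREATIVE_CAP}
    campaign_over = {d: n for d, n in per_campaign.items() if n > CAMPAIGN_CAP}
    return creative_over, campaign_over


# Hypothetical log: dev-a sees creative c1 eight times and c2 seven times.
log = [("dev-a", "c1")] * 8 + [("dev-a", "c2")] * 7 + [("dev-b", "c1")] * 3
creative_over, campaign_over = cap_violations(log)
```

   With this made-up log, dev-a breaches both the creative cap (8 > 7 on
c1) and the campaign cap (15 > 14), while dev-b is within limits.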

   This is an interesting idea, and if industry can find ways to work
around the problems, and can prove that this mechanism works better, then
you can expect that they will adopt it.  Until that is able to happen,
there is no way to sign off on such an unproven mechanism, especially in
light of the apparent problems.

--ronan

On Mon, Jul 22, 2013 at 4:20 AM, Mike O'Neill
<michael.oneill@baycloud.com> wrote:

> Ronan,
>
> You keep your own counts for ad impressions; they would not need to be
> held in the browser, and it is up to the server how caching is
> implemented.  The ETag counts would be useful for frequency capping, though
> you may not need to do that.
>
> Here is a demo of unique visitor detection (and low-entropy user-agent
> specific counting) using cache headers I have knocked together. There is no
> sharing/use of persistent unique identifiers, and no cookies.
>
> A new visit is communicated to the server by letting it know the time of
> the previous visit so the server can update its aggregated counts. There is
> no need to communicate a unique identifier so the individual cannot be
> tracked.
>
> http://cloudclinic.com/HiImDory
>
> The duration between unique visits is set to 10 seconds to keep it simple,
> but of course it could be any length of time.
>
> Mike
>
> *From:* Ronan Heffernan [mailto:ronansan@gmail.com]
> *Sent:* 19 July 2013 22:02
> *To:* Mike O'Neill; Tracking Protection Working Group WG
> *Subject:* Re: issue-25
>
> Mike,
>
> Having the top-level page in the URL parameters means that the
> incrementing ETag won't be incremented if the same ad is seen across sites,
> since the browsers will cache a separate copy for each site (since nearly
> all caching is URL-based)?  BTW, we also issue a no-cache header, and add a
> cache-busting parameter, to keep the browsers and intermediate proxies from
> supplying the pixels out of cache.  These are standard practices.  That
> means that the browsers should not be caching the pixels or the current
> ETag value, so there will be no value for the webservers to increment.
>
> If you are right that these techniques offer a superior way to function in
> an environment that involves 3rd-party cookie blocking, then industry will
> adopt them even with the audience measurement permitted use exception.
> However, they do not seem to provide the functionality that we need.
>
> --ronan
>
> On Fri, Jul 19, 2013 at 3:45 PM, Mike O'Neill <michael.oneill@baycloud.com>
> wrote:
>
> Hi Ronan,
>
> If you read my first example, I had a third-party element addressed via a
> URI containing the first-party page in a query parameter, making the
> exchange specific to the top-level page. It is not very complicated and it
> works.
>
> In addition to being DNT compliant, it also avoids the default third-party
> cookie blocks which are becoming very common.
>
> Mike
>
> *From:* Ronan Heffernan [mailto:ronansan@gmail.com]
> *Sent:* 19 July 2013 20:02
> *To:* Mike O'Neill
> *Cc:* Tracking Protection Working Group WG
> *Subject:* Re: issue-25
>
> Mike,
>
>    I call this improper because ETags already have a purpose and
> semantics.  If I understand you correctly, we would have to use the exact
> same URL, so that the browser would use the ETag value that it cached.
> This means that we could no longer use "cache-busting" parameters, which
> means that intermediary proxies could serve the content, which destroys
> audience measurement.  I understand the desire for really complicated,
> unproven solutions, but none of the ones that I have heard so far seem
> likely to work.  We have a solution that works, and is well proven.
>
> --ronan
>
> On Fri, Jul 19, 2013 at 12:30 PM, Mike O'Neill
> <michael.oneill@baycloud.com> wrote:
>
> Hi Ronan,
>
> No, not a unique identifier, which I agree would diminish privacy and
> should be ruled out along with any other tracking identifier collection
> when DNT is 1. What I meant was a count value (number of ad impressions)
> which I assume would have limited entropy, i.e. the max value would be <<
> the number of online individuals in scope. How many ad impressions would
> you need to count? I agree relying on the cache for 6 months would be a
> stretch, but do you need to do that? At some point there may be some loss
> of functionality when DNT is 1 but the setting is an important indication
> of user intent so needs to be honoured.
>
> How an ETag is generated is not specified in the HTTP spec, so in what way
> would this be “improper”?
>
> Mike
>
Received on Monday, 22 July 2013 11:19:02 UTC