Re: [Issue-5] [Action-77] Defining Tunnel-Vision 'Do Not (Cross-Site) Track'

From: Roy T. Fielding <fielding@gbiv.com>
Date: Tue, 7 Feb 2012 15:22:26 -0800
Cc: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-Id: <FE1EA1F7-C098-4F91-ADC9-29492ADA1656@gbiv.com>
To: Lauren Gelman <gelman@blurryedge.com>

On Feb 6, 2012, at 9:17 PM, Lauren Gelman wrote:

> Can you give me an example of a 3rd party site that needs referer info for billing/audit/fraud?  

Any site that collects or expects money for impression-based advertising
and any third-party auditor that uses subrequests to independently monitor
page deliveries.

Keep in mind that a site doesn't know whether the current request is,
from the client perspective, a user-initiated first-party click, a
fraudulent "third-party" subrequest placed on some popular site's
unrelated page, or a valid ad placement that is merely counting
impressions.  The distinction between third-party and first-party
might be part of detecting fraud, but doesn't prevent fraudsters
from making third-party requests to an ad that was intended to be
first-party or from making third-party requests from sites other
than the one intended for the ad.

One of the simplest fraud detection techniques is to compare the referral
data sent in the request URI (which may include a hash or campaign id
that can be used to validate referral sites) with the URI present
in the Referer header field. 
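
As a rough illustration of that comparison, here is a minimal sketch.
The query parameter names (`cid`, `sig`), the shared secret, and the
choice of an HMAC over campaign id plus host are all assumptions made
for this example, not a description of how any particular ad network
validates referrals.

```python
import hashlib
import hmac
from typing import Optional
from urllib.parse import parse_qs, urlsplit

# Hypothetical secret known only to the ad server (an assumption for this sketch).
SECRET = b"hypothetical-shared-secret"


def site_token(campaign_id: str, host: str) -> str:
    """Hash issued with the ad tag, binding a campaign id to one referring host."""
    msg = f"{campaign_id}|{host.lower()}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()


def referral_matches(request_uri: str, referer: Optional[str]) -> bool:
    """Compare referral data in the request URI with the Referer header field.

    Returns True only when the hash in the URI was issued for the host
    named in Referer. A missing Referer yields False, meaning "could not
    be validated", which is not by itself proof of fraud.
    """
    if not referer:
        return False
    query = parse_qs(urlsplit(request_uri).query)
    campaign = query.get("cid", [""])[0]
    token = query.get("sig", [""])[0]
    host = urlsplit(referer).hostname or ""
    expected = site_token(campaign, host)
    # Constant-time comparison of the supplied and expected hashes.
    return hmac.compare_digest(token.encode(), expected.encode())


# Usage: an impression request whose hash was issued for www.example.com.
uri = "/ad/impression?cid=spring12&sig=" + site_token("spring12", "www.example.com")
ok = referral_matches(uri, "https://www.example.com/page.html")
mismatch = referral_matches(uri, "https://fraud.example.net/")
```

A request that fails this check would be flagged for review rather than
rejected outright, since Referer can be absent for legitimate reasons.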

Note that this is a trivial example of fraud detection.  There are
more sophisticated attacks in common use that require more sophisticated
tracking to detect, and there is simply no way that the industry will
allow "DNT: 1" to be a let-me-do-anything-I-want card, because then
every fraudster would send "DNT: 1".  Hence, the requirements have
to be on limiting use and time-limiting retention.

Another example is a first-party site that sells concert tickets
relying on multiple tracking mechanisms (both same-site and third-party)
to detect not only that the user is not a banned reseller but also
that the user is using a "normal" browser (not one that has been
specifically designed to grab and hold a set of tickets and then
re-market them on a secondary site until the hold expires).

> Referrer data is used to tell me where a user is coming from.  If I'm Macys and a DNT:1 user arrives on my site because they clicked on an ad on NYT.com then I am a first party.  I get to know referrer info and can credit NYT with the click.
> What is the use case where I'm a third party and I need to know where a user is coming from?  If I'm a Macys ad just sitting on NYT, and a DNT:1 user visits the site, why would referrer info [where the person was prior to arriving at NYT] be passed to me?

Sorry, that's me being unclear.  The referral data in the ad's case is
Macy's website, not where the user came from before Macy's.  It is
important to know that this ad was seen on Macy's site.

>  If I am an ad server, why do I need that info to do an audit?  They can't sell an ad into that spot based on where the user came from for a DNT:1 user, right?

I don't know if we've restricted that yet, but in any case I was talking
about the info necessary to confirm that this ad was placed on Macy's
site (and probably a specific page of Macy's site), independent of where
the user came from before.  Those ad placements are common.  Unfortunately,
the term "referral" has two distinct meanings there.  The ad industry probably
has a better term for the site of a premium ad placement.

>> We are already limiting data collection to the site operator
>> and data processors contracted by that site, but "site" in
>> that case includes third-party services.
> I am not sure what this means.  I thought "the site" and "third party services" were distinct entities (however they end up being defined).

This thread is about defining the requirements in terms of cross-site
data sharing *instead* of first/third-party distinctions.

Received on Tuesday, 7 February 2012 23:25:42 UTC
