
RE: [Issue-5] [Action-77] Defining Tunnel-Vision 'Do Not (Cross-Site) Track'

From: Shane Wiley <wileys@yahoo-inc.com>
Date: Thu, 9 Feb 2012 13:12:18 -0800
To: David Singer <singer@apple.com>
CC: Lauren Gelman <gelman@blurryedge.com>, "Roy T. Fielding" <fielding@gbiv.com>, John Simpson <john@consumerwatchdog.org>, "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-ID: <63294A1959410048A33AEE161379C8023D0C8ACD75@SP2-EX07VS02.ds.corp.yahoo.com>

You hit the core issue straight away - retention and dissemination of raw data to serve the operational purpose exceptions.  While the end-points I described below do result in aggregate and anonymous outcomes, we need to retain the raw data to build those results (at different intervals: real-time, daily, weekly, monthly, quarterly, annual).  In some cases we compound aggregation (build annual aggregations from quarterly aggregations) where we can do so without compromising the data, but there are reporting uses where the raw data is retained for a more accurate result (financial reporting, for example).
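
As a rough illustration of the compound aggregation mentioned above (annual totals built from quarterly ones), here is a minimal Python sketch.  The domain names and counts are made up, and roll_up is a hypothetical helper, not a description of any production system.  It works only because simple counts add up across periods; a measure that does not add up this way (a count of distinct visitors, say) would still have to be recomputed from the raw data.

```python
from collections import Counter

def roll_up(period_totals):
    """Sum per-domain counts from several finer-grained aggregates."""
    total = Counter()
    for counts in period_totals:
        total.update(counts)  # Counter.update adds counts, it does not overwrite
    return dict(total)

# Hypothetical quarterly impression counts per publisher domain.
quarterly = [
    {"news.example": 120, "shop.example": 40},
    {"news.example": 95},
    {"news.example": 110, "shop.example": 55},
    {"shop.example": 60},
]

annual = roll_up(quarterly)
print(annual)
```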

- Shane

-----Original Message-----
From: David Singer [mailto:singer@apple.com] 
Sent: Thursday, February 09, 2012 11:44 AM
To: Shane Wiley
Cc: Lauren Gelman; Roy T. Fielding; John Simpson; public-tracking@w3.org (public-tracking@w3.org)
Subject: Re: [Issue-5] [Action-77] Defining Tunnel-Vision 'Do Not (Cross-Site) Track'


thanks, this helps.

On Feb 8, 2012, at 22:59 , Shane Wiley wrote:

> Lauren,
> 3rd parties, such as ad networks, are required to retain referrer information to demonstrate "Ad Quality" placement to advertisers.  This supports multiple critical business operations:
> - Placement Accuracy:  Advertisers are paying the ad network to only show their ads on respectable publishers and want confirmation this is occurring (this data is typically provided to advertisers in an aggregate report - aggregated on the domain dimension with impression/click counts - no individual user data)

But a record that says "the dishwasher ad was shown on the sears website at 2:42am" doesn't identify a user at all, so it's out of our scope.  Even if it goes on to say "to a person in <audience bucket X>" it probably doesn't (e.g. a white female in her 30s living in the western states).

> - Fraud:  Some publishers attempt to hide ad displays and/or participate in multiple ad networks simultaneously to increase revenue but not impact the user experience.  The referrer helps ad networks highlight suspicious publishers for further investigation to determine if they are engaging in these activities.

again, doesn't seem like a link to a user is needed, is it?

If the answer is "it's simpler to write one log record and tease apart the various data uses later", then I understand.  But we probably need to understand how practical "immediate decomposition" is, and how "immediate", "immediate" is.
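
To make "immediate decomposition" concrete, here is a minimal Python sketch, assuming a raw event carries a cookie, an ad id, an event kind and a referrer URL.  All field names, the decompose function and the sample values are hypothetical.  Each event is folded into a per-domain count as it arrives, and the cookie and full URL are never written anywhere.

```python
from collections import Counter
from urllib.parse import urlsplit

def decompose(event, placement_counts):
    """Fold one raw ad event into the aggregate and keep nothing user-linked."""
    # Reduce the referrer to its host; the full URL and the cookie are dropped.
    host = urlsplit(event.get("referrer", "")).hostname or "unknown"
    placement_counts[(event["ad_id"], host, event["kind"])] += 1

counts = Counter()
raw_events = [
    {"cookie": "u1", "ad_id": "dishwasher", "kind": "impression",
     "referrer": "https://shop.example/appliances"},
    {"cookie": "u2", "ad_id": "dishwasher", "kind": "impression",
     "referrer": "https://shop.example/kitchen"},
    {"cookie": "u1", "ad_id": "dishwasher", "kind": "click",
     "referrer": "https://shop.example/appliances"},
]
for ev in raw_events:
    decompose(ev, counts)  # the raw event itself is not stored

for key, n in sorted(counts.items()):
    print(key, n)
```

This covers only the uses that need counts per domain; whether the other uses can be served the same way is exactly the open question.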

> I hope this helps.

surely does.  I appreciate it.

> - Shane
> -----Original Message-----
> From: Lauren Gelman [mailto:gelman@blurryedge.com] 
> Sent: Monday, February 06, 2012 9:18 PM
> To: Roy T. Fielding
> Cc: David Singer; John Simpson; public-tracking@w3.org (public-tracking@w3.org)
> Subject: Re: [Issue-5] [Action-77] Defining Tunnel-Vision 'Do Not (Cross-Site) Track'
> Can you give me an example of a 3rd party site that needs referer info for billing/audit/fraud?  
> Referrer data is used to tell me where a user is coming from.  If I'm Macys and a DNT:1 user arrives on my site because they clicked on an ad on NYT.com then I am a first party.  I get to know referrer info and can credit NYT with the click.
> What is the use case where I'm a third party and I need to know where a user is coming from?  If I'm a Macys ad just sitting on NYT, and a DNT:1 user visits the site, why would referrer info [where the person was prior to arriving at NYT] be passed to me?  If I am an ad server, why do I need that info to do an audit?  They can't sell an ad into that spot based on where the user came from for a DNT:1 user, right?
>> We are already limiting data collection to the site operator
>> and data processors contracted by that site, but "site" in
>> that case includes third-party services.
> I am not sure what this means.  I thought "the site" and "third party services" were distinct entities (however they end up being defined).
> On Feb 2, 2012, at 7:16 PM, Roy T. Fielding wrote:
>> On Feb 2, 2012, at 4:24 PM, Lauren Gelman wrote:
>>> Can you limit the sites who would be required to keep it for audit purposes to only first parties or their service providers?
>> I don't think we can anticipate what sites are required to
>> keep data for auditing purposes, especially since many of
>> the third-party sites are auditors.  Why does it matter,
>> assuming they aren't allowed to share the data or use it
>> operationally (to target or modify responses)?
>> I think it is more effective to place limits on retention
>> in user-identifiable form, since auditors generally do not
>> want to retain the raw data anyway unless it has been detected
>> as likely fraudulent.  Another possibility is to only
>> allow pair-wise retention of referral data, meaning that any
>> user-identifiable data in the record is hashed with something
>> unique to the referring site, or stored separately per site,
>> such that it is difficult to correlate them.  And note that
>> this would only be for sites that *need* to retain this
>> information for billing/auditing/fraud control -- it is not
>> a general exception.
>> We are already limiting data collection to the site operator
>> and data processors contracted by that site, but "site" in
>> that case includes third-party services.  I am assuming that
>> companies like
>> http://www.linkshare.com/
>> are at least capable of siloing data per contract (destination site).
>> I do not know if they do so already.  I doubt that a first party
>> would ever willingly share referral data with anyone else, aside
>> from aggregate forms (like in marketing reports).
>> ....Roy
> Lauren Gelman
> BlurryEdge Strategies
> 415-627-8512
> gelman@blurryedge.com
> http://blurryedge.com
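
The pair-wise retention idea quoted above (user-identifiable data hashed with something unique to the referring site) could look roughly like the following Python sketch.  It assumes a keyed hash (HMAC-SHA256) with one secret per referring site; the thread does not specify a construction, and the site names, keys and pairwise_id function are all hypothetical.

```python
import hashlib
import hmac

def pairwise_id(user_id: str, site_key: bytes) -> str:
    """Derive a per-site pseudonym for a user from a per-site secret."""
    return hmac.new(site_key, user_id.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical per-site secrets; in practice these would be random and kept private.
site_keys = {
    "news.example": b"hypothetical-secret-for-news",
    "shop.example": b"hypothetical-secret-for-shop",
}

on_news = pairwise_id("cookie-123", site_keys["news.example"])
on_shop = pairwise_id("cookie-123", site_keys["shop.example"])

print(on_news == pairwise_id("cookie-123", site_keys["news.example"]))  # stable per site
print(on_news == on_shop)  # records for the two sites do not share an identifier
```

Records for one referring site can still be grouped by user for auditing, while joining records across sites requires knowing both sites' secrets.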

David Singer
Multimedia and Software Standards, Apple Inc.
Received on Thursday, 9 February 2012 21:13:45 UTC
