Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25, ISSUE-31, ISSUE-34, ISSUE-49) from Matthias Schunter on 2012-02-10 (public-tracking@w3.org from February 2012)

From: Matthias Schunter <mts@zurich.ibm.com>
Date: Fri, 10 Feb 2012 20:49:09 +0100
To: public-tracking@w3.org
Message-ID: <4F3574B5.3030704@zurich.ibm.com>
Hi Jonathan,


I actually agree with Shane for V01 of our standard.

We should focus the MUST on use/retention/sharing/...
and we should assume that the actors implementing DNT are
not malicious.

Nevertheless, a SHOULD can be used to provide incentive for
privacy-enhancing technologies. A SHOULD can also state that no more
data shall be collected than needed for the operational purpose.

Once these technologies are commonplace and the 'legacy' cookie-based
technology has lost importance, we can then issue a V02 that upgrades
the SHOULD to a MUST (this would at that point not incur large costs).

btw: Compared to the starting position of "we continue unchanged and
just no longer show targeted ads", this feels like progress to me.

Just my personal 2 cents; feedback is appreciated.


Regards,
matthias

On 2/10/2012 8:26 PM, Shane Wiley wrote:
> I’m open to compromise but need to ensure the outcomes don’t levy
> significant cost and loss of revenue to the online advertising
> industry in the process (sincerely looking for the appropriate
> balance).  I offered that we start at “use-based limitations” for the
> MUST (yes, this means we need to trust good actors) and set new
> technology approaches as SHOULD.  I believe this is a reasonable
> compromise.  Yahoo! (and other industry participants) will immediately
> engage with you and others to begin the design process for privacy
> enhancing technologies to help bring these solutions to market in a
> measured and thoughtful manner – and in a way that all participants
> can easily upgrade their current efforts to embrace.  Big picture:
> large companies and academia work together to develop the baseline
> tech and then provide this as open-source (for example, Apache) to
> mid-size and small companies.
> 
>  
> 
> Our companies are taking on all the cost and disruption the above
> entails – in light of consumer privacy risks that have never been
> proven to be real (yet –> understanding the “technically possible”
> angle) and are immediately addressing all of the issues surrounding
> cross-site profiling data collection and use. 
> 
>  
> 
> How you do not see this as compromise is difficult for me to understand.
> 
>  
> 
> - Shane
> 
>  
> 
> *From:* Jonathan Mayer [mailto:jmayer@stanford.edu]
> *Sent:* Friday, February 10, 2012 12:01 PM
> *To:* Shane Wiley
> *Cc:* Justin Brookman; public-tracking@w3.org
> *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
> ISSUE-31, ISSUE-34, ISSUE-49)
> 
>  
> 
> Shane,
> 
>  
> 
> Your objections in response to this proposal (and earlier discussions
> of privacy-preserving technology) suggest that you will not accept
> *any* deviation from current data collection practices.  That's not
> compromise.
> 
>  
> 
> Jonathan
> 
>  
> 
> On Feb 10, 2012, at 10:56 AM, Shane Wiley wrote:
> 
> 
> 
> Jonathan,
> 
>  
> 
> I appreciate and respect the desire to find a technical solution to
> online identifiers and identification at a rapid clip.  These concepts
> and their related implementations require much deeper thought,
> discussion, design, and ultimately consensus.  Your current proposal
> (on its surface) would be impossible to achieve at our scale in just 6
> months – and would completely halt/disrupt the established product
> roadmap for our ad products (which are working hard to be competitive
> and keep our systems evolving with the marketplace).  It would
> literally take a year or two to go in this direction if we even agreed
> this was an appropriate outcome - which I believe it is not at this
> time but am more than willing to keep the conversation going in a
> different forum.  It’s my opinion that the DNT WG is NOT the
> appropriate forum to determine what is appropriate for online
> identifiers and identification (much more involved effort than this
> isolated conversation).
> 
>  
> 
> - Shane
> 
>  
> 
> *From:* Jonathan Robert Mayer [mailto:jmayer@stanford.edu] 
> *Sent:* Friday, February 10, 2012 11:48 AM
> *To:* Shane Wiley
> *Cc:* Justin Brookman; public-tracking@w3.org
> *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
> ISSUE-31, ISSUE-34, ISSUE-49)
> 
>  
> 
> Whatever the difficulty of implementation, I understand it won't
> happen overnight. How about if we provide a short-term
> grandfathering-in period? For example, six months where frequency
> capping etc. can still be accomplished with an ID cookie?
> 
>  
> 
> Jonathan
> 
> On Feb 10, 2012, at 10:29 AM, Shane Wiley <wileys@yahoo-inc.com> wrote:
> 
>     Jonathan,
> 
>      
> 
>     Moving an entire architecture that is cookie based to one that is
>     IP + User Agent based is not trivial and would require changes at
>     all tiers (hosting servers, operational servers, data warehousing
>     systems, reporting, security, all scripts and coding logic for
>     system interoperability, etc.).  When I quoted the timelines I was
>     being serious.  It’s a significant and fundamental change across
>     the board.  And while some ad networks may use protocol
>     information for “operational uses” they probably also use
>     cookies.  So removing cookies from the equation would have
>     significant issues for them as well – again, across the board.
> 
>      
> 
>     I don’t believe I’m “over estimating” the effort for effect. 
> 
>      
> 
>     Side Note 1:  I believe there is another Working Group focused on
>     Online Identity (perhaps not W3C though – I’ll try to track this
>     down).  I mention this as it goes back to my earlier comments on
>     not attempting to solve all online privacy issues in a single
>     working group.  It’s unfortunate the charter of this working group
>     has been so broadly interpreted by some as that appears to be
>     where much of the churn is in our efforts.  If our focus was
>     constrained to “profiling” and uses of “profiling”, I believe we’d
>     be MUCH further along.
> 
>      
> 
>     Side Note 2:  I believe the truth of our current situation is
>     somewhere between Mike’s email and the view that our disagreements
>     are localized to just a few issues (as you’ve stated).  The
>     operational purpose exceptions and implementation cost are so core
>     to the discussion (and the on-going ability for many web based
>     companies to monetize their efforts) AND appear to be incredibly
>     divisive as to render our progress halted at this time (akin to
>     “going in circles” versus making incremental steps forward). 
>     Purely my opinion…
> 
>      
> 
>     - Shane
> 
>      
> 
>     *From:* Jonathan Mayer [mailto:jmayer@stanford.edu] 
>     *Sent:* Friday, February 10, 2012 10:46 AM
>     *To:* Shane Wiley
>     *Cc:* Justin Brookman; public-tracking@w3.org
>     *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
>     ISSUE-31, ISSUE-34, ISSUE-49)
> 
>      
> 
>     Shane,
> 
>      
> 
>     Could you give a bit more explanation of how this would "require
>     massive re-architecture of most internal systems"?  As I
>     understand it, some advertising networks already use protocol
>     information for "operational uses."  For those companies that
>     don't, a quick implementation would be to just hash IP address +
>     User-Agent string and treat that as an identifier.  I don't mean
>     to excessively trivialize the implementation burden, but it seems
>     to me much lesser than other alternatives on the table (save, of
>     course, business as usual).
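
A minimal sketch of the "hash IP address + User-Agent string" idea described above. The thread does not name a hash function or any code; SHA-256, the helper name `passive_identifier`, and the optional `day` argument (which scopes the identifier in time) are all assumptions made for illustration.

```python
import hashlib


def passive_identifier(ip_address: str, user_agent: str, day: str = "") -> str:
    """Derive a pseudonymous identifier from protocol data only.

    Hypothetical helper: the name, the use of SHA-256, and the `day`
    parameter are assumptions, not something specified in the thread.
    """
    # Join the fields with a newline, which cannot appear in an IP address
    # or in an HTTP header value, so distinct inputs stay distinct.
    material = "\n".join([ip_address, user_agent, day])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()


# Usage: the same protocol data yields the same identifier; adding a day
# value yields a different one, so the identifier rotates daily.
ua = "Mozilla/5.0 (X11; Linux x86_64)"
stable_id = passive_identifier("192.0.2.1", ua)
daily_id = passive_identifier("192.0.2.1", ua, day="2012-02-10")
print(stable_id == passive_identifier("192.0.2.1", ua))
print(stable_id == daily_id)
```

An identifier built this way changes whenever the IP address or User-Agent string changes, so it is only as stable as those two inputs.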
> 
>      
> 
>     As for objections to fingerprinting, I want to be clear that the
>     idea I'm floating is passive fingerprinting, not active
>     fingerprinting.  Passive fingerprinting leverages information that
>     we would already allow companies to collect—no more.
> 
>      
> 
>     Jonathan
> 
>      
> 
>     On Feb 10, 2012, at 9:34 AM, Shane Wiley wrote:
> 
> 
> 
> 
> 
>     Jonathan,
> 
>      
> 
>     I believe this could be a “SHOULD” goal because of two core factors:
> 
>      
> 
>     1. This approach will require massive re-architecture of
>     most internal systems (several year effort for a large company –
>     months to years for mid-size companies – may be too complex for
>     small companies until native platforms come built with this and
>     they can upgrade), and
> 
>     2. There are perhaps larger privacy issues here with the use
>     of Digital Fingerprints.  Some advocates (you don’t appear to be
>     with them) believe that a cookie is a better tool than a Digital
>     Fingerprint as consumers have control of cookies – whereas with a
>     Digital Fingerprint they do not (at least not in a simple, native
>     tool perspective).  I’m personally on the side of Cookies as I
>     believe the control factor and the wealth of automated tools for
>     blocking and purging them is a better outcome for consumers than
>     are Digital Fingerprints.
> 
>      
> 
>     Side Note:  Digital Fingerprints are argued by some vendors to be
>     far more effective for tracking due to the lack of consumer
>     control and the realities of cookie churn.
> 
>      
> 
>     - Shane
> 
>      
> 
>     *From:* Jonathan Mayer [mailto:jmayer@stanford.edu] 
>     *Sent:* Friday, February 10, 2012 10:16 AM
>     *To:* Justin Brookman
>     *Cc:* public-tracking@w3.org
>     *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
>     ISSUE-31, ISSUE-34, ISSUE-49)
> 
>      
> 
>     Thinking more about tracking through IP address + User-Agent
>     string, it occurs to me that the greatest challenges are stability
>     over time and across locations.  For some of the "operational
>     uses" we have discussed, time- and geography-limited tracking may
>     be adequate.  Scoping the "operational use" exceptions to protocol
>     data would somewhat accommodate those uses without allowing for
>     new data collection, and it would be easier to implement than a
>     client-side privacy-preserving technology.  Thoughts on whether
>     this is a possible new direction for compromise?
> 
>      
> 
>     Jonathan
> 
>      
> 
>     On Feb 10, 2012, at 8:30 AM, Jonathan Mayer wrote:
> 
> 
> 
> 
> 
> 
>     Justin,
> 
>      
> 
>     I think you may be misreading the state of research on tracking
>     through IP address + User-Agent string.  There is substantial
>     evidence that some browsers can be tracked in that way some of the
>     time.  I am not aware of any study that compares the global
>     effectiveness of tracking through IP address + User-Agent string
>     vs. an ID cookie; intuitively, the ID cookie should be far more
>     effective.  The news story you cite glosses over important caveats
>     in that paper's methodology; it is certainly not the case that
>     "62% of the time, HTTP user-agent information alone can accurately
>     tag a host."
> 
>      
> 
>     Jonathan
> 
>      
> 
>     On Feb 9, 2012, at 6:48 PM, Justin Brookman wrote:
> 
> 
> 
> 
> 
> 
>     Sure.  As the spec currently reads, third-party ad networks are
>     allowed to serve contextual ads on sites even when DNT:1 is on,
>     yes?  In order to do this, they're going to get log data, user
>     agent string, device info, IP address, referrer url, etc.  There
>     is growing recognition that that information in and of itself can
>     be used to uniquely identify devices over time
>     (http://www.networkworld.com/news/2012/020212-microsoft-anonymous-255667.html)
>     for profiling purposes.  It was my understanding that one of the
>     primary arguments against allowing third parties to place unique
>     identifiers on the client was because of the concern that they
>     were going to be secretly tracking and building profiles using
>     those cookies.  My point is that they will be able to do that
>     regardless, with little external ability to audit.  This system is
>     going to rely to some extent on trust unless we are proposing to
>     fundamentally rearchitect the web.
> 
>     The other argument that I've heard against using unique cookies
>     for this purpose is valid, though to me less compelling: that even
>     if just used for frequency capping, third parties are going to be
>     able to amass data about the types of ads a device sees, from
>     which you could surmise general information about the sites
>     visited on that device (e.g., you are frequency capping a bunch of
>     sports ads --> ergo, the operator of that device is probably visiting
>     sports pages).  Everyone seems to agree that it would be improper
>     for a company to use this information to profile (meta-profile?),
>     but there are still concerns about data breach, illegitimate
>     access, and government access of this potentially revealing
>     information.  This concerns me too, but the shadow of my .url
>     stream is to me considerably less privacy sensitive than my actual
>     .url stream.  I could be willing to compromise on a solution that
>     allowed for using cookies for frequency capping, if there was
>     agreement on limiting to reasonable campaign length, rules against
>     repurposing, and a requirement to make an accountable statement of
>     adherence to the standard.  I would be interested to hear if it
>     would be feasible to not register frequency caps for ads for
>     sensitive categories of information (or if at all, cap
>     client-side), though again, it's important to keep in mind that
>     that data may well be collected and retained for other excepted
>     purposes under the standard (e.g., fraud prevention) --- cookie or
>     not.  
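
A minimal sketch of frequency capping limited to campaign length, as discussed above. Nothing here comes from the thread or from any real ad system: the class name `FrequencyCapper`, its fields, and the use of plain timestamps in seconds are assumptions made for illustration. The identifier could be a cookie value or any other per-device key.

```python
from dataclasses import dataclass, field


@dataclass
class FrequencyCapper:
    """Counts impressions per (identifier, campaign). Hypothetical sketch."""

    max_impressions: int
    # campaign_id -> campaign end time, in seconds (assumed representation)
    campaign_end: dict = field(default_factory=dict)
    # (identifier, campaign_id) -> impressions served so far
    counts: dict = field(default_factory=dict)

    def may_show(self, identifier: str, campaign_id: str, now: float) -> bool:
        # Unknown or finished campaigns are never served and never recorded.
        if now >= self.campaign_end.get(campaign_id, 0.0):
            return False
        key = (identifier, campaign_id)
        if self.counts.get(key, 0) >= self.max_impressions:
            return False
        self.counts[key] = self.counts.get(key, 0) + 1
        return True

    def purge_expired(self, now: float) -> None:
        # Drop counters for finished campaigns so that nothing is retained
        # beyond the campaign length.
        self.counts = {
            key: n
            for key, n in self.counts.items()
            if now < self.campaign_end.get(key[1], 0.0)
        }


capper = FrequencyCapper(max_impressions=2, campaign_end={"sports-q1": 100.0})
shown = [capper.may_show("device-a", "sports-q1", now=t) for t in (1.0, 2.0, 3.0)]
capper.purge_expired(now=100.0)
```

The counters themselves still reveal which campaigns a device saw, so a sketch like this addresses retention only, not the concern about what the stored data implies.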
> 
>     *From:* Jonathan Mayer [mailto:jmayer@stanford.edu]
>     *To:* Justin Brookman [mailto:justin@cdt.org]
>     *Cc:* public-tracking@w3.org
>     *Sent:* Thu, 09 Feb 2012 18:32:19 -0500
>     *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
>     ISSUE-31, ISSUE-34, ISSUE-49)
> 
>     Justin, could you explain what you mean here?
> 
>     Thanks,
>     Jonathan
> 
>     On Feb 9, 2012, at 3:17 PM, Justin Brookman wrote:
> 
>     > the standard currently recognizes that third parties are
>     > frequently going to be allowed to obtain uniquely-identifying user
>     > agent strings despite the presence of a DNT:1 header
> 
>      
> 
>      
> 
>  
>
Received on Friday, 10 February 2012 19:49:41 UTC