Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25, ISSUE-31, ISSUE-34, ISSUE-49) from Peter Eckersley on 2012-02-10 (public-tracking@w3.org from February 2012)

From: Peter Eckersley <peter.eckersley@gmail.com>
Date: Fri, 10 Feb 2012 14:17:37 -0800
To: Lauren Gelman <gelman@blurryedge.com>
Cc: Matthias Schunter <mts@zurich.ibm.com>, public-tracking@w3.org
Message-ID: <CAOYJvn+5nKKczmA56Vy4aVvB-D07MR_DSJWeFmOsRzG5wkswsg@mail.gmail.com>

An opt-out cookie should be a low-entropy (non-ID, non-trackable) cookie,
so it's fine to set that in response to DNT:1.

If someone implemented or specified "opt-out" cookies that were also
tracking cookies, that was an anti-privacy design decision.  The user
should be explicitly opting back in to tracking before receiving those
cookies.
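
To make the distinction concrete, here is a minimal sketch of what a
low-entropy opt-out cookie could look like on the server side, plus a crude
check an outside auditor might run.  The cookie name "optout", the fixed
value "1", and both function names are hypothetical, chosen only for
illustration; nothing here comes from a spec or from any real deployment.

```python
from http.cookies import SimpleCookie

# Hypothetical name and value: every user receives the same pair, so the
# cookie carries no identifier and cannot be used to tell users apart.
OPT_OUT_NAME = "optout"
OPT_OUT_VALUE = "1"


def opt_out_header(request_headers):
    """Return a Set-Cookie header value for a DNT:1 request, else None."""
    if request_headers.get("DNT") != "1":
        return None
    cookie = SimpleCookie()
    cookie[OPT_OUT_NAME] = OPT_OUT_VALUE
    cookie[OPT_OUT_NAME]["path"] = "/"
    cookie[OPT_OUT_NAME]["max-age"] = 60 * 60 * 24 * 365  # one year
    return cookie[OPT_OUT_NAME].OutputString()


def is_low_entropy(observed_values):
    """Audit check: True if at most one distinct value was observed.

    observed_values is assumed to hold the values of one cookie as
    received by many independent, fresh clients.  Differing values mean
    the cookie could act as an ID cookie.
    """
    return len(set(observed_values)) <= 1
```

The audit check is deliberately blunt: it only shows that a cookie is
identical across clients, which is the property that lets an outsider
verify the opt-out cookie is not a tracking cookie in disguise.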

An example of this might be a company that wanted "opt out" to mean "opt
out from behavioral targeting", not "opt out from being tracked".  There
may be a quid pro quo to be had, of the form: "make yourself easier for us
to track, in exchange for us not showing you a particular kind of
advertising".  That's fine, but that needs to be something a DNT:1 user is
opting back in to.

On 10 February 2012 14:07, Lauren Gelman <gelman@blurryedge.com> wrote:

>
> Peter.  How does this relate to Sean's question about whether a site that
> gets a DNT:1 can set an opt-out cookie?
>
> On Feb 10, 2012, at 1:43 PM, Peter Eckersley wrote:
>
> This is an unacceptably large amount of trust to ask users to place in the
> operation of opaque server infrastructure.  DNT is inherently going to
> require some amount of trust in the sense that when servers claim they are
> compliant, users will have to believe them.  However, the most robust
> method for reinforcing this trust is maximizing the scope for auditability:
> when the DNT header is sent, compliant domains delete their ID cookies.  If
> this type of auditing is not possible, then it is inevitable that some of
> the dozens/hundreds of third parties that ultimately implement the
> server-side of the convention will do the opaque server-side privacy part
> wrong, whether by accident, incompetence, or malice.
>
> From EFF's perspective, an exception for ID cookies for administrative
> purposes related to the 3rd party ad delivery would be a non-starter.
>
>
> On 10 February 2012 11:49, Matthias Schunter <mts@zurich.ibm.com> wrote:
>
>> Hi Jonathan,
>>
>>
>> I actually agree with Shane for V01 of our standard.
>>
>> We should focus the MUST on use/retention/sharing/...
>> and we should assume that the actors implementing DNT are
>> not malicious.
>>
>> Nevertheless, a SHOULD can be used to provide incentive for
>> privacy-enhancing technologies. A SHOULD can also state that no more
>> data shall be collected than needed for the operational purpose.
>>
>> Once these technologies are commonplace and the 'legacy' cookie-based
>> technology has lost importance, we can then issue a V02 that upgrades
>> the SHOULD to a MUST (this would at that point not incur large costs).
>>
>> btw: Compared to the starting position of "we continue unchanged and
>> just no longer show targeted ads", this feels like progress to me.
>>
>> Just my personal 2cents; feedback is appreciated.
>>
>>
>> Regards,
>> matthias
>>
>> On 2/10/2012 8:26 PM, Shane Wiley wrote:
>> > I’m open to compromise but need to ensure the outcomes don’t levy
>> > significant cost and loss of revenue to the online advertising
>> > industry in the process (sincerely looking for the appropriate
>> > balance).  I offered that we start at “use-based limitations” for the
>> > MUST (yes, this means we need to trust good actors) and set new
>> > technology approaches as SHOULD.  I believe this is a reasonable
>> > compromise.  Yahoo! (and other industry participants) will immediately
>> > engage with you and others to begin the design process for privacy
>> > enhancing technologies to help bring these solutions to market in a
>> > measured and thoughtful manner – and in a way that all participants
>> > can easily upgrade their current efforts to embrace.  Big picture:
>> > large companies and academia work together to develop the baseline
>> > tech and then provide this as open-source (for example, Apache) to
>> > mid-size and small companies.
>> >
>> >
>> >
>> > Our companies are taking on all the cost and disruption the above
>> > entails – in light of consumer privacy risks that have never been
>> > proven to be real (yet –> understanding the “technically possible”
>> > angle) and are immediately addressing all of the issues surrounding
>> > cross-site profiling data collection and use.
>> >
>> >
>> >
>> > How you do not see this as compromise is difficult for me to understand.
>> >
>> >
>> >
>> > - Shane
>> >
>> >
>> >
>> > *From:*Jonathan Mayer [mailto:jmayer@stanford.edu]
>> > *Sent:* Friday, February 10, 2012 12:01 PM
>> > *To:* Shane Wiley
>> > *Cc:* Justin Brookman; public-tracking@w3.org
>> > *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
>> > ISSUE-31, ISSUE-34, ISSUE-49)
>> >
>> >
>> >
>> > Shane,
>> >
>> >
>> >
>> > Your objections in response to this proposal (and earlier discussions
>> > of privacy-preserving technology) suggest that you will not accept
>> > *any* deviation from current data collection practices.  That's not
>> > compromise.
>> >
>> >
>> >
>> > Jonathan
>> >
>> >
>> >
>> > On Feb 10, 2012, at 10:56 AM, Shane Wiley wrote:
>> >
>> >
>> >
>> > Jonathan,
>> >
>> >
>> >
>> > I appreciate and respect the desire to find a technical solution to
>> > online identifiers and identification at a rapid clip.  These concepts
>> > and their related implementations require much deeper thought,
>> > discussion, design, and ultimately consensus.  Your current proposal
>> > (on its surface) would be impossible to achieve at our scale in just 6
>> > months – and would completely halt/disrupt the established product
>> > roadmap for our ad products (which are working hard to be competitive
>> > and keep our systems evolving with the marketplace).  It would
>> > literally take a year or two to go in this direction if we even agreed
>> > this was an appropriate outcome - which I believe it is not at this
>> > time but am more than willing to keep the conversation going in a
>> > different forum.  It’s my opinion that the DNT WG is NOT the
>> > appropriate forum to determine what is appropriate for online
>> > identifiers and identification (much more involved effort than this
>> > isolated conversation).
>> >
>> >
>> >
>> > - Shane
>> >
>> >
>> >
>> > *From:* Jonathan Robert Mayer [mailto:jmayer@stanford.edu]
>> > *Sent:* Friday, February 10, 2012 11:48 AM
>> > *To:* Shane Wiley
>> > *Cc:* Justin Brookman; public-tracking@w3.org
>> > <mailto:public-tracking@w3.org>
>> > *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
>> > ISSUE-31, ISSUE-34, ISSUE-49)
>> >
>> >
>> >
>> > Whatever the difficulty of implementation, I understand it won't
>> > happen overnight. How about if we provide a short-term
>> > grandfathering-in period? For example, six months where frequency
>> > capping etc. can still be accomplished with an ID cookie?
>> >
>> >
>> >
>> > Jonathan
>> >
>> > On Feb 10, 2012, at 10:29 AM, Shane Wiley <wileys@yahoo-inc.com
>> > <mailto:wileys@yahoo-inc.com>> wrote:
>> >
>> >     Jonathan,
>> >
>> >
>> >
>> >     Moving an entire architecture that is cookie based to one that is
>> >     IP + User Agent based is not trivial and would require changes at
>> >     all tiers (hosting servers, operational servers, data warehousing
>> >     systems, reporting, security, all scripts and coding logic for
>> >     system interoperability, etc.).  When I quoted the timelines I was
>> >     being serious.  It’s a significant and fundamental change across
>> >     the board.  And while some ad networks may use protocol
>> >     information for “operational uses” they probably also use
>> >     cookies.  So removing cookies from the equation would have
>> >     significant issues for them as well – again, across the board.
>> >
>> >
>> >
>> >     I don’t believe I’m “over estimating” the effort for effect.
>> >
>> >
>> >
>> >     Side Note 1:  I believe there is another Working Group focused on
>> >     Online Identity (perhaps not W3C though – I’ll try to track this
>> >     down).  I mention this as it goes back to my earlier comments on
>> >     not attempting to solve all online privacy issues in a single
>> >     working group.  It’s unfortunate the charter of this working group
>> >     has been so broadly interpreted by some as that appears to be
>> >     where much of the churn is in our efforts.  If our focus was
>> >     constrained to “profiling” and uses of “profiling”, I believe we’d
>> >     be MUCH further along.
>> >
>> >
>> >
>> >     Side Note 2:  I believe the truth of our current situation is
>> >     somewhere between Mike’s email and that our disagreements are
>> >     localized to just a few issues (as you’ve stated).  The
>> >     operational purpose exceptions and implementation cost are so core
>> >     to the discussion (and the on-going ability for many web based
>> >     companies to monetize their efforts) AND appear to be incredibly
>> >     divisive as to render our progress halted at this time (akin to
>> >     “going in circles” versus making incremental steps forward).
>> >     Purely my opinion…
>> >
>> >
>> >
>> >     - Shane
>> >
>> >
>> >
>> >     *From:* Jonathan Mayer [mailto:jmayer@stanford.edu]
>> >     *Sent:* Friday, February 10, 2012 10:46 AM
>> >     *To:* Shane Wiley
>> >     *Cc:* Justin Brookman; public-tracking@w3.org
>> >     <mailto:public-tracking@w3.org>
>> >     *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
>> >     ISSUE-31, ISSUE-34, ISSUE-49)
>> >
>> >
>> >
>> >     Shane,
>> >
>> >
>> >
>> >     Could you give a bit more explanation of how this would "require
>> >     massive re-architecture of most internal systems"?  As I
>> >     understand it, some advertising networks already use protocol
>> >     information for "operational uses."  For those companies that
>> >     don't, a quick implementation would be to just hash IP address +
>> >     User-Agent string and treat that as an identifier.  I don't mean
>> >     to excessively trivialize the implementation burden, but it seems
>> >     to me much lesser than other alternatives on the table (save, of
>> >     course, business as usual).
>> >
>> >
>> >
>> >     As for objections to fingerprinting, I want to be clear that the
>> >     idea I'm floating is passive fingerprinting, not active
>> >     fingerprinting.  Passive fingerprinting leverages information that
>> >     we would already allow companies to collect—no more.
>> >
>> >
>> >
>> >     Jonathan
>> >
>> >
>> >
>> >     On Feb 10, 2012, at 9:34 AM, Shane Wiley wrote:
>> >
>> >
>> >
>> >
>> >
>> >     Jonathan,
>> >
>> >
>> >
>> >     I believe this could be a “SHOULD” goal because of two core factors:
>> >
>> >
>> >
>> >     1.       This approach will require massive re-architecture of
>> >     most internal systems (several year effort for a large company –
>> >     months to years for mid-size companies – may be too complex for
>> >     small companies until native platforms come built with this and
>> >     they can upgrade), and
>> >
>> >     2.       There are perhaps larger privacy issues here with the use
>> >     of Digital Fingerprints.  Some advocates (you don’t appear to be
>> >     with them) believe that a cookie is a better tool than a Digital
>> >     Fingerprint as consumers have control of cookies – whereas with a
>> >     Digital Fingerprint they do not (at least not in a simple, native
>> >     tool perspective).  I’m personally on the side of Cookies as I
>> >     believe the control factor and the wealth of automated tools for
>> >     blocking and purging them is a better outcome for consumers than
>> >     are Digital Fingerprints.
>> >
>> >
>> >
>> >     Side Note:  Digital Fingerprints are argued by some vendors to be
>> >     far more effective for tracking due to the lack of consumer
>> >     control and the realities of cookie churn.
>> >
>> >
>> >
>> >     - Shane
>> >
>> >
>> >
>> >     *From:* Jonathan Mayer [mailto:jmayer@stanford.edu]
>> >     *Sent:* Friday, February 10, 2012 10:16 AM
>> >     *To:* Justin Brookman
>> >     *Cc:* public-tracking@w3.org <mailto:public-tracking@w3.org>
>> >     *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
>> >     ISSUE-31, ISSUE-34, ISSUE-49)
>> >
>> >
>> >
>> >     Thinking more about tracking through IP address + User-Agent
>> >     string, it occurs to me that the greatest challenges are stability
>> >     over time and across locations.  For some of the "operational
>> >     uses" we have discussed, time- and geography- limited tracking may
>> >     be adequate.  Scoping the "operational use" exceptions to protocol
>> >     data would somewhat accommodate those uses without allowing for
>> >     new data collection, and it would be easier to implement than a
>> >     client-side privacy-preserving technology.  Thoughts on whether
>> >     this is a possible new direction for compromise?
>> >
>> >
>> >
>> >     Jonathan
>> >
>> >
>> >
>> >     On Feb 10, 2012, at 8:30 AM, Jonathan Mayer wrote:
>> >
>> >
>> >
>> >
>> >
>> >
>> >     Justin,
>> >
>> >
>> >
>> >     I think you may be misreading the state of research on tracking
>> >     through IP address + User-Agent string.  There is substantial
>> >     evidence that some browsers can be tracked in that way some of the
>> >     time.  I am not aware of any study that compares the global
>> >     effectiveness of tracking through IP address + User-Agent string
>> >     vs. an ID cookie; intuitively, the ID cookie should be far more
>> >     effective.  The news story you cite glosses over important caveats
>> >     in that paper's methodology; it is certainly not the case that
>> >     "62% of the time, HTTP user-agent information alone can accurately
>> >     tag a host."
>> >
>> >
>> >
>> >     Jonathan
>> >
>> >
>> >
>> >     On Feb 9, 2012, at 6:48 PM, Justin Brookman wrote:
>> >
>> >
>> >
>> >
>> >
>> >
>> >     Sure.  As the spec currently reads, third-party ad networks are
>> >     allowed to serve contextual ads on sites even when DNT:1 is on,
>> >     yes?  In order to do this, they're going to get log data, user
>> >     agent string, device info, IP address, referrer url, etc.  There
>> >     is growing recognition that that information in and of itself can
>> >     be used to uniquely identify devices over time
>> >     (http://www.networkworld.com/news/2012/020212-microsoft-anonymous-255667.html)
>> >     for profiling purposes.  It was my understanding that one of the
>> >     primary arguments against allowing third parties to place unique
>> >     identifiers on the client was because of the concern that they
>> >     were going to be secretly tracking and building profiles using
>> >     those cookies.  My point is that they will be able to do that
>> >     regardless, with little external ability to audit.  This system is
>> >     going to rely to some extent on trust unless we are proposing to
>> >     fundamentally rearchitect the web.
>> >
>> >     The other argument that I've heard against using unique cookies
>> >     for this purpose is valid, though to me less compelling: that even
>> >     if just used for frequency capping, third parties are going to be
>> >     able to amass data about the types of ads a device sees, from
>> >     which you could surmise general information about the sites
>> >     visited on that device (e.g., you are frequency capping a bunch of
>> >     sports ads --> ergo, the operator of that device is probably visiting
>> >     sports pages).  Everyone seems to agree that it would be improper
>> >     for a company to use this information to profile (meta-profile?),
>> >     but there are still concerns about data breach, illegitimate
>> >     access, and government access of this potentially revealing
>> >     information.  This concerns me too, but the shadow of my .url
>> >     stream is to me considerably less privacy sensitive than my actual
>> >     .url stream.  I could be willing to compromise on a solution that
>> >     allowed for using cookies for frequency capping, if there was
>> >     agreement on limiting to reasonable campaign length, rules against
>> >     repurposing, and a requirement to make an accountable statement of
>> >     adherence to the standard.  I would be interested to hear if it
>> >     would be feasible to not register frequency caps for ads for
>> >     sensitive categories of information (or if at all, cap
>> >     client-side), though again, it's important to keep in mind that
>> >     that data may well be collected and retained for other excepted
>> >     purposes under the standard (e.g., fraud prevention) --- cookie or
>> >     not.
>> >
>> >     *From:* Jonathan Mayer [mailto:jmayer@stanford.edu]
>> >     *To:* Justin Brookman [mailto:justin@cdt.org]
>> >     *Cc:* public-tracking@w3.org <mailto:public-tracking@w3.org>
>> >     *Sent:* Thu, 09 Feb 2012 18:32:19 -0500
>> >     *Subject:* Re: Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25,
>> >     ISSUE-31, ISSUE-34, ISSUE-49)
>> >
>> >     Justin, could you explain what you mean here?
>> >
>> >     Thanks,
>> >     Jonathan
>> >
>> >     On Feb 9, 2012, at 3:17 PM, Justin Brookman wrote:
>> >
>> >     > the standard currently recognizes that third parties are
>> >     > frequently going to be allowed to obtain uniquely-identifying user
>> >     > agent strings despite the presence of a DNT:1 header
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>>
>>
>>
>
>
> --
> Peter
>
>
> Lauren Gelman
> BlurryEdge Strategies
> 415-627-8512
> gelman@blurryedge.com
> http://blurryedge.com
>
>


-- 
Peter
Received on Friday, 10 February 2012 22:18:01 UTC