
Re: Third-Party Web Tracking: Policy and Technology Paper outlining harms of tracking

From: Vincent Toubiana <v.toubiana@free.fr>
Date: Thu, 11 Oct 2012 23:25:58 +0200
Message-ID: <50773966.6080500@free.fr>
To: Shane Wiley <wileys@yahoo-inc.com>
CC: Alan Chapell <achapell@chapellassociates.com>, Jeffrey Chester <jeff@democraticmedia.org>, "public-tracking@w3.org" <public-tracking@w3.org>, Jonathan Mayer <jmayer@stanford.edu>
On 10/11/2012 10:36 PM, Shane Wiley wrote:
>
> Again -- please help us understand real-world, tangible harms to 
> consumers from the existence of data attached to pseudo/anonymous 
> identifiers that is not used to directly alter a user's experience (no 
> profiling/targeting).  We've discussed breach concerns and government 
> intrusion but have no documented cases -- are there others?
>
> Thank you,
>
> Shane
>

Hi Shane and Alan,

I think I have another documented case: Viacom vs. YouTube, where 
YouTube almost had to share pseudonymous data (including IP addresses) 
with Viacom. It is relevant because YouTube may act as a third party 
(e.g. an autostarting video embedded in an iframe; see Viacom's claims 
below). Fortunately the two companies reached an agreement, but it was 
a close call.
If I remember correctly, YouTube might have been forced to disclose all 
logs, because IP addresses were (according to Google policy) not 
identifiable information. Unfortunately, YouTube kept a lot of data (12 
terabytes), and a shorter retention period would certainly have helped.

How would that translate into permitted uses? Ultimately, to address 
this issue, there should be no log retention and no permitted uses. More 
realistically, I guess that anything that results in a shorter 
retention period and data minimization would help in such a case. That's 
how I interpret permitted uses: less data kept for a shorter period of 
time, so that such a data disclosure (while still being an issue) would 
have a less dramatic effect. In the long term, these retention periods 
should get even shorter if data scientists come up with scalable tools 
that can provide the same functionality with less data.

Please find below some references to Viacom vs. YouTube (from 
http://www.zdnet.com/blog/btl/youtube-vs-viacom-googles-ip-wins-users-lose/9242).

I hope this answers your question.


Vincent



    Plaintiffs seek all data from the Logging database concerning each
    time a YouTube video has been viewed on the YouTube website or
    through embedding on a third-party website. They need the data to
    compare the attractiveness of allegedly infringing videos with that
    of non-infringing videos. A markedly higher proportion of
    infringing-video watching may bear on plaintiffs' vicarious
    liability claim, and defendants' substantial non-infringing use
    defense. Defendants argue generally that plaintiffs' request is
    unduly burdensome because producing the enormous amount of
    information in the Logging database (about 12 terabytes of data)
    "would be expensive and time-consuming, particularly in light of the
    need to examine the contents for privileged and work product material."

And:

    Defendants argue that the data should not be disclosed because of
    the users' privacy concerns, saying that "Plaintiffs would likely be
    able to determine the viewing and video uploading habits of
    YouTube's users based on the user's login ID and the user's IP
    address". But defendants cite no authority barring them from
    disclosing such information in civil discovery proceedings, and
    their privacy concerns are speculative.
Received on Thursday, 11 October 2012 21:26:26 UTC
