
RE: ACTION-75: Write-up a hybrid of Do Not Profile and Do Not Cross-Site Track

From: Shane Wiley <wileys@yahoo-inc.com>
Date: Sun, 12 Feb 2012 12:17:57 -0800
To: "rob@blaeu.com" <rob@blaeu.com>, Vincent Toubiana <v.toubiana@free.fr>
CC: "public-tracking@w3.org" <public-tracking@w3.org>, "rigo@w3.org" <rigo@w3.org>, JC Cannon <jccannon@microsoft.com>
Message-ID: <63294A1959410048A33AEE161379C8023D0C8ACFF0@SP2-EX07VS02.ds.corp.yahoo.com>

The key phrase is: "In case search engine providers retain personal data longer than 6 months, they will have
to demonstrate comprehensively that it is strictly necessary for the service."

In this case, I believe search engine providers are confident we are able to demonstrate comprehensively that the data retained is strictly necessary for the service (18 months w/ IP Address obfuscation at a shorter timeframe).

- Shane

From: Rob van Eijk [mailto:rob@blaeu.com]
Sent: Sunday, February 12, 2012 12:51 PM
To: Vincent Toubiana
Cc: Shane Wiley; public-tracking@w3.org; rigo@w3.org; JC Cannon
Subject: Re: ACTION-75: Write-up a hybrid of Do Not Profile and Do Not Cross-Site Track

The 6 months needs to be read in the light of the year 2008. Please read the citation pasted below closely.
The opinion was and still is directed at a further reduction of the retention period. One of the assumptions
was that limiting retention periods would increase user trust and therefore work as a
competitive advantage for the business. This is where the proportionality principle comes into play.
It is an effective mechanism to balance user and business interests when used wisely.

Citation Opinion 1/2008 on data protection issues related to search engines (WP148):

"If the processing performed by the search engine provider is subject to national
legislation, it must comply both with the privacy standards and with the retention periods
provided for under the legislation of that specific Member State.
If personal data are stored, the retention period should be no longer than necessary for the
specific purposes of the processing. Therefore, after the end of a search session, personal
data could be deleted, and continued storage therefore needs an adequate justification.
However, some search engine companies seem to retain data indefinitely, which is
prohibited. For each purpose, a limited retention time should be defined. Moreover, the
set of personal data to be retained should not be excessive in relation to each purpose.
In practice, the major search engines retain data about their users in personally
identifiable form for over a year (precise terms vary). The Working Party welcomes the
recent reductions in retention periods of personal data by major search engine providers.
However, the fact that leading companies in the field have been able to reduce their
retention periods suggests that the previous terms were longer than necessary.
In view of the initial explanations given by search engine providers on the possible
purposes for collecting personal data, the Working Party does not see a basis for a
retention period beyond 6 months (National legislation may require earlier deletion of personal data).
However, the retention of personal data and the corresponding retention period must
always be justified (with concrete and relevant arguments) and reduced to a minimum, to
improve transparency, to ensure fair processing, and to guarantee proportionality with the
purpose that justifies such retention.
To that effect, the Working Party invites search engine providers to implement the
principle of "privacy by design" which will additionally contribute to further reduce the
retention period. In addition, the Working Party considers that a reduced retention period
will increase users' trust in the service and will thus constitute a significant competitive advantage.
In case search engine providers retain personal data longer than 6 months, they will have
to demonstrate comprehensively that it is strictly necessary for the service.
In all cases search engine providers must inform users about the applicable retention
policies for all kinds of user data they process."

On 12-2-2012 19:56, Vincent Toubiana wrote:

A couple of details regarding search log retention.

Actually there is no consensus about the log retention time; even after 18 months Google keeps search logs. They do *pseudonymize* them at two different points in time (9 months and 18 months) but never truly anonymize them. As far as I know, Article 29 has repeatedly asked Yahoo!, Google and MSFT to reduce the retention of personal information (including IP) to 6 months (see http://www.out-law.com/page-11884). Search engines somehow complied by modifying the IP address in their logs after 6 months (only Google keeps them for 9 months) but I don't think that actually matches the Article 29 expectation.

Since there is no current consensus on log retention and Article 29 recommends 6 months, I'd suggest using this as the standard.


If we're going to use arbitrary time spans for retention, I would recommend that we leverage 18 months as the standard. This is the period Google, MSFT, and Yahoo! currently use for search logs, and they have shared this policy with all of the EU DPAs and A29WP. As the advocates in this working group will likely share the EU DPAs' perspective of wanting this to be lower, it's a helpful starting point. Otherwise we can stop using arbitrary numbers and leverage minimization principles instead - which I personally believe are the better standard to apply to varied business models and can stand the test of time and innovation.

- Shane

-----Original Message-----
From: Rigo Wenning [mailto:rigo@w3.org]
Sent: Thursday, February 09, 2012 9:06 AM
To: public-tracking@w3.org<mailto:public-tracking@w3.org>
Cc: JC Cannon
Subject: Re: ACTION-75: Write-up a hybrid of Do Not Profile and Do Not Cross-Site Track

I concur JC,

On Tuesday 07 February 2012 18:51:27 JC Cannon wrote:

It seems that we are still conflating collection with receipt of logs by a
server and processing of those logs for placement in a profile or

I believe we all agreed that web servers must be able to receive logs in
order for the Internet to work as it does. I would like to propose that the
mere receipt of logs by a web server should not be considered collection or
be constrained by the rules of collection.

However, any processing of the logs should be considered collection and be
governed by our DNT standard.

Inasmuch as the logs will include a DNT signal, any retention policy that
comes out of our standard should apply to those logs.

Whereas 22 of the ePrivacy Directive says:

The prohibition of storage of communications and the related traffic data by
persons other than the users or without their consent is not intended to
prohibit any automatic, intermediate and transient storage of this information
in so far as this takes place for the sole purpose of carrying out the
transmission in the electronic communications network and provided that the
information is not stored for any period longer than is necessary for the
transmission and for traffic management purposes, and that during the period of
storage the confidentiality remains guaranteed. Where this is necessary for
making more efficient the onward transmission of any publicly accessible
information to other recipients of the service upon their request, this
Directive should not prevent such information from being further stored,
provided that this information would in any case be accessible to the public
without restriction and that any data referring to the individual subscribers
or users requesting such information are erased.

As long as we talk about some defaults for retention and logging for the
purpose of carrying out the communication, we shouldn't prevent logging. I
think our task goes beyond that. We MAY give some hint when we believe those logs are
not necessary anymore.

So while writing logs is collection of data, we may declare normal web logs
out of scope as long as they do not serve to build profiles and as long as they
have some expiry set. (One may be as scared about logs that last forever as
I would be about profile creation.)

Consequently, a third party that is not in an outsourcing context may not
collect data beyond normal web logs and should anonymize or erase those logs
after 60 days (just to throw in some arbitrary count). This would be my


Received on Sunday, 12 February 2012 20:18:37 UTC
