RE: ACTION-75: Write-up a hybrid of Do Not Profile and Do Not Cross-Site Track

From: Shane Wiley <wileys@yahoo-inc.com>
Date: Sun, 12 Feb 2012 13:22:58 -0800
To: "rob@blaeu.com" <rob@blaeu.com>
CC: Vincent Toubiana <v.toubiana@free.fr>, "public-tracking@w3.org" <public-tracking@w3.org>, "rigo@w3.org" <rigo@w3.org>, JC Cannon <jccannon@microsoft.com>
Message-ID: <63294A1959410048A33AEE161379C8023D0C8ACFF8@SP2-EX07VS02.ds.corp.yahoo.com>
Thank you Rob.  I believe this is a fair example of the "minimization standard" in practice.

As for implementation period, this took most in industry over a year to implement (specifically 18 months for Yahoo!).  If we're looking at significant re-architecture elements until a company can say they are DNT compliant, even if a company of any scale wanted to support DNT, they are looking at a meaningful implementation effort in the order of over a full year (perhaps longer).  Smaller companies will be more challenged to manage DNT traffic separate from other data practices and therefore will be more likely to not implement DNT.

As I've stated to some in the past, I believe the best approach is to isolate profiling activities (often managed by separate systems so easier to block or redirect traffic around these systems) in the initial version of DNT.  This will allow massive adoption of DNT quickly.  In parallel, as this foundation of implementation is being established, industry and academia can be working on the next version of DNT that focuses on privacy enhancing technologies and tools for rapid, business supportive anonymization methods to be deployed in open source efforts (such as Apache).

If one end of the spectrum is releasing a version of DNT that is overly prescriptive on data collection with few, if any, use exceptions that very few companies, if any, will implement - AND the other end of the spectrum is industry continuing with opt-out mechanisms and supportive technical tools such as browser opt-out managers - then this feels like the appropriate compromise and will allow for rapid DNT adoption.

- Shane

From: Rob van Eijk [mailto:rob@blaeu.com]
Sent: Sunday, February 12, 2012 1:47 PM
To: Shane Wiley
Cc: Vincent Toubiana; public-tracking@w3.org; rigo@w3.org; JC Cannon
Subject: Re: ACTION-75: Write-up a hybrid of Do Not Profile and Do Not Cross-Site Track

And that is why I underlined that key part of the sentence.

The lesson learned from the dialogue in 2008 is how we can take this to the topic at hand in this workgroup, which is
the discussion "Deciding Exceptions (ISSUE-23, ISSUE-24, ISSUE-25, ISSUE-31, ISSUE-34, ISSUE-49)"

Rob

On 12-2-2012 21:17, Shane Wiley wrote:
Rob,

The key phrase is: "In case search engine providers retain personal data longer than 6 months, they will have
to demonstrate comprehensively that it is strictly necessary for the service."

In this case, I believe search engine providers are confident we are able to demonstrate comprehensively that the data retained is strictly necessary for the service (18 months w/ IP Address obfuscation at a shorter timeframe).

- Shane

From: Rob van Eijk [mailto:rob@blaeu.com]
Sent: Sunday, February 12, 2012 12:51 PM
To: Vincent Toubiana
Cc: Shane Wiley; public-tracking@w3.org; rigo@w3.org; JC Cannon
Subject: Re: ACTION-75: Write-up a hybrid of Do Not Profile and Do Not Cross-Site Track

The 6 months needs to be read in the light of the year 2008. Please read the citation pasted below closely.
The opinion was and still is directed at a further reduction of the retention period. One of the assumptions
was that limiting retention periods would increase user trust and therefore work as a
competitive advantage for the business. This is where the proportionality principle comes into play.
It is an effective mechanism to balance user and business interests when used wisely.

Citation Opinion 1/2008 on data protection issues related to search engines (WP148):

"If the processing performed by the search engine provider is subject to national
legislation, it must comply both with the privacy standards and with the retention periods
provided for under the legislation of that specific Member State.
If personal data are stored, the retention period should be no longer than necessary for the
specific purposes of the processing. Therefore, after the end of a search session, personal
data could be deleted, and continued storage therefore needs an adequate justification.
However, some search engine companies seem to retain data indefinitely, which is
prohibited. For each purpose, a limited retention time should be defined. Moreover, the
set of personal data to be retained should not be excessive in relation to each purpose.
In practice, the major search engines retain data about their users in personally
identifiable form for over a year (precise terms vary). The Working Party welcomes the
recent reductions in retention periods of personal data by major search engine providers.
However, the fact that leading companies in the field have been able to reduce their
retention periods suggests that the previous terms were longer than necessary.
In view of the initial explanations given by search engine providers on the possible
purposes for collecting personal data, the Working Party does not see a basis for a
retention period beyond 6 months (National legislation may require earlier deletion of personal data).
However, the retention of personal data and the corresponding retention period must
always be justified (with concrete and relevant arguments) and reduced to a minimum, to
improve transparency, to ensure fair processing, and to guarantee proportionality with the
purpose that justifies such retention.
To that effect, the Working Party invites search engine providers to implement the
principle of "privacy by design" which will additionally contribute to further reduce the
retention period. In addition, the Working Party considers that a reduced retention period
will increase users' trust in the service and will thus constitute a significant competitive
advantage.
In case search engine providers retain personal data longer than 6 months, they will have
to demonstrate comprehensively that it is strictly necessary for the service.
In all cases search engine providers must inform users about the applicable retention
policies for all kinds of user data they process."

On 12-2-2012 19:56, Vincent Toubiana wrote:
Shane,

A couple of details regarding search log retention.

Actually, there is no consensus about log retention time; even after 18 months, Google keeps search logs. They do *pseudonymize* them at two different points in time (9 months and 18 months) but never truly anonymize them. As far as I know, Article 29 has repeatedly asked Yahoo!, Google and MSFT to reduce the retention of personal information (including IP) to 6 months (see http://www.out-law.com/page-11884). Search engines somehow complied by modifying the IP addresses in their logs after 6 months (only Google keeps them for 9 months), but I don't think that actually matches the Article 29 expectation.

Since there is no current consensus on log retention and Article 29 recommends 6 months, I'd suggest using this as the standard.

Vincent




If we're going to use arbitrary time spans for retention, I would recommend that we leverage 18 months as the standard.  This is the time Google, MSFT, and Yahoo! currently use for search logs, and they have shared this policy with all of the EU DPAs and A29WP.  As the advocates in this working group will likely share the EU DPAs' wish for this to be lower, it's a helpful starting point.  Otherwise we can stop using arbitrary numbers and leverage minimization principles instead - which I personally believe are the better standard to apply to varied business models and can stand the test of time and innovation.

- Shane

-----Original Message-----
From: Rigo Wenning [mailto:rigo@w3.org]
Sent: Thursday, February 09, 2012 9:06 AM
To: public-tracking@w3.org
Cc: JC Cannon
Subject: Re: ACTION-75: Write-up a hybrid of Do Not Profile and Do Not Cross-Site Track

I concur JC,

On Tuesday 07 February 2012 18:51:27 JC Cannon wrote:


It seems that we are still conflating collection with receipt of logs by a
server and processing of those logs for placement in a profile or
otherwise.

I believe we all agreed that web servers must be able to receive logs in
order for the Internet to work as it does. I would like to propose that the
mere receipt of logs by a web server should not be considered collection or
be constrained by the rules of collection.

However, any processing of the logs should be considered collection and be
governed by our DNT standard.

Inasmuch as the logs will include a DNT signal, any retention policy that
comes out of our standard should apply to those logs.

Recital 22 of the ePrivacy Directive says:

The prohibition of storage of communications and the related traffic data by
persons other than the users or without their consent is not intended to
prohibit any automatic, intermediate and transient storage of this information
in so far as this takes place for the sole purpose of carrying out the
transmission in the electronic communications network and provided that the
information is not stored for any period longer than is necessary for the
transmission and for traffic management purposes, and that during the period of
storage the confidentiality remains guaranteed. Where this is necessary for
making more efficient the onward transmission of any publicly accessible
information to other recipients of the service upon their request, this
Directive should not prevent such information from being further stored,
provided that this information would in any case be accessible to the public
without restriction and that any data referring to the individual subscribers
or users requesting such information are erased.

As long as we talk about some defaults for retention and logging for the
purpose of carrying out the communication, we shouldn't prevent logging. I
think our task goes beyond that. We MAY give some hint when we believe those logs are
not necessary anymore.

So while writing logs is collection of data, we may declare normal web logs
out of scope as long as they do not serve to build profiles and as long as they
have some expiry set. (One may be as scared about logs that last forever as
I would be scared about profile creation.)

Consequently, a third party that is not in an outsourcing context may not
collect data beyond normal web logs and should anonymize or erase those logs
after 60 days (just to throw in some arbitrary count). This would be my
suggestion.

Best,

Rigo
Received on Sunday, 12 February 2012 21:23:45 UTC
