Re: ISSUE-16, ACTION-166: define (data) collection from Roy T. Fielding on 2012-05-23 (public-tracking@w3.org from May 2012)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Wed, 23 May 2012 16:33:41 -0700
To: Sean Harvey <sharvey@google.com>
Cc: "public-tracking@w3.org Group WG" <public-tracking@w3.org>
Message-Id: <E144A664-2489-4AFF-83BB-58A52B111A12@gbiv.com>

On May 23, 2012, at 1:57 PM, Sean Harvey wrote:

> Thanks Roy I really appreciate your putting this together. Prior to this we had been working under an arbitrary nomenclature division that was idiosyncratic to this working group and its documents, with "collection" equaling touching the web server (unavoidable in all cases) and "retention" was a term more in line with your definition of "collection".
>
> I'm curious to understand what you view as the real world implications of this. Why in your view is it important that we define collection in this more traditional fashion? What problems could crop up if we kept the current definitions & nomenclature? Is this about misinterpretation by regulatory bodies? Are there other potential issues?

My primary concern is implementers misunderstanding what we
have required, implementing according to the common definitions,
and then being accused of noncompliance with the fine print
(or sued by opportunistic lawyers).

My secondary concern is that we have very long discussions in
this working group, all the time, where one group thinks we
have agreement on constraining receipt of data and another
group thinks we have agreement on constraining retention of data.
Hence, we have no consensus even when we appear to agree.

As an editor, I think it is embarrassing to tell readers that
we are using a common term in a way that differs from how it is
commonly understood. It is particularly confusing when major public
initiatives, like the WH privacy bill of rights, use the
"data collection" term repeatedly in the way I described and
then refer to our standards process as addressing that problem.

Finally, there are many techniques we can use in industry to
improve privacy for users while using unique identifiers for
fraud control, frequency capping, and shopping baskets.
If we pretend to forbid them entirely, then we lose the ability
to make precise and significant improvements to privacy by
constraining how they are retained, shared, or used.

In particular, I already outlined a solution to cross-site
frequency capping that uses the same technology as existing
frequency capping mechanisms (hence, proven reliable and
scalable) but does not retain the cookie identifier in a
form that could be reused to link user activity across sites
and/or campaigns. I am sure there are other user experience
problems with DNT:1 that can be solved in a similar way.
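
As a rough illustration only (this is not the solution outlined
earlier on the list, whose details are not given in this message),
one way to cap frequency without keeping a reusable identifier is
to store counters under a keyed hash scoped to a single campaign.
Everything named below (capping_key, FrequencyCap, the per-campaign
secret, the in-memory dict) is a hypothetical assumption made for
the sketch.

```python
import hashlib
import hmac


def capping_key(cookie_id: str, campaign_id: str, campaign_secret: bytes) -> str:
    """Derive a counter key that is only meaningful within one campaign.

    Assumption: each campaign has its own secret, so keys derived for
    different campaigns cannot be joined back to the same cookie.
    """
    msg = f"{campaign_id}:{cookie_id}".encode("utf-8")
    return hmac.new(campaign_secret, msg, hashlib.sha256).hexdigest()


class FrequencyCap:
    """In-memory impression counter keyed by the derived value only."""

    def __init__(self, campaign_id: str, campaign_secret: bytes, max_impressions: int):
        self.campaign_id = campaign_id
        self.campaign_secret = campaign_secret
        self.max_impressions = max_impressions
        self._counts = {}  # derived key -> impressions served so far

    def should_serve(self, cookie_id: str) -> bool:
        # The raw cookie identifier is used transiently and never stored.
        key = capping_key(cookie_id, self.campaign_id, self.campaign_secret)
        seen = self._counts.get(key, 0)
        if seen >= self.max_impressions:
            return False
        self._counts[key] = seen + 1
        return True
```

The counting logic is the same as ordinary cookie-keyed capping;
only the stored key changes. Whether this is sufficient would still
depend on how the secrets are managed and when counters are expired,
which the sketch does not address.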

I do not consider this a silver bullet, by any means.
While these techniques might improve compliance with EU data
protection laws, they probably won't satisfy the ePrivacy Directive,
unless the official interpretations change substantially,
since that directive isn't actually directed at data collection
(except by side effect while protecting the integrity of a
user's personal equipment). For example,

http://www.ico.gov.uk/for_organisations/privacy_and_electronic_communications/the_guide/cookies.aspx

http://translate.google.com/translate?sl=fr&tl=en&js=n&prev=_t&hl=en&ie=UTF-8&layout=2&eotf=1&u=http%3A%2F%2Fwww.cnil.fr%2Fen-savoir-plus%2Ffiches-pratiques%2Ffiche%2Farticle%2Fce-que-le-paquet-telecom-change-pour-les-cookies%2F

....Roy

Received on Wednesday, 23 May 2012 23:34:08 UTC