Re: ISSUE-5: Consensus definition of "tracking" for the intro?

Hi Mike, everyone,


Now that I think I understand Roy's problem with the definition that has been floating around for a year or two, I think we have a simpler fix.

Roy's complaint in <http://lists.w3.org/Archives/Public/public-tracking/2013Oct/0219.html> is that in this definition

   Tracking is the retention or use, after a network interaction is complete, of data that are, or can be, associated with a specific user, user agent, or device.

or in similar ones, *first-party* data collection is essentially within the definition of tracking.  Now, we do indeed put limits on first parties in the text of the document on what they can do with tracking data (they can't share tracking data with third parties who wouldn't have been allowed to get it for themselves, for example), so formally this is right (we shouldn't be expressing limits on data we don't have in scope).  Still, I think that we're better off addressing Roy's confusion by inserting the first-party exception into the definition (even though, as I say, there are nonetheless rules for the first party on what it does with the tracking data it gathers):

    "Tracking is the retention or use by a site outside the first party, after a network transaction is complete, of data that is, or can be, associated with a specific user, user agent, or device."

[I'm sorry, data is for me a mass, singular, noun.]

Again, I am relying on the (correct) definition of a network transaction being an HTTP request and its response, or the equivalent in other protocols.


The other feature of the alternative definitions that I attempted to address before is the phrase 'multiple parties', or similar.  But my attempts, both a while ago and recently, to get clarity on what's meant here have not worked.  If it means associating the user with another party (other than the one doing the retaining), then retaining all sorts of data is suddenly not tracking:
* remembering what the user is interested in, based on where they go (not remembering where they go)
* remembering other data that can be collected or inferred from where they go (e.g. that this is a site only visited by members of a given profession, or by adults, or...)
* remembering other data that can be collected or inferred from the transaction (e.g. IP address -> geographic location -> time of day in that location)

I think it leaves far too much data on the table, not considered 'tracking'.




I think the definition I propose above is clear enough, provides a 'litmus test' for us, for users, and for sites ('is X tracking Y if they do Z?'), is close to what Rob wrote, and (with the exception of the confusing 'multiple parties') is close to the other offerings also.



I hope we can reach consensus on this definition and move ahead.  It's surely in accord with our 'gut feeling' that tracking is the recording of information about us by actors we don't choose to interact with and for the most part are not even aware of.


    "Tracking is the retention or use by a site outside the first party, after a network transaction is complete, of data that is, or can be, associated with a specific user, user agent, or device."
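To make the 'litmus test' concrete, here is a rough sketch of the definition above as a predicate. It is only an illustration of how I read the wording, not proposed spec text; the names `DataUse`, `is_tracking`, and the field names are all made up for this example, and it assumes we can already tell who the first party is for a given transaction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataUse:
    """One instance of a party retaining or using data (hypothetical model)."""
    party: str               # who retains or uses the data
    first_party: str         # the first party for the network transaction
    after_transaction: bool  # retained/used after the request and response are complete
    linkable: bool           # is, or can be, associated with a specific user, user agent, or device


def is_tracking(use: DataUse) -> bool:
    """Litmus test: all three conditions of the definition must hold."""
    outside_first_party = use.party != use.first_party
    return outside_first_party and use.after_transaction and use.linkable


# A third party keeping a linkable visit record after the transaction: tracking.
third_party_log = DataUse("ads.example", "news.example", True, True)

# The first party keeping the same record: outside the definition
# (though still subject to the separate first-party rules).
first_party_log = DataUse("news.example", "news.example", True, True)

# A third party keeping only data that cannot be linked to a user: not tracking.
aggregate_count = DataUse("ads.example", "news.example", True, False)
```

Under this reading, Mike's Guns & Ammo example (a third party retaining the category and time of a visit against a unique identifier) comes out as tracking, since nothing in the test depends on 'multiple parties'.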



On Oct 15, 2013, at 5:08 , Mike O'Neill <michael.oneill@baycloud.com> wrote:

> In an off-list discussion about the compromise proposal David flagged that it could be read to allow third-party retention of data derived from the act of visiting a first-party site, e.g. what category of site, time of visit etc. So a server receiving a third-party request could retain data about the first-party resource, say a Guns & Ammo site and when it was visited, and associate that with a unique user identifier to build up an activity history.
>  
> I have tried to come up with a definition that encompasses the first-party alleviation which does not end up over-complicating what should be a simple definition.
>  
> I now have this, though I still don’t think it is as clear as the current draft or option 4 (which I prefer) with the party qualification laid out in the compliance section.
>  
> A transport request is any request for content delivery over the public internet.
> Transport data is data in, or derived from, a given transport request.
> Tracking data is a subset of transport data which can be used to recognise subsequent transport requests to be from the same user,  user agent, or device.
> Tracking is the sharing of tracking data with a third-party, or its retention by a party other than the first-party.
>  
> Mike
>  
>  

David Singer
Multimedia and Software Standards, Apple Inc.

Received on Tuesday, 15 October 2013 21:31:13 UTC