
RE: ISSUE-5: definition of tracking

From: Shane Wiley <wileys@yahoo-inc.com>
Date: Wed, 5 Sep 2012 09:08:17 -0700
To: David Singer <singer@apple.com>, W3 Tracking <public-tracking@w3.org>
Message-ID: <63294A1959410048A33AEE161379C802620717954C@SP2-EX07VS02.ds.corp.yahoo.com>

David,

I believe you spotlight the continued tension between limits on use and limits on collection/retention.  Our proposal moves the conversation to limits on use as the most logical and reasonable approach to real-world implementation.

"I also struggle with the question of log files.  They are, as you say, essentially ubiquitous, but if we mark them as out of scope, then we're basically saying that it's fine to keep the raw ingredients for something problematic, just not OK to make them into that problematic something, and I have a very hard time working out how to define that boundary -- but I think it would be great if we could succeed.  At the moment, we write a 'permission' for raw log files with limits on what you can do with them, which is less elegant but may be easier."

I believe the Permitted Uses structure already addresses raw log files, so I'm not sure why this would need to be addressed in the definition of tracking - unless you envision a use that isn't already addressed?

- Shane

-----Original Message-----
From: David Singer [mailto:singer@apple.com] 
Sent: Wednesday, September 05, 2012 8:45 AM
To: W3 Tracking
Subject: Re: ISSUE-5: definition of tracking


On Sep 5, 2012, at 1:29 , Roy T. Fielding <fielding@gbiv.com> wrote:

> On Sep 4, 2012, at 4:16 PM, David Singer wrote:
>> On Sep 4, 2012, at 15:20 , "Roy T. Fielding" <fielding@gbiv.com> wrote:
>> 
>>> On Sep 4, 2012, at 10:07 AM, Aleecia M. McDonald wrote:
>>> 
>>>> 	(c) Buried in this discussion (of "absolutely not tracking") was David Singer's attempt to define tracking: "Tracking is the retention or use, after a transaction is complete, of data records that are, or can be, associated with a single user." (I'd append: ", user agent, or device.")   Unlike every other time someone has made the attempt, the one and only reply was in support. Does that mean we can live with this? [Note that issue-5 is currently raised]
>>> 
>>> Probably not.  It does us very little good to define tracking such 
>>> that it encompasses all access logs, since they are essential to any 
>>> site that isn't deliberately acting as an open gateway.
>>> Are we agreed to that at least?
>> 
>> Actually, I was trying for a definition which clearly *excluded* data that was *out* of scope, and then discussed -- via permissions, and exceptions and so on -- uses that fall into the scope and need discussion.
> 
> Access logs involve the retention of IP addresses, request targets, 
> and other request attributes long after a transaction is complete.
> If keeping an access log is considered tracking, then almost all 
> servers on the Web track (the exceptions being a few privacy-masking portals).

One of the permissions is precisely the keeping of access logs.

> I don't believe that defining tracking such that almost every server 
> on the Web is non-compliant (and will remain non-compliant)

You must be reading a different definition;  nowhere did I write "and those who keep such data are non-compliant". Rather, as I say, the definition is constructed so as to narrow the scope;  if what you are doing falls *outside* this scope (for example, real-time transactional data, or data that is recorded that doesn't associate to a single person, such as cumulative visit counts), then you can stop reading.

> is a viable
> choice if we think deployment of the protocol is desirable, nor do I 
> think it matches user expectations about "do not track", so I'd like 
> to have a definition that matches whatever it is that the user wants 
> us to stop doing when they send DNT:1.

Sure. Agreed. I think it has to include some element of keeping data about people (not just use, it's about collection as well).

>>> A variation on David's definition would be:
>>> 
>>> Tracking is the retention or sharing of data collected from an 
>>> interaction to associate that interaction with a specific user (or 
>>> their personal user agent or device) and use that association to 
>>> obtain, collect, or correlate that user's behavior beyond the scope 
>>> of a single session.
>> 
>> That's not the only (or even possibly primary) use that worries people, in my understanding.
> 
> I am trying to define tracking, not their worries.  If folks can talk 
> about what the above does not cover, then we can look for some wording 
> that plugs the gaps.  Or we can start with any of the four other 
> definitions that I have proposed.  Or some new definition, if someone 
> gets an inspiration.

OK, maybe I am mis-reading what you wrote, as "obtain, collect...user's behavior" reads at best a little oddly.

Interestingly, you drop the hard line I had between data used in the transaction, and data kept after it is complete.  Or is that covered by "the scope of a single session"?  If so, is 'session' reasonably well-defined, or could a site claim that a session starts when you are born and ends when you die?

I also struggle with the question of log files.  They are, as you say, essentially ubiquitous, but if we mark them as out of scope, then we're basically saying that it's fine to keep the raw ingredients for something problematic, just not OK to make them into that problematic something, and I have a very hard time working out how to define that boundary -- but I think it would be great if we could succeed.  At the moment, we write a 'permission' for raw log files with limits on what you can do with them, which is less elegant but may be easier.



David Singer
Multimedia and Software Standards, Apple Inc.
Received on Wednesday, 5 September 2012 16:09:06 UTC
