Re: [ISSUE-5] What is the definition of tracking? from Roy T. Fielding on 2012-03-04 (public-tracking@w3.org from March 2012)

From: Roy T. Fielding <fielding@gbiv.com>
Date: Sun, 4 Mar 2012 15:36:02 -0800
To: Tracking Protection Working Group WG <public-tracking@w3.org>
Message-Id: <3AC6F516-4D0F-409C-A433-506945730D66@gbiv.com>
Color me frustrated.  The definition of tracking provided in the
Compliance document does not distinguish tracking from any ordinary
request to a third-party site made while rendering a page, nor does
it reflect what a common user would expect that term to mean, nor
does it reflect any of the regulatory descriptions of the term.

Here is the current definition:
=========
  3.7 Tracking

  Tracking is the collection or use of user data via either a
  unique identifier or a correlated set of data points being
  used to approximate a unique identifier, in a context other
  than "first party" as defined in this document. This includes:

   • a party collecting data across multiple websites,
     even if it is a first party in one or more (but not all)
     of the multiple contexts

   • a third party collecting data on a given website

   • a first party sharing user data collected from a DNT-on
     user with third parties "after the fact".

  Examples of tracking use cases include:

   • personalized advertising
   • cross-site analytics or market research that has not been de-identified
   • automatic preference sharing by social applications

=========

The WG needs a definition that applies only to the act of tracking,
since otherwise the entire Web (every image, CDN, stylesheet, etc.)
is a false positive.  The WG needs a definition that is specific and
consistent with user expectations, since otherwise "allow tracking"
fails as a mechanism for consent.

Here is my proposed replacement text:

=========

Tracking is defined as following or identifying a user, user agent,
or device across multiple visits to a site (time) or across multiple
sites (space).

Mechanisms for performing tracking include but are not limited to:
• assigning a unique identifier to the user, user agent, or device
  such that it will be conveyed back to the server on future visits;
• personalizing references or referral information such that they will
  convey the user, user agent, or device identity to other sites;
• correlating data provided in the request with identifying data
  collected from past requests or obtained from a third party; or,
• combining data provided in the request with de-identified data
  collected or obtained from past requests in order to re-identify
  that data or otherwise associate it with the user, user agent,
  or device.

A preference of "Do Not Track" means that the user does not want
tracking to be engaged for this request, including any mechanism
for performing tracking, any use of data retained from prior tracking,
and any retention or sharing of data from this request for the purpose
of future tracking, beyond what is necessary to enable:
 1) the limited exemptions defined in section XX;
 2) the first party (and third parties acting as the first party)
    to provide the service intentionally requested by the user; and
 3) other services for which the user has provided prior,
    specific, and informed consent.

=========
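
The three carve-outs in the proposed "Do Not Track" preference can be
read as a simple decision rule.  The following Python is a minimal
sketch of that reading and is not part of the proposed text.  The
TrackingUse fields, the dnt_rules_out function, and the exemption
name "example-exemption" are all hypothetical; the real list of
limited exemptions is whatever section XX ends up defining.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingUse:
    """One proposed use of tracking for a request (all fields hypothetical)."""
    dnt_on: bool                    # request carries a "Do Not Track" preference
    purpose: str                    # illustrative label for why tracking is wanted
    serves_requested_service: bool  # first party (or its agent) doing what the user asked
    has_prior_consent: bool         # prior, specific, and informed consent


def dnt_rules_out(use: TrackingUse, limited_exemptions: frozenset) -> bool:
    """Return True if, under this reading, the preference rules out the use."""
    if not use.dnt_on:
        # Assumption: without the preference, this definition says nothing.
        return False
    if use.purpose in limited_exemptions:   # carve-out 1: limited exemptions
        return False
    if use.serves_requested_service:        # carve-out 2: requested service
        return False
    if use.has_prior_consent:               # carve-out 3: prior consent
        return False
    return True


# Hypothetical exemption list and example uses.
exemptions = frozenset({"example-exemption"})
ad = TrackingUse(True, "personalized-advertising", False, False)
survey = TrackingUse(True, "paid-audience-survey", False, True)
print(dnt_rules_out(ad, exemptions))      # True
print(dnt_rules_out(survey, exemptions))  # False
```

The point of the sketch is only that the rule is a short chain of
exceptions to a default of "no tracking", which is what lets the
exemption list stay narrow.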

I believe this new definition of tracking and the corresponding
definition of "Do Not Track" will allow us to move beyond the
arguments over broad exemptions and instead focus on transparency
and individual control.  It allows the user to clearly state that
they don't want tracking outside the first-party context and
don't want any of the data retention/sharing effects of tracking.

The tracking status resource can convey exactly what tracking is
performed by a site, if any, for a given resource and DNT value,
including what limited exemptions are applicable.  Users (through
user agent choice or configuration) can decide what services to use,
or avoid, based on that transparency and not just a single on/off bit.
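
As an illustration of deciding on more than a single on/off bit, here
is a hedged sketch of a policy check a user agent might apply.  The
TrackingStatus shape, its field names, the exemption labels, and the
acceptable function are assumptions made for illustration; this
message does not define the format of the tracking status resource.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrackingStatus:
    """Hypothetical summary of what a site reports for one resource and DNT value."""
    tracks: bool                                  # whether any tracking is performed
    exemptions_claimed: frozenset = frozenset()   # limited exemptions the site relies on


def acceptable(status: TrackingStatus, exemptions_user_accepts: frozenset) -> bool:
    """User-configured policy: allow no tracking, or tracking that is
    covered entirely by exemptions the user is willing to accept."""
    if not status.tracks:
        return True
    # Tracking with no claimed exemption is never acceptable under this policy.
    if not status.exemptions_claimed:
        return False
    return status.exemptions_claimed <= exemptions_user_accepts


# Hypothetical configuration: the user accepts only "exemption-a".
accepted = frozenset({"exemption-a"})
print(acceptable(TrackingStatus(tracks=False), accepted))
print(acceptable(TrackingStatus(True, frozenset({"exemption-a"})), accepted))
print(acceptable(TrackingStatus(True, frozenset({"exemption-b"})), accepted))
```

A different user agent could ship a stricter or looser policy with
the same inputs, which is the sense in which the choice moves to the
user.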

It separates the act of tracking from the mechanisms for doing
tracking and the kinds of data retained from tracking.  The former
is far easier to define in general, and the latter two will change
over time as technologies change.
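
To illustrate why the act is easier to pin down than the mechanisms,
the proposed definition can be sketched as a test over observations
that says nothing about how an identity was obtained (cookie,
personalized referral, correlation, or re-identification).  The
Observation record, its visit counter, and the tracked_identities
function are hypothetical names used only for this sketch.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple


class Observation(NamedTuple):
    identity: str  # whatever stands for the user, user agent, or device
    site: str
    visit: int     # hypothetical per-site visit counter


def tracked_identities(observations: Iterable[Observation]) -> dict:
    """Map each identity to the dimensions it was followed across:
    "time" (multiple visits to a site) and/or "space" (multiple sites)."""
    sites = defaultdict(set)    # identity -> sites seen
    visits = defaultdict(set)   # (identity, site) -> visits seen
    for obs in observations:
        sites[obs.identity].add(obs.site)
        visits[(obs.identity, obs.site)].add(obs.visit)

    result = {}
    for identity, seen_sites in sites.items():
        dims = set()
        if any(len(visits[(identity, s)]) > 1 for s in seen_sites):
            dims.add("time")
        if len(seen_sites) > 1:
            dims.add("space")
        if dims:
            result[identity] = dims
    return result


log = [
    Observation("a", "x.example", 1),
    Observation("a", "x.example", 2),   # same site, second visit
    Observation("b", "x.example", 1),
    Observation("b", "y.example", 1),   # second site
    Observation("c", "x.example", 1),   # seen once only
]
print(tracked_identities(log))
```

New mechanisms would change how the identity column gets filled in,
but not the test itself.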

It allows a first-party service (including its outsourced
contractors) to perform the service intentionally requested
by the user, which may include personalization, analytics,
or social networking as appropriate for that service, since
otherwise a DNT-enabled user would be constantly interrupted
by consent dialogs just to do what they had already requested.
A first party might change its service upon receipt of DNT,
such as by disabling social networking features, but that is
presumed to be governed by the nature of the first-party
service and the privacy options configured directly with
that first party.

It also recognizes that the user can provide prior consent
for some services that will override the DNT signal, via
mechanisms outside the scope of this standard, such as
for paid audience survey tracking or content-by-subscription.
Such an override, if active for the user, would be reflected
in the tracking status response.

I would like to see this new text as at least an option in
the upcoming compliance WD.  Also, IMO, the definitions of
user, user agent, device, and tracking should be moved up to
the start of the first section, or the detailed explanation
of things like "first-party" moved into a later section, so
that the details don't overwhelm the purpose of this document.


Cheers,

Roy T. Fielding                     <http://roy.gbiv.com/>
Principal Scientist, Adobe Systems  <http://adobe.com/enterprise>
Received on Sunday, 4 March 2012 23:36:25 UTC