
Re: issue-240 - further non normative text to clarify the definition of data collected "across multiple contexts"

From: Roy T. Fielding <fielding@gbiv.com>
Date: Thu, 9 Jan 2014 14:07:15 -0800
Cc: "'Justin Brookman'" <jbrookman@cdt.org>, "David Singer" <singer@apple.com>, <public-tracking@w3.org>
Message-Id: <43D1E075-C85B-4995-9E5C-71EB4FFB27FD@gbiv.com>
To: Mike O'Neill <michael.oneill@baycloud.com>

On Jan 8, 2014, at 5:38 PM, Mike O'Neill wrote:

> As discussed today, here is some non-normative text attempting to clarify the issue of data in one context being “tainted” by information collected in another. This is important because the definition of tracking now leaves out of scope data collected within a single context, i.e. by a data controller responsible for either a first-party or a third-party resource.

No, it does not leave out any such thing. It says what is tracking,
regardless of how that tracking occurred.

“Tracking is the collection of data regarding a particular user's activity across multiple distinct contexts and the retention, use, or sharing of data derived from that activity outside the context in which it occurred.” 

The problem is how you are misreading the first half and ignoring the second.

  (Tracking) is
  the collection of
  (data regarding a particular user's activity across multiple distinct contexts)
  and
  the retention, use, or sharing of
  (data derived from that activity)
  outside the context in which it occurred.

Note that the first half doesn't depend on any notion of when that data
was collected, nor by whom.  It doesn't matter how many interactions
might have been collected, nor how they were collected.  As soon as the
data set contains information tied to a particular user's activity in more
than one context, it becomes tracking data, and the act of retaining that
combined data set is tracking because that data set has to be outside the
context of at least one of those multiple distinct contexts.
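
As a rough illustration of that reading (a minimal sketch, not part of the
definition), the test can be expressed over the data set itself, ignoring
when or by whom each record was collected. The Record type and the function
name are hypothetical, and treating a context as an opaque label is an
assumption made only for the example.

```python
from collections import defaultdict
from typing import Iterable, NamedTuple, Optional, Set


class Record(NamedTuple):
    """One retained datum (hypothetical shape, for illustration only)."""
    user: Optional[str]  # None when the datum is not tied to a particular user
    context: str         # opaque label for the context the activity occurred in


def users_with_cross_context_data(records: Iterable[Record]) -> Set[str]:
    """Return users whose retained data covers more than one context.

    Only the contents of the data set matter here: no timestamps and no
    collector identity are consulted.
    """
    contexts_by_user = defaultdict(set)
    for record in records:
        if record.user is not None:
            contexts_by_user[record.user].add(record.context)
    return {user for user, ctxs in contexts_by_user.items() if len(ctxs) > 1}


data = [
    Record("u1", "site-a"),
    Record("u1", "site-b"),   # second context for the same user
    Record("u2", "site-a"),   # single context only
    Record(None, "site-b"),   # aggregate datum, no particular user
]
flagged = users_with_cross_context_data(data)
```

Here only "u1" is flagged, because that is the only user whose retained
records span more than one context.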

But that's not how you are reading the sentence.  You are assuming it says

  Tracking is the collection across multiple distinct contexts
  of data regarding a particular user's activity.

Those two sentences are not the same.  The definition isn't ambiguous because
"regarding" always takes precedence.

Whether or not referral data amounts to tracking depends on how it is
processed, what is retained, and for how long.

For example, most shopping sites will associate referral data with a
user for the length of a session in order to measure (and pay a bounty
for) conversions upon sale.  It is fair to say that they are tracking
the user for at least as long as they retain that association tied to
that particular user.  I suspect most compliance regimes would allow
that as a permitted use, but it is still tracking the user until that
association is removed (assuming that the referral data is about some
other context).

In contrast, an analytics product might take the referral data, only
record that a hit was received on page B from site A, and then discard
the remaining bits.  Since the retained form is not data regarding a
particular user, saving the mere count is not tracking the user under
our definition unless the count itself is unique to that user (e.g.,
the referral site is a personal URI).  That's why analytics software
often excludes URI components containing query data, or anything that
looks like UIDs, when retaining or reporting referrals.
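
The kind of reduction described above might be sketched as follows. This is
an illustration under stated assumptions, not how any particular analytics
product works: the function names, the UID_LIKE heuristic, and the choice to
truncate the path at the first identifier-like segment are all hypothetical.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical heuristic for path segments that "look like UIDs":
# a UUID, a long hex string, or a long run of digits.
UID_LIKE = re.compile(
    r"[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
    r"|[0-9a-fA-F]{16,}"
    r"|\d{6,}"
)


def scrub_referrer(referrer: str) -> str:
    """Reduce a referrer to host plus path, without user-specific parts.

    The query string, fragment, port, and userinfo are discarded, and the
    path is cut at the first segment that looks like a user identifier.
    """
    parts = urlsplit(referrer)
    if not parts.hostname:
        return "(unknown)"
    kept = []
    for segment in parts.path.split("/"):
        if UID_LIKE.fullmatch(segment):
            break  # drop this segment and everything after it
        kept.append(segment)
    return parts.hostname + ("/".join(kept) or "/")


def record_hit(counts: Counter, referrer: str, page: str) -> None:
    """Count a hit on `page` from the scrubbed referrer; retain nothing else."""
    counts[(scrub_referrer(referrer), page)] += 1


counts = Counter()
record_hit(counts, "https://siteA.example/articles/42?user=alice", "/B")
record_hit(counts, "https://siteA.example/articles/42?user=bob#top", "/B")
```

Both hits collapse into one counter keyed by the scrubbed referrer and the
page, so what is retained is a count rather than data regarding a particular
user. A pattern-based filter like this is only a heuristic; a personal URI
that does not match the pattern would still yield a count unique to one user.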

To be clear, the definition only describes what tracking is.  It does
not describe what tracking is allowed.

Received on Thursday, 9 January 2014 22:07:40 UTC
