Re: meaning of DNT 1 and DNT 0 when sent by user agents [ISSUE-78]

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Wed, 18 Jan 2012 04:56:46 +0100
To: David Wainberg <dwainberg@appnexus.com>
Cc: David Singer <singer@apple.com>, "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-ID: <s44ch7dv1h3e1i3800981pg0drk1jqlh0j@hive.bjoern.hoehrmann.de>
* David Wainberg wrote:
>One might then think that the 1st/3rd party distinction and "cross-site" 
>are equivalent. But I would argue they're not, for at least the 
>following. First, defining cross-site tracking is closer to the problem 
>we're trying to solve, and that's generally a good thing. By tailoring 
>our definitions to the actual problems we are trying to solve, we reduce 
>the risk of being overinclusive, creating ambiguity, or creating 
>unintended consequences.

There are two ideas here. One is that rules for very large systems often
do not work well when they are applied to very small systems. There are
rules for how we understand planets to move, but they do not work well
to explain the movement of electrons. For artworks we have rules related
to making copies and creating derivative works that we understand well
in the large, but applying them to the internal workings of a computer
does not work well (when you play back a music stream and your computer
loads the whole stream into memory, then swaps that to disk, and then
loads it into memory again, have copyright-relevant copies been made?).
More relevant to the group would be storage: if there were a rule "must
not store", then one might point out that computers typically store and
retain information about recent data flows in transient memory, and ask
whether that counts as storage.

The other idea is "Chinese whispers": signals degrade over distance, so
there is a proximity measure. There are two dimensions: one is proximity
to "individuals" (you are, someone in your household is, someone in
your neighbourhood or social circle is, there exist people who are) and
the other is proximity to disclosure and observation contexts (I tell
you, you tell friends, they tell their friends, one of them tells the
media; data is retained for seconds, weeks, decades). Revealing signals
should degrade quickly as they travel beyond the disclosure context. I
tell you about me, you tell your friends about "a friend", they tell
others about "someone". Someone saying "do not track me" wants signals
to degrade more quickly than someone saying "track me, please".

So one way I look at it is that the Working Group does not want to say
how the memory manager in the operating system of a front-end caching
proxy that serves some content to a user is supposed to work. It does
not want to say whether a search engine that wants to show users the
things they most recently searched for can store this information on
the server or has to store it client-side, or whether the New York Times
can store all bits of the IP address someone in Germany used to request
an article by the New York Times served from servers the New York Times
controls, when the details of such transactions stay with the New York
Times Company at all times. Settling such questions would be complicated
and take a lot of time, with probably bad chances for consensus. But
there is a point where the Working Group wants to fix expected behavior
for dnt compliance.

The two-dimensional distance measure in this model is very complicated.
The Working Group could not adopt a data-based approach and say, as an
example, for data indicative of sexual preferences This, but for data
indicative of personal finances That, as societies differ a lot on how
they treat such information; at least not in a consensus-based setting.
Similarly, it can't say that moving data between YouTube, LLC and
Google, Inc. is, say, twice as much distance as moving data between
"Flickr" and Yahoo!, Inc.

But the Working Group can say that a planet is bigger than an electron
even when it is not clear whether a proton is bigger than an electron.
The distinction between 1st party and 3rd party is like that, and so is
a distinction between cross-site and same-site. If you assume that users
have some intuitive notion of how much is too much ("I visit many sites
that discuss gardening and get many ads for gardening tools, and that is
okay, but I also visit many sites that advocate that Pluto is not a
Planet and am denied entrance to countries that consider Pluto a Planet
and that is not okay") then it becomes very unclear how cross-party and
cross-site are different concepts. Signals ought to degrade over
distance; 1st versus 3rd party is a great distance, cross-site versus
same-site is a great distance, and in the small, distance is difficult
to measure, so it seems difficult to tell the two apart.

>Some of the benefits:
>- Relies simply on a clear definition of the data collection and use 
>practices DNT is concerned with, rather than a multi-step process of 
>determining party status and then covered collection and use.
>- Removes the step of determining 1st vs 3rd party status in any given 
>circumstance, and then possibly having separate compliance paths for each.
>- Saves us from defining 1st vs 3rd parties, and thus eliminates having 
>to deal with edge cases like widgets and URL shorteners.
>- Solves the 3rd party as agent problem: if it's not cross-site, it's 
>not covered.

I think the Working Group needs to consider what the "dnt signal" should
mean for dnt-compliant "widgets" or "redirection services" and then come
up with suitable definitions and requirements. I do not understand how
some definition or other would eliminate the need to figure out how such
a situation should be considered from a dnt-compliance perspective. The
Working Group needs to "deal with" this kind of thing, regardless of how
it defines anything, if it means to define dnt-compliance in such
contexts.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Wednesday, 18 January 2012 03:57:07 UTC
