Re: ISSUE-16, ACTION-166: define (data) collection

From: Bjoern Hoehrmann <derhoermi@gmx.net>
Date: Fri, 25 May 2012 03:57:42 +0200
To: "Roy T. Fielding" <fielding@gbiv.com>
Cc: "public-tracking@w3.org Group WG" <public-tracking@w3.org>
Message-ID: <gljtr7p2vnojmclkchcov2u69pt4ao6623@hive.bjoern.hoehrmann.de>
* Roy T. Fielding wrote:
>On May 23, 2012, at 3:40 PM, Bjoern Hoehrmann wrote:
>> When you visit a web page and a script on it determines the resolution
>> of your screen or your timezone or whatever, and sends that information
>> to some server, I would say someone or something is collecting that in-
>> formation,
>
>Yes, the script is collecting it from the scripting environment
>and then passing the data to another outside that environment.

There is no retention in the example above. I see taking the timezone
information the same as taking a photograph, and sending that on to a
server the same as making the browser send a cookie, but I understood
that there needs to be some form of "retention" for "collection". I'd
read your proposal

   "Data collection" (for the purpose of DNT) is the process of
    assembling data from or about one or more network interactions
    and retaining/sharing that data beyond the scope of responding
    to the current request or in a form that remains linkable to a
    specific user, user agent, or device.

as saying that my example is not data collection unless you add more
constraints, such as the server associating the screen resolution
information with other data and keeping it in that form for a long
time. I assume that "network interaction" includes JavaScript code
accessing whatever APIs the browser offers (if HTTP were properly
bi-directional and the server could tell the client in a standard way
"please tell me your time zone" or something like that, there would be
no difference). Referring to "the current request" has a similar
problem: one request might take many interactions if you "upgraded" an
HTTP connection to a WebSocket connection, and some might say loading
one page is the request, and all requests to load images and so on are
sub-requests to "the" request.

I don't think the script is collecting the information by accessing it;
so long as it stays in the browser, no other party gains control of it.
By sending the information to some server, some other party does gain
control of it, but if the data is deleted from the server immediately
and not used in any way other than skipping over it as part of
processing the request, I would have expected you to say there is no
collection at all. That seemed to be the main difference between the
"control" and "retention" formulations.
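
To make the three stages concrete (reading a value, sending it, and
holding on to it), here is a minimal sketch in Python. It only models
where the data ends up; it is not any real browser or server API. The
names read_client_info, Server, retain and store are hypothetical, and
the screen and timezone values are made up.

```python
from dataclasses import dataclass, field


def read_client_info() -> dict:
    # Stand-in for a script reading values from the scripting environment.
    # Nothing leaves the "browser" at this point. Values are made up.
    return {"screen": "1920x1080", "timezone": "Europe/Berlin"}


@dataclass
class Server:
    """Hypothetical server; `store` stands in for any long-lived storage."""
    retain: bool
    store: dict = field(default_factory=dict)

    def handle(self, user_id: str, payload: dict) -> str:
        # The payload is always read as part of processing the request.
        response = f"ok ({len(payload)} fields)"
        if self.retain:
            # Kept beyond the request and keyed to a specific user,
            # i.e. in a form that remains linkable.
            self.store.setdefault(user_id, []).append(dict(payload))
        # Without retention, the payload is dropped when handle() returns.
        return response


info = read_client_info()            # stage 1: accessed, not sent anywhere
transient = Server(retain=False)     # stage 2 only: sent, processed, dropped
retaining = Server(retain=True)      # stage 3: sent and kept, linked to a user
transient.handle("user-1", info)
retaining.handle("user-1", info)
```

Under the "retention" reading, only the second server would be
collecting anything; under the "control" reading, both servers come
into control of the data as soon as it is sent.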

If the Working Group decides not to change the definition here, could
your concerns also be addressed by going through the individual
requirements that refer to "data collection" and making sure they do
not apply if a party does not hold on to the data, as appropriate?
Similarly, would it be okay to avoid the term "data collection"
entirely and use other terms and/or concepts? Either seems easier than
defining a term that, as you note, lacks a clear definition when used
in contexts other than DNT.
-- 
Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/ 
Received on Friday, 25 May 2012 01:58:12 UTC
