Re: ISSUE-16, ACTION-166: define (data) collection

Hi Roy and Björn, 

Very nice definition and very informed discussion. I think we should not
overdo it. I think that retention and processing are key to collection.
Assuming collection without retention and processing is like punishing you
for eavesdropping because you heard your neighbor on the train shouting into
his mobile phone.

The browsers have their own responsibility for all the chatter. 

Roy, 

I wonder if you could live with the removal of "or in a form that remains
linkable to a specific user, user agent, or device." because this imports
the entire "what is identified and identifiable" question into the
definition of "collection".

I want to have the discussion about what is identifiable in my retained data
in a separate bucket outside of the collection issue. Could that work? I
don't mind the discussion, I just find this a confusing place to have it.

Rigo

On Friday 25 May 2012 03:57:42 Bjoern Hoehrmann wrote:
> * Roy T. Fielding wrote:
> >On May 23, 2012, at 3:40 PM, Bjoern Hoehrmann wrote:
> >> When you visit a web page and a script on it determines the resolution
> >> of your screen or your timezone or whatever, and sends that information
> >> to some server, I would say someone or something is collecting that
> >> information,
> >
> >Yes, the script is collecting it from the scripting environment
> >and then passing the data to another outside that environment.
> 
> There is no retention in the example above. I see taking the timezone
> information the same as taking a photograph, and sending that on to a
> server the same as making the browser send a cookie, but I understood
> that there needs to be some form of "retention" for "collection". I'd
> read your proposal
> 
>    "Data collection" (for the purpose of DNT) is the process of
>     assembling data from or about one or more network interactions
>     and retaining/sharing that data beyond the scope of responding
>     to the current request or in a form that remains linkable to a
>     specific user, user agent, or device.
> 
> as saying that my example is not data collection, unless you add more
> constraints, like that the server associates the screen resolution
> information with other data and keeps it in that form for a long time. I
> assume that "network interaction" includes JavaScript code accessing
> whatever APIs the browser offers (if HTTP was properly bi-directional
> and the server could tell the client in a standard way "please tell me
> your time zone" or something like that, there would be no difference;
> referring to "the current request" has a similar problem, one request
> might take many interactions if you "upgraded" an HTTP connection to a
> WebSocket connection; some might say loading one page is the request,
> and all requests to load images and so on are sub-requests to "the"
> request).
> 
> I don't think the script is collecting the information by accessing it;
> so long as it stays in the browser there is no other party that comes
> in control of it. By sending the information to some server, some other
> party is coming in control of it, but if the data is deleted from the
> server immediately and not used in any way other than skipping over it
> as part of processing the request, I would have expected that you say
> there is no collection at all. That seemed to be the main difference
> between the "control" and "retention" formulations.
> 
> If the Working Group decides not to change the definition here, could
> your concerns also be addressed by going through the individual
> requirements that refer to "data collection" and making sure they do
> not apply if a party does not hold on to the data, as appropriate?
> Similarly, would it be okay to avoid the term "data collection"
> entirely and use other terms and/or concepts? Either seems easier than
> defining a term that, as you note, lacks a clear definition when used
> in contexts other than DNT.

Received on Friday, 25 May 2012 16:45:26 UTC