- From: Roy T. Fielding <fielding@gbiv.com>
- Date: Wed, 23 May 2012 18:42:11 -0700
- To: Bjoern Hoehrmann <derhoermi@gmx.net>
- Cc: "public-tracking@w3.org Group WG" <public-tracking@w3.org>
On May 23, 2012, at 3:40 PM, Bjoern Hoehrmann wrote:

> * Roy T. Fielding wrote:
>> I think we all should understand that collection implies gathering
>> together and at least some form of retention. The above joke by
>> Steven Wright depends on the audience knowing that. We can collect
>> seashells by taking them off the beach, not by merely walking by them.
>> We can collect photos of seashells by taking each one's picture
>> and retaining that picture, not by snapping the shot and then
>> deleting it from memory.
>
> I do agree there is an element of "retention" in "collection", but your
> interpretation seems to imply that you can do certain things with data
> even though you have never collected that data, and I think some people
> would find that contradictory.

Well, they'd be wrong. Just consider how much data is processed in
HTTP header fields without anyone even bothering to log it. Use does
not imply collection.

> The joke depends on the idea that all the seashells on the beaches on
> the planet have come under the control of Steven Wright at some point
> who then put them roughly where they are today. That is a surprising
> idea if you usually assume that no human being could or would do that.

I didn't mean to imply that he has to move the shells in order to
collect them -- people have art collections that are physically located
in museums around the world. The joke is a little more subtle than that.
Perhaps we shouldn't be mixing nouns with adjectives.

> If you are at some beach and pick up a seashell and then throw it at a
> specific location, are you collecting seashells in that place?

It depends on what time scale we are talking about and how long that
place (or that shell) remains under your control.

> What if you throw them across a border you cannot cross?

If the border is a black hole, no. If someone on the other side has
asked you to toss them into their control, then yes.

> What if you throw the shells into a bucket filled with hydrochloric acid?

No.

> What if you are a magician, ask people to give you seashells, put them
> somewhere, and if people look at where you apparently put them, they
> are not there, so,
> did you actually collect the shells?

Magicians don't need exemptions.

> Your analogies suffer from a number of problems, if you walk past the
> seashells they do not actually come under your control.

Yes, they do come under my control (and within my control, as phrased
by the current document). I grew up on a beach, so I am quite familiar
with the concept.

> And photos of them, well, you are presenting a white box example. When a stranger
> follows you around taking photos of you, you might worry that they are
> collecting photos of you, and would still do so if you confront them
> and they say, oh, the camera deletes all the pictures from memory.

Yes, you might worry that they are collecting them, which is why you
confront them and delete the pictures from memory -- to assuage that
data collection worry. It doesn't imply that the person is performing
data collection on you -- they might very well be taking non-identifiable
pictures of a bug that landed on your backside. I'd still worry, but
for other reasons.

> When you visit a web page and a script on it determines the resolution
> of your screen or your timezone or whatever, and sends that information
> to some server, I would say someone or something is collecting that
> information,

Yes, the script is collecting it from the scripting environment and
then passing the data to another outside that environment.

> even if it does not last long on the server.

If the server only uses it in responding to the request in which it
appears, then the server has not collected the data. It has certainly
used it.
The owner of the web page is still responsible for the data
collection by the script, and whatever happens as a result of sharing
the data, but that's independent of how we describe what happens once
the data reaches the server.

> They gather it in one place, on that server. I would think if some web service says it
> does not collect information on user's screen resolution, but a script
> quite obviously obtains such information from the browser and sends it
> back to the service, people would feel misled.

Some people might not understand that collection, in that context, would
imply the server will save the user screen resolution for later use.
There is no additional privacy implication to using data that has
already been provided by the user. There is a privacy concern for obtaining
data from the user's device that it has not already agreed to share,
so it is the script's data collection that matters; how that might
be successfully communicated to a user is far more complicated than
just "data collection".

I can collect flowers for myself. I can collect flowers and then
give them to someone else. If I just pick flowers and throw them away
while walking down the street, I have not collected them.

> (Consider the same point for information that is not usually used to
> adapt content, like which web pages you have recently visited or which
> fonts you have installed; would it be wrong to accuse someone that they
> are "collecting" this information if their web pages obtain this and
> also transmit it back to the server, if there is no particular reason
> to do so for content adaption purposes?)

Umm, both of those are used to adapt content. If that information is
sent out of the private context of the browser and to the server, then
it has been collected. If it is merely used on the client to select
from various alternatives, it has not been collected, though care must
be taken not to expose the information accidentally in later requests.

Likewise, a cookie set by a server and then received by that same
server is not data collection -- it's just use. Correlating browser
activity over time is data collection, whether or not it is made easier
by use of a unique identifier. Retention of activity across multiple
websites by virtue of a shared identifier is also data collection
(of the shared activity).

I think all of that is covered by my definition. Maybe you could
respond to that instead of just responding to the analogy.

....Roy
Received on Thursday, 24 May 2012 01:42:37 UTC