- From: James Pitkow <pitkow@cc.gatech.edu>
- Date: Wed, 19 Jul 1995 18:35:36 -0400 (EDT)
- To: connolly@beach.w3.org (Daniel W. Connolly)
- Cc: www-talk@www10.w3.org
Hello,

Dan wrote:
> In message <199507182127.RAA27079@beach.w3.org>, Roy Fielding writes:
> >
> >>******* I. The Request-ID: header field:
> >
> > (it can indeed be used to identify individuals,
> > if the individuals are not sophisticated enough, or if the tracker
> > is persistent).
>
> Please demonstrate how this is done. No fair spreading Fear,
> Uncertainty, and Doubt.

Ok. Here's a business card that you require for site access:

    Dr. Jose Cuervo
    Principal, Ultramar
    1001 East Park Ave. Suite 253
    NN NY 10100-9540 USA
    Email: rksmith@ultramar.com
    Web: http://www.ultramar.com/

Part of the beauty of data is that it tells you:

    1) what it is
    2) what it is not
    3) its proximity to other data

For example, from the first line I can go through a log file, cluster
people as sessions, and tell you the following. The more data, the
greater the reliability of correct identification.

    which sessions are doctors
    which sessions are not
    which sessions may be like doctors (e.g. lawyers, brokers, etc.)
    the person's gender
    which sessions are not this person's gender, etc.
    the person's race
    which sessions are not like this person's race, etc.
    and even the person's education level
    which sessions are not like this person's education level, etc.

So, one bit of data can yield multiple inferences.

From the second line, I can tell you:

    Occupation (which I can match to a database of income by
    occupation to determine the income level of this individual
    and of those who are not like this individual, etc.)

    Company (which I can match to a database of companies and get
    oodles of information about this person and thus the rest of
    the people who visit my site.)

    ... (through all the other fields and then analysis of independence
    amongst the attributes and clusters, etc.)
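To make the mechanics concrete, here is a minimal sketch in Python of the kind of processing I mean: group log entries into sessions by a persistent identifier, then join each session against the business card data and an occupation-to-income table. Everything in it is made up for illustration: the log layout, the identifiers, the `CARDS` and `INCOME_BY_OCCUPATION` tables, and the function names are all assumptions, not any real server's log format or any real database.

```python
from collections import defaultdict

# Hypothetical log records: (request_id, url). The field layout is an
# assumption for illustration, not a real server log format.
LOG = [
    ("id-17", "/index.html"),
    ("id-42", "/products.html"),
    ("id-17", "/pricing.html"),
    ("id-42", "/contact.html"),
    ("id-99", "/index.html"),
]

# Hypothetical business card data keyed by the same identifier.
CARDS = {
    "id-17": {"title": "Dr.", "occupation": "Principal"},
    "id-42": {"title": "Mr.", "occupation": "Clerk"},
}

# Hypothetical occupation-to-income lookup table (made-up categories).
INCOME_BY_OCCUPATION = {"Principal": "high", "Clerk": "medium"}


def cluster_sessions(log):
    """Group requested URLs by identifier, preserving request order."""
    sessions = defaultdict(list)
    for request_id, url in log:
        sessions[request_id].append(url)
    return dict(sessions)


def infer_profile(card):
    """Derive attributes from one card; every inference is only a guess."""
    if card is None:
        # No card on file: nothing can be inferred for this session.
        return {"is_doctor": None, "income": None}
    return {
        "is_doctor": card["title"] == "Dr.",
        "income": INCOME_BY_OCCUPATION.get(card["occupation"]),
    }


def profile_sessions(log, cards):
    """Attach inferred attributes to every session found in the log."""
    sessions = cluster_sessions(log)
    return {
        request_id: {"urls": urls, **infer_profile(cards.get(request_id))}
        for request_id, urls in sessions.items()
    }


if __name__ == "__main__":
    for request_id, profile in profile_sessions(LOG, CARDS).items():
        print(request_id, profile)
```

The point of the sketch is only that the join is trivial once an identifier persists across requests. The inferences themselves (doctor or not, income bracket) are as crude here as they would be in practice with a single field, and they get sharper as more fields and more sites are joined in.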
Now, if you enable mechanisms that permit log files to contain ids
across sites AND you do not impose a policy to protect users, then the
information-enabling technology called the Web potentially becomes a
disabling technology. To me, the Web is about information exchange, not
information concealment, elitism, or "I'll give you this only if you
give me that."

Interestingly, it seems that companies on the Web are asking for more
information about the effectiveness of their advertising than they can
get now. When I buy a magazine off a newsstand, no one knows how long
I looked at the pages, what my name is, etc. Instead, companies make
their decisions based upon reliable estimates of subscription rates and
the demographics of those readers. How effective is an ad in Time?
Measure it empirically for me over every issue.

Regards,

Jim.
Received on Wednesday, 19 July 1995 19:11:34 UTC