Re: 3 Proposals: session ID, business-card auth, customer auth

From: James Pitkow <pitkow@cc.gatech.edu>
Date: Wed, 19 Jul 1995 18:35:36 -0400 (EDT)
Message-Id: <199507192235.SAA02218@hapeville.cc.gatech.edu>
To: connolly@beach.w3.org (Daniel W. Connolly)
Cc: www-talk@www10.w3.org


Dan wrote:
> In message <199507182127.RAA27079@beach.w3.org>, Roy Fielding writes:
> >
> >>******* I. The Request-ID: header field:
> >
> > (it can indeed be used to identify individuals,
> >   if the individuals are not sophisticated enough, or if the tracker
> >   is persistent).
> Please demonstrate how this is done. No fair spreading Fear,
> Uncertainty, and Doubt.

Ok.  Here's a business card that you require for site access:

Dr. Jose Cuervo
Principal, Ultramar

1001 East Park Ave. 
Suite 253
NN NY 10100-9540 USA

Email: rksmith@ultramar.com
Web: http://www.ultramar.com/

Part of the beauty of data is that it tells you:

1) what it is
2) what it is not
3) its proximity to other data

For example, from the first line alone I can go through a log file,
cluster people into sessions, and tell you the following.  The more data,
the more reliable the identification.

which sessions are doctors
which sessions are not
which sessions may be like doctors (e.g. lawyers, brokers, etc.)

the person's gender 
which sessions are not this person's gender, etc.

the person's race
which sessions are not like this person's race, etc.

and even the person's education level
which sessions are not like this person's education level, etc.

So, one bit of data can yield multiple inferences.
From the second line, I can tell you:

Occupation (which I can match against a database mapping occupations to
income to determine the income level of this individual and of those who
are not like this individual, etc.)
Company (which I can match against a database of companies to get oodles
of information about this person and thus about the rest of the people
who visit my site.)
... (through all the other fields and then analysis of independence amongst
the attributes and clusters, etc.)
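
A rough sketch of the kind of inference described above, in Python.
Everything in it is made up for illustration: the session IDs, the second
card, the log entries, and the occupation-to-income table are all
hypothetical, and a real analysis would use far more fields and data.

```python
from collections import defaultdict

# Hypothetical lookup table; the bands are placeholders, not real data.
INCOME_BY_OCCUPATION = {"Principal": "high", "Clerk": "medium"}

# Hypothetical business cards, keyed by the session ID they arrived with.
CARDS = {
    "s1": {"name": "Dr. Jose Cuervo", "title": "Principal", "company": "Ultramar"},
    "s2": {"name": "Pat Lee", "title": "Clerk", "company": "Acme"},
}

# Hypothetical access log: (session_id, requested path).
LOG = [("s1", "/index.html"), ("s2", "/index.html"), ("s1", "/prices.html")]


def infer_profile(card):
    """Derive attributes from the card fields alone."""
    return {
        "is_doctor": card["name"].startswith("Dr."),
        "income_band": INCOME_BY_OCCUPATION.get(card["title"], "unknown"),
        "company": card["company"],
    }


def profile_sessions(log, cards):
    """Group log entries by session and attach the inferred profile."""
    pages = defaultdict(list)
    for session_id, path in log:
        pages[session_id].append(path)
    return {
        sid: {**infer_profile(cards[sid]), "pages": paths}
        for sid, paths in pages.items()
        if sid in cards
    }


profiles = profile_sessions(LOG, CARDS)
```

Each entry in `profiles` now pairs what a session looked at with what its
card implies, which is all it takes to start sorting sessions into "like
this person" and "not like this person".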

Now, if you enable mechanisms that permit log files to contain IDs across
sites AND you do not impose a policy to protect users, then the
information-enabling technology called the Web potentially becomes a
disabling technology.  To me, the Web is about information exchange, not
information concealment, elitism, or "I'll give you this only if you give
me that."
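
To make the cross-site concern concrete, here is a minimal Python sketch.
It assumes (my assumption, not something the proposal states) that the
same Request-ID value shows up unchanged in the logs of more than one
site; the site names and log entries are hypothetical.

```python
# Hypothetical per-site logs: site name -> list of (request_id, path).
SITE_LOGS = {
    "a.example": [("id1", "/catalog"), ("id2", "/news")],
    "b.example": [("id1", "/clinic"), ("id3", "/jobs")],
}


def link_across_sites(site_logs):
    """Merge per-site logs into one trail per ID: id -> [(site, path), ...]."""
    trails = {}
    for site, entries in site_logs.items():
        for request_id, path in entries:
            trails.setdefault(request_id, []).append((site, path))
    return trails


def cross_site_ids(trails):
    """Return the IDs that were seen on more than one site."""
    return sorted(
        rid
        for rid, visits in trails.items()
        if len({site for site, _ in visits}) > 1
    )


trails = link_across_sites(SITE_LOGS)
linked = cross_site_ids(trails)
```

Once one site has tied an ID to a business card, every other site holding
that ID in its logs inherits the identification.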

Interestingly, it seems that companies on the Web are asking for more
information about the effectiveness of their advertising than they can
get now.  When I buy a magazine off a newsstand, no one knows how long
I looked at the pages, what my name is, etc.  Instead, companies make
their decisions based upon reliable estimates of subscription rates and
the demographics of those readers.  How effective is an ad in Time?
Measure it empirically for me over every issue.

Received on Wednesday, 19 July 1995 19:11:34 UTC
