Re: Session tracking

Brian Behlendorf <brian@organic.com> writes:
> "Clickstreams" are the paths people take when they
> traverse your site - many content providers would find it
> useful to be able to detect common patterns or the
> effectiveness of various user interfaces.
>
> So, I'd like to propose for discussion a new HTTP header
> (hi Roy!) called "Session-ID".  This would be optional,
> of course, and it would change any time the browser is
> restarted (or when the user wished).

This is an excellent idea.  With Referer logging, you can already  
produce a "Markov model" for your Web site, giving transition  
probabilities between pages.  But it would be interesting to find  
out just how independent link choices really are; i.e., once a user  
gets to a page, how much does it matter where they came from?  To  
the extent that it matters, the Markov model is inaccurate.
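
A rough sketch of both estimates, in Python.  The function names and
the sample data are made up for illustration; the first function needs
only (Referer, requested page) pairs from a log, while the second
assumes per-session page paths, which is what something like a
Session-ID would make available.

```python
from collections import Counter, defaultdict


def normalize(counts):
    """Turn nested counts into conditional probabilities."""
    return {
        key: {page: n / sum(nexts.values()) for page, n in nexts.items()}
        for key, nexts in counts.items()
    }


def transition_probabilities(hits):
    """Estimate P(next | current) from (referer, requested page) pairs."""
    counts = defaultdict(Counter)
    for referer, page in hits:
        counts[referer][page] += 1
    return normalize(counts)


def conditional_on_previous(paths):
    """Estimate P(next | previous, current) from per-session page paths."""
    counts = defaultdict(Counter)
    for path in paths:
        for prev, cur, nxt in zip(path, path[1:], path[2:]):
            counts[(prev, cur)][nxt] += 1
    return normalize(counts)


# Hypothetical log entries: (Referer header, requested page).
hits = [
    ("/index.html", "/a.html"),
    ("/index.html", "/a.html"),
    ("/index.html", "/b.html"),
    ("/a.html", "/b.html"),
]
first_order = transition_probabilities(hits)

# Hypothetical session paths, one list of pages per session.
paths = [
    ["/index.html", "/a.html", "/b.html"],
    ["/x.html", "/a.html", "/c.html"],
]
second_order = conditional_on_previous(paths)
```

If the rows of second_order for a given current page differ much from
one previous page to another, the choice of link depends on where the
user came from, and the first-order model is correspondingly off.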

> Given that more than one person can use a hostname (proxy
> servers, etc), there's no reliable way to exactly identify
> a unique person without implementing access control

Yes, and the statistics of access intervals don't help.  Intervals  
between requests from the same host seem to follow a combination of  
two very distinct exponential distributions whose decay rates  
differ by over an order of magnitude; presumably the long-term  
exponential represents intervals between user sessions through the  
same gateway host.  But the problem with exponential distributions  
is that the maximum probability occurs at zero, no matter how long-  
or short-term they might be...

--------------------------------------------------------------------
Paul Burchard	<burchard@math.utah.edu>
``I'm still learning how to count backwards from infinity...''
--------------------------------------------------------------------

Received on Tuesday, 18 April 1995 02:30:27 UTC