- From: Mike O'Neill <michael.oneill@baycloud.com>
- Date: Thu, 18 Jul 2013 14:21:32 +0100
- To: <public-tracking@w3.org>
- Message-ID: <07f201ce83b9$b8afa3f0$2a0eebd0$@baycloud.com>
In the last call there was some discussion on IRC about mechanisms to detect unique visitors to a web page without using a persistent UID (in the context of audience measurement, but it could also apply to third-party web analytics).

One way to do that might be to filter out UAs that have visited the page within some arbitrary duration, say 1 week or 1 year. The mechanism could be the existing Last-Modified/If-Modified-Since caching handshake. The idea is to use it to tell the server the last time this user agent visited a particular page. The server can then recognise unique visitors as those that have not visited the page for the chosen period.

To avoid flooding the net with cacheable content, this could be implemented as a zero-content-length resource referenced in the delivered HTML (if it is an HTML resource). The page would have a reference to a "unique visit detection resource" embedded in it, say in an img or iframe tag. For example, the page /thispage.htm would contain the following invisible element:

<img style="display: none;" src="/visit-detection.gif?url=/thispage.htm" />

The response from /visit-detection.gif would be zero-length content with a Last-Modified response header always set to the current time (rounded down to, say, the nearest minute to prevent fingerprinting) and Cache-Control: private.

The first time the UA visits there will be no If-Modified-Since header, so the server detects a "unique visitor". The next time there will be an If-Modified-Since request header, which indicates the last time this particular UA visited the page. If the difference between that time and the current time is more than the chosen period, the server again detects a "unique visitor".

To detect illicit accesses this may need to be a bit more elaborate, say using a session cookie containing a hash of the current time with a secret salt, but there would be no need for a persistent unique identifier.

Would this work? And would it be adequate for the purposes of audience measurement?

Mike
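
To make the handshake concrete, here is a rough Python sketch of the server-side decision for the detection resource. It is only an illustration under assumptions of mine: the function name handle_visit_detection, the 7-day UNIQUE_WINDOW, and the choice to treat an unparseable If-Modified-Since as absent are all hypothetical, and the session-cookie check is left out.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime

# Assumed value for the "arbitrary period"; 1 week is just one of the examples.
UNIQUE_WINDOW = timedelta(days=7)


def handle_visit_detection(if_modified_since, now, window=UNIQUE_WINDOW):
    """Return (is_unique_visitor, response_headers) for one request.

    if_modified_since: raw If-Modified-Since header value, or None if absent.
    now: current time as a timezone-aware UTC datetime.
    """
    last_visit = None
    if if_modified_since:
        try:
            last_visit = parsedate_to_datetime(if_modified_since)
            if last_visit.tzinfo is None:
                # HTTP dates are GMT; assume UTC if no zone was parsed.
                last_visit = last_visit.replace(tzinfo=timezone.utc)
        except (TypeError, ValueError):
            # Assumption: an unparseable header is treated like a missing one.
            last_visit = None

    # Unique if never seen, or last seen longer ago than the window.
    is_unique = last_visit is None or (now - last_visit) > window

    # Round down to the minute so the timestamp is less useful as a fingerprint.
    rounded = now.replace(second=0, microsecond=0)
    headers = {
        "Last-Modified": format_datetime(rounded, usegmt=True),
        "Cache-Control": "private",
        "Content-Length": "0",
    }
    return is_unique, headers
```

Calling it with if_modified_since=None models a first visit; passing back the Last-Modified value from an earlier response models a revisit, which counts as unique only once the window has elapsed.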
Received on Thursday, 18 July 2013 13:22:12 UTC