Re: how does host B know that its visitor is the one that visited host A?

From: Mischa Tuffield <mischa@mmt.me.uk>
Date: Mon, 15 Aug 2011 17:41:38 +0100
Cc: Jonathan Rees <jar@creativecommons.org>
Message-Id: <08716CC1-7644-433F-BDD0-71DA3579CA1A@mmt.me.uk>
To: www-tag@w3.org
Hi Jonathan/All, 

Please excuse the top-post. 

As discussed in this thread, any one of several mechanisms could explain what you saw while browsing the web. Here are some pointers to work that highlights the issues raised so far:

1. Cookies, and why "deleting cookies" is no longer enough. Cookie re-spawning via Flash storage, HTML5's local storage, or more recently ETags and JavaScript, is an ongoing struggle. Ashkan Soltani explains this at [1]. The evercookie [2] got some press about 6 months ago when it showed how to create persistent cookies. Wired magazine and Bruce Schneier have also discussed a newer ETag-based approach to persisting cookies [3]. Quoting:

.... "in addition to Flash and HTML5 LocalStorage, <SOMECOMPANY> was exploiting the browser cache to store persistent identifiers via stored Javascript and ETags" ...

The above suggests that even if you disable cookies, local storage, and Flash-based storage, you can still be tracked.
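
To make the ETag trick concrete, here is a minimal sketch in Python. It is my own illustration and is not taken from [1] or [3]: the TrackingServer and Browser classes are hypothetical, and the "network" is just a method call. The only real mechanism it relies on is HTTP cache revalidation, where a browser echoes a cached ETag back in If-None-Match.

```python
import uuid


class TrackingServer:
    """Hypothetical server that abuses ETag revalidation as an identifier store."""

    def __init__(self):
        self.visits = {}  # identifier (the ETag value) -> requests seen

    def handle(self, request_headers):
        """Return (status, response_headers) for a request to one cached resource."""
        # A browser revalidating its cached copy sends the stored ETag back
        # in If-None-Match, whether or not it holds any cookies.
        etag = request_headers.get("If-None-Match")
        if etag is None:
            # Nothing cached on the client: mint a fresh identifier.
            etag = '"%s"' % uuid.uuid4().hex
            status = 200
        else:
            # Known identifier: tell the browser to keep its cached copy.
            status = 304
        self.visits[etag] = self.visits.get(etag, 0) + 1
        # "no-cache" lets the browser store the response but forces it to
        # revalidate on every use, so the identifier is sent each time.
        return status, {"ETag": etag, "Cache-Control": "no-cache"}


class Browser:
    """Hypothetical browser with a cookie jar and a separate HTTP cache."""

    def __init__(self):
        self.cookies = {}
        self.cache = {}  # url -> cached ETag

    def get(self, server, url):
        headers = {}
        if url in self.cache:
            headers["If-None-Match"] = self.cache[url]
        status, response_headers = server.handle(headers)
        self.cache[url] = response_headers["ETag"]
        return status

    def clear_cookies(self):
        self.cookies.clear()  # the HTTP cache is left untouched

    def clear_cache(self):
        self.cache.clear()


server = TrackingServer()
browser = Browser()
first = browser.get(server, "http://tracker.example/pixel.js")
browser.clear_cookies()
second = browser.get(server, "http://tracker.example/pixel.js")
```

In this sketch the second request is recognised as the same visitor even though the cookies were cleared in between; only clearing the cache makes the server mint a new identifier.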

2. Balachander Krishnamurthy [4] has done some excellent work showing how much information circulates between the various ad networks, and how many of those networks are owned by a handful of companies. The slides from a plenary talk he gave at an IETF workshop [5] summarise some truly awesome work and are definitely worth a browse.

3. And finally, with regard to UA and browser fingerprinting, the EFF have an excellent tool [6] which highlights how one's browser UA is "most likely uniquely identifiable". T. Lowenthal had a position paper at the W3C's Web Tracking and User Privacy workshop [7], in which he called on the browser vendors to start protecting their users' privacy. One of his suggestions was "Fingerprint Uniqueness Reduction", which seems very sensible, to say the least. That said, given that your experience was cross-browser, this probably wasn't what was happening to you.
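
For a feel of how fingerprinting adds up, here is a small Python sketch. It is only an illustration under my own assumptions and not the method used by the tool in [6]: the fingerprint and surprisal_bits functions are hypothetical names, the attribute set is invented, and the population fractions are made-up numbers chosen to keep the arithmetic simple. The User-Agent string is the example from RFC 2616 quoted further down.

```python
import hashlib
import math


def fingerprint(attributes):
    """Hash a dict of browser attributes into a stable identifier."""
    # Sort the keys so the same attributes always give the same hash.
    canonical = "\n".join(
        "%s=%s" % (key, attributes[key]) for key in sorted(attributes)
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def surprisal_bits(fraction_sharing_value):
    """Bits of identifying information in a value shared by this
    fraction of all browsers (rarer values carry more bits)."""
    return -math.log2(fraction_sharing_value)


# Hypothetical attributes a server can observe without any cookies.
observed = {
    "User-Agent": "CERN-LineMode/2.15 libwww/2.17b3",
    "Accept-Language": "en-GB",
    "Screen": "1280x800x24",
}
visitor_id = fingerprint(observed)

# Made-up fractions: 1 in 1024 browsers share this User-Agent,
# 1 in 8 share this language setting.
total_bits = surprisal_bits(1 / 1024) + surprisal_bits(1 / 8)
```

If the two attributes were independent, those 13 bits would single out roughly one browser in 8192. Reducing fingerprint uniqueness amounts to making each value more common, for example by sending a coarser User-Agent, so that each attribute contributes fewer bits.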

Regards, 

Mischa *goes back to lurking ...

[1] http://ashkansoltani.org/docs/respawn_redux.html 
[2] http://samy.pl/evercookie/ 
[3] https://www.schneier.com/blog/archives/2011/08/new_undeletable.html 
[4] http://www2.research.att.com/~bala/papers/ 
[5] http://www.ietf.org/proceedings/77/slides/plenaryt-5.pdf 
[6] https://panopticlick.eff.org/ 
[7] http://www.w3.org/2011/track-privacy/papers/lowenthal_position-paper.pdf 

_________________________________
Mischa Tuffield PhD
Email: mischa@mmt.me.uk
Homepage: http://mmt.me.uk/
WebID: http://mmt.me.uk/foaf.rdf#mischa


On 14 Aug 2011, at 15:45, Mukul Gandhi wrote:

> Hi Jonathan,
> 
> On Fri, Aug 12, 2011 at 8:41 PM, Jonathan Rees <jar@creativecommons.org> wrote:
>> How does this work? I.e. what are browser instances doing that leaks
>> their identity to servers? Is it just a lucky guess based on
>> User-agent or something?
> 
> I believe that the "User-Agent" HTTP request header field is a
> reliable way for a server to know which user agent (usually
> a web browser) it is sending a response to.
> 
> Here's an excerpt from the document
> http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html, which explains
> this,
> 
> <quote>
> 
> 14.43 User-Agent
> 
> The User-Agent request-header field contains information about the
> user agent originating the request. This is for statistical purposes,
> the tracing of protocol violations, and automated recognition of user
> agents for the sake of tailoring responses to avoid particular user
> agent limitations. User agents SHOULD include this field with
> requests. The field can contain multiple product tokens (section 3.8)
> and comments identifying the agent and any subproducts which form a
> significant part of the user agent. By convention, the product tokens
> are listed in order of their significance for identifying the
> application.
> 
>       User-Agent     = "User-Agent" ":" 1*( product | comment )
> 
> Example:
> 
>       User-Agent: CERN-LineMode/2.15 libwww/2.17b3
> 
> </quote>
> 
> I think nearly every web browser sends this field (and its value) to
> the web server it is sending a request to.
> 
> 
> 
> 
> -- 
> Regards,
> Mukul Gandhi
> 
Received on Monday, 15 August 2011 17:45:33 GMT
