W3C home > Mailing lists > Public > public-p3p-spec@w3.org > December 2003

RE: Cookie linking v3

From: Giles Hogben <giles.hogben@jrc.it>
Date: Thu, 11 Dec 2003 17:05:19 +0100
To: "'Dobbs Brooks'" <bdobbs@doubleclick.net>, "'public-p3p-spec'" <public-p3p-spec@w3.org>
Cc: "'WILIKENS MARC ANDRE'" <marc.wilikens@jrc.it>
Message-ID: <003901c3c000$92c56990$362abf8b@cs.jrc.it>

The 2 cookie webservers don't need to co-ordinate a unique ID. The referrer
session key tells them that the 2 cookies refer to the same individual and
so if they put their data together the full identity pops out after the
event, without any prior agreement.

Image A sends cookie A and referrer session id
Image B sends cookie B and the same referrer session id
So the cookie data is linked by the referrer session id

Do you think this is unlikely? I'm not sure. I could try to make the example
clearer or I could try to simplify it. One could have all three on the same
domain for example. The point was to illustrate how linking can be indirect.

>**-----Original Message-----
>**From: public-p3p-spec-request@w3.org 
>**[mailto:public-p3p-spec-request@w3.org] On Behalf Of Dobbs, Brooks
>**Sent: 11 December 2003 16:28
>**To: 'Giles Hogben'; public-p3p-spec
>**Cc: WILIKENS MARC ANDRE
>**Subject: RE: Cookie linking v3
>**
>**
>**
>**Giles,
>**
>**I agree with your first part but you lost me on the 
>**triangulation from 2 domains.  I don't think that this type 
>**of triangulation is really all that likely a scenario, 
>**remembering that Domains A and B are likely separate 
>**entities with separate web servers with no practical means 
>**of linking that cookie they each receive come from the same 
>**user (unless they have somehow coordinated to set the same 
>**unique_id).  I see where you are going with referrer and 
>**while that may give you a guess it would be a far fetched one.
>**
>**I also agree with your analysis of domain level cookies.  I 
>**think this has always been a real thorn in the side of P3P.  
>**I will make a broad generalization here, and feel free to 
>**shoot it down, - most large domains and nearly all corporate 
>**domains outsource at least one or two hosts on their domain 
>**and additionally maintain exceptionally little control of 
>**their internal hosts (bobjones.company.com).  Here is 
>**another broad generalization
>**- many, many corporate domains set a domain level unique id 
>**cookie which they claim to be pseudonymous.  I would venture 
>**a guess that that the majority of these folks also have a 
>**corporate intranet on the same domain that requires 
>**authentication usually via REMOTE_USER.  So now what is 
>**being logged on the intranet: REMOTE_USER and DOMAIN 
>**COOKIE!!!  Opps looks like Domain cookie is no longer 
>**anonymous.  I think that what all this points to is, sadly, 
>**true accuracy and compliancy are beyond the efforts that 
>**most implementers are taking at this time.
>**
>**-Brooks
>**
>**
>**-----Original Message-----
>**From: Giles Hogben [mailto:giles.hogben@jrc.it] 
>**Sent: Thursday, December 11, 2003 5:31 AM
>**To: public-p3p-spec
>**Cc: WILIKENS MARC ANDRE
>**Subject: Cookie linking v3
>**
>**
>**Cookies can contain a maximum of 4kb of data which must be 
>**transmitted across the network twice before it can be used. 
>**For this reason, cookies tend to store only a number (or 
>**unique key) which links to a value in a database. Data about 
>**a user is stored in a database and that record is given a 
>**number (a unique key). Only this number is stored in the 
>**cookie on the user's computer. When the user revisits the 
>**site which set the cookie, that site can immediately have 
>**access to a potentially unlimited amount of information 
>**about the user simply by looking up the number in the 
>**database in which the number is a key. Linkability may be 
>**direct or indirect. For example a key stored in one cookie 
>**may not link directly to a user's name but instead the 
>**user's name may be deduced by examining two cookies replayed 
>**by separate domains but linked by a unique referrer. 
>**
>**Example 1.
>**----------
>**
>**Image A on Domain A replays a cookie linking to a record of 
>**the user's street name and number. Image B on Domain B 
>**replays a cookie which links to a record of the user's home 
>**town. Images A and B are displayed on pages with session 
>**ID's within Domain C. 
>**By using the referrer URI (which contains a unique session 
>**key), these two cookies be linked together to give a unique 
>**address and through another database, the user's name.
>**
>**With enough effort, similar but more sophisticated data 
>**mining techniques may be applied to link even seemingly 
>**highly anonymised data with cookie values. P3P applies the 
>**principle of proportionality to such linkability. The 
>**specification of the data and purposes covered by cookie 
>**should be thought of in terms of the analysis which might 
>**reasonably be carried out on such a cookie to achieve the 
>**stated purpose. For example if a cookie is set to track 
>**criminals' personal data then it is reasonable that a 
>**considerable effort might be put into database analysis. The 
>**cookie should therefore be said applying to personally 
>**identifiable data even if the data is actually hashed in the 
>**database. If on the other hand, the cookie is set in order 
>**to track a session and data is stored in the database but 
>**anonymised by hashing, then there is no need to state that 
>**the data is identifiable. This type of anonymization is in 
>**theory not secure because hashes have a 1-1 correspondence 
>**with ip addresses for example so by hashing all possible ip 
>**addresses, you can trace the original ip address. However 
>**extending the definition of linkability to this extent is 
>**neither practical nor reasonable.
>**
>**Third party cookies are cookies which are set by a domain 
>**other than the page being viewed. This is done through 
>**embedded images as in Example 1 and can even occur in emails 
>**and applications which use web services, such as music 
>**players. While normally the information stored in one 
>**domain's cookie cannot be accessed by another domain, third 
>**party cookies bypass this mechanism by placing the same 
>**third party image in different domain's pages. This allows 
>**tracking of users across different domains. The intention to 
>**carry out such tracking activities through linking cookie 
>**keys across different domains should also be declared.
>**
>**A connected problem is that a site SETTING a cookie may not 
>**be in control of all the purposes to which a cookie is put 
>**ON REPLAY. For example some domains set a domain level 
>**cookie (at the level *.xyz.com) which is replayed to 
>**thousands of subhosts (a.xyz.com,b.xyz.com ...), whose data 
>**collection practices are not under their control. Currently 
>**the solution of evaluating cookies on replay has been 
>**discounted because of performance issues and data protection 
>**issues linked to the act of storage of data on the user's 
>**machine. Therefore P3P requires that entities publishing 
>**policies in accordance with correct practice declare any 
>**potential purposes to which a cookie might be put. It 
>**follows therefore that if a cookie is used for a new and 
>**unforseen purpose, it SHOULD be reset along with a fresh P3P policy.
>**
>**
>**-------------------------------------
>**Giles Hogben
>**European Commission Joint Research Centre
>**Institute for the Protection and Security of the Citizen 
>**Cybersecurity New technologies for Combatting Fraud Unit TP 
>**267 Via Enrico Fermi 
>** 
>**
>**
Received on Thursday, 11 December 2003 11:05:32 EST

This archive was generated by hypermail pre-2.1.9 : Wednesday, 17 March 2004 17:46:29 EST