Cookie linking v3

Cookies can contain a maximum of 4 KB of data, which must be transmitted
across the network twice before it can be used. For this reason, cookies
tend to store only a number (a unique key) which links to a record in a
database. Data about a user is stored in the database under that key, and
only the key is stored in the cookie on the user's computer. When the user
revisits the site which set the cookie, that site immediately has access to
a potentially unlimited amount of information about the user, simply by
looking up the key in the database. Linkability may be direct or indirect.
For example, a key stored in one cookie may not link directly to a user's
name, but the user's name may be deduced by examining two cookies replayed
to separate domains and linked by a unique referrer.
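
As a rough illustration of the direct case (a key in the cookie, the data
in a database), here is a minimal sketch. The store USER_RECORDS and the
functions set_cookie_for and lookup_on_replay are hypothetical names made
up for this note, and an in-memory dictionary stands in for the database.

```python
import secrets

# Hypothetical server-side store standing in for the database:
# cookie key -> user record.
USER_RECORDS = {}

def set_cookie_for(record):
    """Store the record server-side and return the small key for the cookie."""
    key = secrets.token_hex(8)  # 16 hex characters, far below the 4 KB limit
    USER_RECORDS[key] = record
    return key

def lookup_on_replay(cookie_value):
    """When the cookie is replayed, the key recovers the whole record."""
    return USER_RECORDS.get(cookie_value)

# Only `key` travels in the cookie; the record never leaves the server.
key = set_cookie_for({"name": "A. User", "visits": 12, "basket": ["book", "cd"]})
record = lookup_on_replay(key)
```

The point is only that the size of the cookie says nothing about the
amount of data it links to.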

Example 1.
----------

Image A on Domain A replays a cookie linking to a record of the user's
street name and number.
Image B on Domain B replays a cookie which links to a record of the user's
home town.
Images A and B are displayed on pages with session IDs within Domain C.
By using the referrer URI (which contains a unique session key), these two
cookies can be linked together to give a unique address and, through
another database, the user's name.
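
The linking step in Example 1 might look roughly like the sketch below. All
values are invented for illustration: the cookie keys, the c.example URLs,
the "session" query parameter and the log layout are assumptions, not part
of any specification.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical records held by Domain A and Domain B, keyed by cookie value.
domain_a_records = {"a-123": {"street": "12 Example Street"}}
domain_b_records = {"b-456": {"town": "Exampletown"}}

# Hypothetical image request logs: (cookie value, referrer URI on Domain C).
domain_a_log = [("a-123", "http://c.example/page1?session=XYZ789")]
domain_b_log = [("b-456", "http://c.example/page2?session=XYZ789")]

def session_key(referrer):
    """Extract the session key from the referrer URI, or None if absent."""
    values = parse_qs(urlparse(referrer).query).get("session")
    return values[0] if values else None

def link_by_session(log_a, log_b):
    """Join the two logs on the session key carried in the referrer."""
    by_session = {}
    for cookie_a, referrer in log_a:
        session = session_key(referrer)
        if session is not None:
            by_session[session] = cookie_a
    linked = []
    for cookie_b, referrer in log_b:
        cookie_a = by_session.get(session_key(referrer))
        if cookie_a is not None:
            # Merge street (from A) and town (from B) into one address.
            linked.append({**domain_a_records[cookie_a],
                           **domain_b_records[cookie_b]})
    return linked

address = link_by_session(domain_a_log, domain_b_log)
```

Neither cookie identifies the user on its own; the shared session key in
the referrer is what joins them.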

With enough effort, similar but more sophisticated data mining techniques
may be applied to link even seemingly highly anonymised data with cookie
values. P3P applies the principle of proportionality to such linkability.
The specification of the data and purposes covered by a cookie should be
thought of in terms of the analysis which might reasonably be carried out on
such a cookie to achieve the stated purpose. For example, if a cookie is set
to track criminals' personal data, then it is reasonable to expect that
considerable effort might be put into database analysis. The cookie should
therefore be declared as applying to personally identifiable data even if
the data is actually hashed in the database. If, on the other hand, the
cookie is set in order to track a session, and data is stored in the
database but anonymised by hashing, then there is no need to state that the
data is identifiable. This type of anonymisation is in theory not secure
because hashes have, in effect, a one-to-one correspondence with IP
addresses, for example, so by hashing all possible IP addresses you can
trace the original IP address. However, extending the definition of
linkability to this extent is neither practical nor reasonable.
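
To make the theoretical weakness concrete, here is a small sketch of the
enumeration attack. It assumes an unsalted SHA-256 hash of the dotted
address string, which is my assumption and not something any particular
site is known to use, and it searches only one documentation /24 network
rather than the whole address space.

```python
import hashlib
import ipaddress

def hash_ip(ip):
    """Unsalted hash of the dotted-quad string (assumed scheme)."""
    return hashlib.sha256(ip.encode("ascii")).hexdigest()

def reverse_hash(target, network):
    """Hash every host address in `network` until one matches `target`."""
    for addr in ipaddress.ip_network(network).hosts():
        if hash_ip(str(addr)) == target:
            return str(addr)
    return None

stored = hash_ip("192.0.2.57")                     # what the database holds
recovered = reverse_hash(stored, "192.0.2.0/24")   # "192.0.2.57"
```

Covering all of IPv4 means roughly four billion hashes, which is
feasible for a determined analyst but hardly a routine step in session
tracking.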

Third party cookies are cookies which are set by a domain other than that
of the page being viewed. This is done through embedded images, as in
Example 1, and can even occur in emails and in applications which use web
services, such as music players. While normally the information stored in
one domain's cookie cannot be accessed by another domain, third party
cookies bypass this mechanism by placing the same third party image in
pages on different domains. This allows tracking of users across different
domains. The intention to carry out such tracking activities through
linking cookie keys across different domains should also be declared.
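
A sketch of what the third party can do with its own image request log
follows. The log layout, the cookie keys and the example host names are
all hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlparse

def sites_per_cookie(image_requests):
    """Group the hosts that embedded the image by the cookie key replayed."""
    history = defaultdict(list)
    for cookie_key, referrer in image_requests:
        host = urlparse(referrer).hostname
        if host and host not in history[cookie_key]:
            history[cookie_key].append(host)
    return dict(history)

# Hypothetical log kept by the third party: (cookie key, referrer URI).
requests_seen = [
    ("k1", "http://news.example/story"),
    ("k2", "http://shop.example/cart"),
    ("k1", "http://shop.example/item/9"),
    ("k1", "http://news.example/sport"),
]

profile = sites_per_cookie(requests_seen)
# {"k1": ["news.example", "shop.example"], "k2": ["shop.example"]}
```

The embedding sites never exchange data with each other; the common
cookie key on the third party's side is enough to follow "k1" from one
domain to the next.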

A connected problem is that a site SETTING a cookie may not be in control of
all the purposes to which a cookie is put ON REPLAY. For example, some
domains set a domain level cookie (at the level *.xyz.com) which is replayed
to thousands of subhosts (a.xyz.com, b.xyz.com ...), whose data collection
practices are not under the setting site's control. Currently the solution
of evaluating cookies on replay has been discounted because of performance
issues and data protection issues linked to the act of storage of data on
the user's machine. Therefore P3P requires that entities publishing policies
in accordance with correct practice declare any potential purposes to which
a cookie might be put. It follows that if a cookie is used for a new and
unforeseen purpose, it SHOULD be reset along with a fresh P3P policy.
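
For readers less familiar with domain level cookies, the replay rule can
be approximated as below. This is a deliberately simplified tail match
written for this note (the function name domain_matches is mine); real
user agents apply further checks which are left out here.

```python
def domain_matches(cookie_domain, request_host):
    """Simplified check: is a cookie set for `cookie_domain` replayed to
    `request_host`? Matches the domain itself and any subhost of it."""
    cookie_domain = cookie_domain.lower().lstrip(".")
    request_host = request_host.lower()
    return (request_host == cookie_domain
            or request_host.endswith("." + cookie_domain))

# A cookie set at the level .xyz.com reaches every subhost.
hosts = ["a.xyz.com", "b.xyz.com", "www.other.com", "notxyz.com"]
receivers = [h for h in hosts if domain_matches(".xyz.com", h)]
# ["a.xyz.com", "b.xyz.com"]
```

Each host in receivers sees the same key, whatever its own data
collection practices are, which is why the policy attached when the
cookie is set has to anticipate all of them.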


-------------------------------------
Giles Hogben
European Commission Joint Research Centre
Institute for the Protection and Security of the Citizen Cybersecurity New
technologies for Combatting Fraud Unit TP 267 Via Enrico Fermi 
 

Received on Thursday, 11 December 2003 05:31:24 UTC