- From: Martin Presler-Marshall <mpresler@us.ibm.com>
- Date: Tue, 3 Jul 2001 14:50:08 -0400
- To: www-p3p-policy@w3.org
There are inconsistencies in the categories given for the client IP address and its sub-elements in the base dataschema as defined in the CR specification. The CR specification contains the following categories:

    dynamic.clickstream.clientip: <computer/>
    dynamic.clickstream.clientip.hostname: <uniqueid/>
    dynamic.clickstream.clientip.partialhostname: <demographic/>
    dynamic.clickstream.clientip.fullip: <uniqueid/>
    dynamic.clickstream.clientip.partialip: <demographic/>

It is the opinion of the specification WG that hostnames and full IP addresses belong in the <computer/> category. The <computer/> category is defined as follows:

    Computer Information: Information about the computer system that the individual is using to access the network -- such as the IP number, domain name, browser type or operating system.

This definition clearly covers IP addresses and hostnames. Thus the dynamic.clickstream.clientip.hostname and dynamic.clickstream.clientip.fullip elements must be in the <computer/> category.

In addition, the WG believes that <uniqueid/> is not an appropriate description of a full IP address or hostname. The <uniqueid/> category is defined as follows:

    Unique Identifiers: Non-financial identifiers, excluding government-issued identifiers, issued for purposes of consistently identifying the individual. These include identifiers issued by a Web site or service.

IP addresses and hostnames are not issued for the purpose of consistently identifying an individual. IP addresses are issued for the purpose of routing packets in a network, and hostnames are issued for the purpose of giving easy-to-remember mnemonics for IP addresses.

The argument could be made that IP addresses can be used to uniquely identify an individual. This is true in some cases: computer systems which have fixed IP addresses, and which connect directly to their destination, can be identified consistently by their IP addresses. However, this is a weak mapping to an individual.
Some computer systems are used by multiple individuals, and IP addresses identify a computer system only, not an individual. In addition, the presence of proxies and firewalls in the network means that a great many computer systems have their own IP address masked from the destination with which they are communicating. Furthermore, many computer systems (such as systems accessing the Internet through dialup access) have dynamically-assigned addresses which cannot easily be linked with an individual computer system. This makes it something of a stretch to describe IP addresses as unique IDs.

Current Web server practices also discourage the use of <uniqueid/> to describe IP addresses. The overwhelming majority of Web servers currently in use log the requests they receive. These logs almost always contain the IP address of the computer system making the request, the URL requested, the time of the request, and other information. Placing IP addresses into the <uniqueid/> category would mean that nearly every Web site would need to declare that it collects this category of information. Doing this significantly reduces the usefulness of the <uniqueid/> category. If a user-agent chooses to look at the categories of information collected by a site, rather than the individual data elements collected, then that user-agent would be unable to discriminate between sites which collect standard Web server access logs, and those which assign unique persistent IDs (perhaps through cookies) to all visitors. It is our belief that these two practices are perceived differently by the general Web-using public, and therefore the P3P specification should reflect this distinction.

Since hostnames are directly linked to IP addresses by the DNS system, and the two can be freely converted from one to another, all of the above about IP addresses applies equally well to hostnames.

The last inconsistency regards the categories assigned to dynamic.clickstream.clientip.
In P3P, categories must always "bubble upwards" in dataschemas. Since a policy which declares collecting structured element a.b.c implicitly includes all subelements (a.b.c.x, a.b.c.y, a.b.c.z), all categories assigned to any of the sub-elements must be assigned to their parent element. Therefore, since dynamic.clickstream.clientip.fullip and dynamic.clickstream.clientip.hostname are in the <computer/> category, and dynamic.clickstream.clientip.partialip and dynamic.clickstream.clientip.partialhostname are in the <demographic/> category, their parent element - dynamic.clickstream.clientip - must be in both the <computer/> and <demographic/> categories.

The end result is that the following categories would be applied to these data elements:

    dynamic.clickstream.clientip: <computer/>, <demographic/>
    dynamic.clickstream.clientip.hostname: <computer/>
    dynamic.clickstream.clientip.partialhostname: <demographic/>
    dynamic.clickstream.clientip.fullip: <computer/>
    dynamic.clickstream.clientip.partialip: <demographic/>

-- Martin

Martin Presler-Marshall - Program Manager, Privacy Technology
E-mail: mpresler@us.ibm.com
Phone: (919) 254-7819 (tie-line 444-7819)
Fax: (919) 254-6430 (tie-line 444-6430)
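
The "bubble upwards" rule described in the message can be illustrated with a small sketch. This is not part of the P3P specification or of any P3P tool. The function name `bubble_up` and the dictionary layout are assumptions made purely for illustration, and only the four clientip sub-elements proposed in the message are considered (a real base dataschema has many more elements that would also contribute to the higher-level ancestors).

```python
# Categories proposed in the message for the four clientip sub-elements.
# The dict layout is a hypothetical representation, not a P3P format.
LEAF_CATEGORIES = {
    "dynamic.clickstream.clientip.hostname": {"computer"},
    "dynamic.clickstream.clientip.partialhostname": {"demographic"},
    "dynamic.clickstream.clientip.fullip": {"computer"},
    "dynamic.clickstream.clientip.partialip": {"demographic"},
}


def bubble_up(leaf_categories):
    """Return categories for each element plus every ancestor element.

    Each ancestor receives the union of the categories of all elements
    beneath it, which is the "bubble upwards" rule from the message.
    """
    result = {name: set(cats) for name, cats in leaf_categories.items()}
    for name, cats in leaf_categories.items():
        parts = name.split(".")
        # Walk every proper prefix: a, a.b, a.b.c, ...
        for depth in range(1, len(parts)):
            ancestor = ".".join(parts[:depth])
            result.setdefault(ancestor, set()).update(cats)
    return result


categories = bubble_up(LEAF_CATEGORIES)
print(sorted(categories["dynamic.clickstream.clientip"]))
# prints ['computer', 'demographic']
```

Under these assumptions the parent element ends up in both categories, matching the list proposed in the message, while each sub-element keeps only its own category.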
Received on Tuesday, 3 July 2001 14:50:28 UTC