web site P3P adoption report

We released the following report today: "An Analysis of P3P Deployment 
on Commercial, Government, and Children’s Web Sites as of May 2003"

You can download it from:
http://www.research.att.com/projects/p3p/p3p-census-may03.pdf

Here is the executive summary:

The Platform for Privacy Preferences (P3P) provides a standard 
computer-readable format for privacy policies and a protocol that 
enables web browsers to read and process these policies automatically. 
We developed software to query a set of web sites for P3P policies, 
check the validity of each policy, and analyze the information 
practices it describes. We used this software to analyze 538 
P3P-enabled web sites found by checking for P3P policies on 5,856 web 
sites on 6 May 2003. The sites we checked for P3P policies were taken 
from several lists of popular web sites, as well as from “crawling” 
indexes of shopping, news, children’s and government web sites. We 
present the first major analysis of the data practices of P3P-enabled 
web sites.
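
As a rough illustration of the first step above (checking a site for a 
P3P policy), here is a minimal Python sketch. It is not the software 
used for the report. It relies on two mechanisms defined in the P3P 1.0 
Recommendation: the well-known location /w3c/p3p.xml for the policy 
reference file, and the P3P HTTP response header, which may carry a 
policyref URI and a compact policy (CP). The function names and the 
lenient header parsing are assumptions made for this example, and no 
network access is performed.

```python
import re
from urllib.parse import urljoin

# Well-known location of the policy reference file (P3P 1.0).
WELL_KNOWN_PATH = "/w3c/p3p.xml"


def well_known_location(site_url):
    """Return the URL where a site's policy reference file is expected."""
    return urljoin(site_url, WELL_KNOWN_PATH)


def parse_p3p_header(value):
    """Split a P3P response header value into its policyref and CP parts.

    Assumption: fields look like name="value" and are comma-separated;
    anything else in the header is ignored.
    """
    result = {"policyref": None, "compact_policy": []}
    for key, val in re.findall(r'(\w+)\s*=\s*"([^"]*)"', value):
        if key.lower() == "policyref":
            result["policyref"] = val
        elif key.upper() == "CP":
            # A compact policy is a whitespace-separated list of tokens.
            result["compact_policy"] = val.split()
    return result


if __name__ == "__main__":
    print(well_known_location("http://www.example.com/shop/index.html"))
    print(parse_p3p_header('policyref="/w3c/p3p.xml", CP="NOI DSP COR"'))
```

A crawler built on this would fetch the well-known location and inspect 
the P3P header of each response; a complete implementation would also 
look for link tags with rel="P3Pv1" in HTML pages.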

Our system used the P3P evaluation engine built into the AT&T Privacy 
Bird P3P user agent to analyze the P3P policies we discovered. We 
checked these policies against Privacy Bird’s standard “high,” 
“medium,” and “low” settings as well as against 62 other “rule sets” 
that we developed.

A comparison of our results with previous studies indicates that P3P 
adoption is increasing over time [1][14]. Adoption remains highest for 
the most popular web sites.

We found a large number of errors in the P3P policies of the sites we 
evaluated. About one third of the P3P-enabled sites had technical 
errors. In many cases these errors were due to use of syntax from a 
draft version of the P3P specification that is not permitted by the 
final P3P 1.0 Recommendation [6]. However, 7% of the P3P-enabled sites 
had critical errors that prevented their evaluation by our Privacy Bird 
evaluation engine. We also found 74 sites that violated the P3P 
specification by posting P3P compact policies without their 
corresponding full P3P policies.
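
The last check above can be sketched as follows. This is a hypothetical 
example, not our actual code: it assumes the caller already knows 
whether a full P3P policy could be retrieved for the site, and it only 
looks for a CP="..." field in the P3P response header.

```python
import re


def compact_policy_tokens(p3p_header):
    """Return the compact policy tokens from a P3P header value, if any."""
    match = re.search(r'\bCP\s*=\s*"([^"]*)"', p3p_header or "",
                      flags=re.IGNORECASE)
    return match.group(1).split() if match else []


def compact_without_full(p3p_header, full_policy_found):
    """True if a site sends a compact policy but no full policy was found.

    full_policy_found is supplied by the caller (an assumption of this
    sketch); determining it requires fetching the policy reference file.
    """
    return bool(compact_policy_tokens(p3p_header)) and not full_policy_found


if __name__ == "__main__":
    print(compact_without_full('CP="NOI DSP COR"', full_policy_found=False))
```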

Our analysis of data collection at P3P-enabled web sites indicates that 
most sites collect computer information, click stream information, and 
demographic data. Almost as many sites also collect online contact 
information, physical contact information, interactive data, and unique 
identifiers. The majority of sites also collect preference information, 
purchase information, and state management information (cookies). 
However, fewer sites collect financial information (which excludes 
information such as credit card numbers used only as part of a 
purchase). The least frequently collected categories are content (email 
messages, bulletin board postings, etc.), government-issued 
identifiers, health information, political information, location 
information (for example GPS positioning data), and information not 
falling into any of the pre-defined categories.

Almost all web sites reported using data for completion and support of 
the activity for which data was provided, web site and system 
administration, and research and development. The majority of sites 
also reported using data for email and postal mail marketing, one-time 
tailoring of the site content, and pseudonymous profiling. 
Substantially fewer sites reported using data for telemarketing or 
profiling in which individuals are identified. Very few sites reported 
using data for historical preservation or other purposes that do not 
fall into these categories.
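
To show how usage figures like these can be tallied, here is a small 
Python sketch (an illustration only, not the Privacy Bird evaluation 
engine). It reads the PURPOSE elements of a P3P policy and counts how 
many policies declare each purpose. The element names come from the P3P 
1.0 vocabulary; for example, "current" is completion and support of the 
activity, "admin" is web site and system administration, and "contact" 
is marketing contact other than by telephone. SAMPLE_POLICY is a 
made-up fragment, not a complete or valid policy, and the function 
names are assumptions for this example.

```python
import xml.etree.ElementTree as ET
from collections import Counter

P3P_NS = "http://www.w3.org/2002/01/P3Pv1"  # P3P 1.0 namespace

# Hypothetical, incomplete policy fragment used only for illustration.
SAMPLE_POLICY = """<POLICY xmlns="http://www.w3.org/2002/01/P3Pv1" name="sample">
  <STATEMENT>
    <PURPOSE><current/><admin/><develop/></PURPOSE>
  </STATEMENT>
  <STATEMENT>
    <PURPOSE><contact required="opt-in"/><current/></PURPOSE>
  </STATEMENT>
</POLICY>"""


def purposes_in_policy(policy_xml):
    """Return the set of purpose names declared anywhere in one policy."""
    root = ET.fromstring(policy_xml)
    found = set()
    for purpose in root.iter("{%s}PURPOSE" % P3P_NS):
        for child in purpose:
            # Strip the "{namespace}" prefix ElementTree adds to tags.
            found.add(child.tag.split("}", 1)[-1])
    return found


def tally_purposes(policies):
    """Count, for each purpose, how many policies declare it."""
    counts = Counter()
    for policy_xml in policies:
        counts.update(purposes_in_policy(policy_xml))
    return counts


if __name__ == "__main__":
    print(sorted(purposes_in_policy(SAMPLE_POLICY)))
    print(tally_purposes([SAMPLE_POLICY, SAMPLE_POLICY]))
```

Each policy is counted once per purpose even when the purpose appears 
in several statements, so the totals are per site rather than per 
statement.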

About half the web sites we studied indicated that they share 
personally identifiable data with parties other than agents who use 
data for the purpose for which it was provided. Of these sites, 46% 
indicate that they offer third-party choice (the ability to opt in to 
or opt out of this sharing).

Most web sites reported providing some access provisions for 
individuals who wish to find out what data about them is in the site’s 
records, as well as some option for resolving disputes related to the 
site’s privacy policy. However, most sites reported that they did not 
have a data retention policy covering all of the data they collect.

As debates continue about the need for further privacy legislation and 
the effectiveness of industry self-regulation in the privacy area, 
automated analysis of the data practices of P3P-enabled web sites can 
provide valuable information. Furthermore, as US government web sites 
begin posting P3P policies to comply with the privacy requirements of 
section 208 of the E-Government Act of 2002 [16], we can monitor 
compliance with these requirements. We plan to repeat our experiments 
on a regular basis to allow for longitudinal analysis of P3P policies. 
In the future we may also expand the list of web sites we analyze, 
develop additional rule sets to facilitate more detailed analysis, and 
expand our analysis to include P3P compact policies.

Received on Wednesday, 14 May 2003 22:19:03 UTC