action-235 Draft middle way draft on permitted uses from Nicholas Doty on 2012-09-12 (public-tracking@w3.org from September 2012)

From: Nicholas Doty <npdoty@w3.org>
Date: Tue, 11 Sep 2012 21:36:15 -0700
To: "public-tracking@w3.org (public-tracking@w3.org)" <public-tracking@w3.org>
Message-Id: <5FA89E36-87CA-4396-AE92-0333894249AF@w3.org>

I've included below what could be a "middle way" on permitted uses, based on what I've heard from the group. ("Middle" is not meant to imply any particular measurements or distances, of course, just somewhere between the proposals we've heard.) The text below is long, but it provides both full text and explanatory text. It is structured as a limited list of changes from the current editors' draft: this is not a brand new proposal but an attempt to take ideas from the proposals presented and the discussions we've already had in the group.

Thanks to all who talked with me and provided feedback, and to the full group for all the constructive feedback I expect to get in response to this email.
—Nick

# A middle way

## Guiding principles

* Prefer flexibility for implementers, except where necessary to define the meaning of, and compliance with, a preference.
** Appendices and non-normative guidance may be useful for setting expectations and providing safe harbors without overly limiting implementations.
* The goal is not to find a proposal that everyone finds ideal, or in fact that any one group finds ideal, but a proposal that provides a meaningful privacy choice for users and is widely implementable.

### Motivations for permitted uses

* In general, Do Not Track should prevent retention of data collected in a third-party context. In particular, data should not be retained that may later alter a user's experience.
* Tracking that does not imply a user's Web browsing history or profiled characteristics has been identified by the group as less problematic. (More details below, but 'has-seen-refrigerator-ad-three-times' is less of a concern than a list of every URL visited.)
* Tracking by a third party that is consistent with the context of the interaction is generally compliant with a Do Not Track preference.

## Permitted uses for third parties

* Short-term logging: An N-week period is allowed to convert identifiable logs of DNT requests into unlinkable data or minimize data to the set of permitted uses.

Operators MAY retain data related to a communication in a third-party context for up to 6 weeks. During this time, operators may render data unlinkable (as described above) or perform processing of the data for any of the other permitted uses.

Changes from the editors' draft: Accept the current "option" text and clarify that the uses are only for creating unlinkable data or for the other enumerated permitted uses; the basic intent appears to be the same. The most common descriptions of this short-term period seem to be 5-6 weeks, so common implementations that keep logs for a month before processing or deleting them would not be prevented.

Note: Agree that providing examples and non-normative text could be useful.
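As one possible non-normative illustration of the short-term logging window, here is a minimal sketch in Python. It assumes the 6-week period from the text above; the record layout, the field names in `LINKABLE_FIELDS` and the function `process_log` are all hypothetical. Whether dropping these particular fields is enough to make a record "unlinkable" in the sense of the spec is not something this sketch settles.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the 6-week window from the proposed text.
SHORT_TERM_WINDOW = timedelta(weeks=6)

# Hypothetical fields assumed to link a record to a user, UA or device.
LINKABLE_FIELDS = {"cookie_id", "ip_address", "user_agent", "referrer"}


def process_log(records, now):
    """Split DNT:1 log records into those still inside the short-term
    window (kept unchanged) and those past it (linkable fields removed)."""
    kept, unlinked = [], []
    for record in records:
        age = now - record["timestamp"]
        if age <= SHORT_TERM_WINDOW:
            kept.append(record)
        else:
            # Past the window: keep only fields not listed as linkable.
            unlinked.append(
                {k: v for k, v in record.items() if k not in LINKABLE_FIELDS}
            )
    return kept, unlinked
```

An operator relying on another permitted use would instead minimize the record to what that use needs, rather than stripping it entirely.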

Question: is this really a "permitted use"? It might better be framed as context for the general requirements, that retention beyond common short-term logging is what is generally prohibited.

* Contextual content, ad delivery, personalization from first party data: mark this as explicitly out of scope in the general prohibition at the start of this section, as long as data is not retained (in which case it would need to fall under another permitted use or be prohibited).

[No text.]

Changes from the editors' draft: Remove these "permitted uses", add clarifications to definitions of terms earlier in the spec.

Question: do we have consensus in the group on third-party personalization or behavioral targeting based on data collected in a first-party context? (In any case, doesn't need to be part of the list of permitted uses.)

* Frequency capping: Tracking (retention of data linkable to a user, UA or device) is permitted where that data (counts at a super-campaign level) does not reveal the user's browsing history.

Retaining and using data for frequency capping of online advertisements is allowed if the tracking identifier is only retained in a form that is unique to each super-campaign (e.g., one-way hashed with a campaign id) and does not include retention of the user's browsing history or activity trail (page URIs on which the ads were delivered). Implementers SHOULD NOT create detailed profiles of user browsing activity or user behavior based on their ad frequency history, for example, by retaining identifiers unique to ad impressions served on individual pages.

Changes from the editors' draft: Accept the current "option" text from Seattle. Remove "aside from what is allowed for other permitted uses" which creates an unintended ambiguity. Add SHOULD requirement.

Note: We could add non-normative examples (or perhaps in an appendix), describing client-side and server-side frequency capping techniques that are compatible with this requirement.
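To make the server-side case concrete, here is a minimal sketch of frequency capping along the lines of the text above: the identifier is one-way hashed together with a campaign id, and only a count is stored, never the page URI. The names `campaign_scoped_id` and `FrequencyCapper`, the cap of 3 and the choice of SHA-256 are assumptions for illustration, not something the proposal prescribes.

```python
import hashlib

FREQUENCY_CAP = 3  # hypothetical cap per (user, campaign)


def campaign_scoped_id(user_id: str, campaign_id: str) -> str:
    """One-way hash of the identifier combined with the campaign id, so
    the stored value differs for each campaign."""
    return hashlib.sha256(f"{campaign_id}:{user_id}".encode("utf-8")).hexdigest()


class FrequencyCapper:
    """Keeps only impression counts keyed by campaign-scoped ids.
    No page URIs or per-impression identifiers are stored."""

    def __init__(self, cap: int = FREQUENCY_CAP):
        self.cap = cap
        self._counts = {}  # campaign-scoped id -> impression count

    def should_serve(self, user_id: str, campaign_id: str) -> bool:
        key = campaign_scoped_id(user_id, campaign_id)
        return self._counts.get(key, 0) < self.cap

    def record_impression(self, user_id: str, campaign_id: str) -> None:
        key = campaign_scoped_id(user_id, campaign_id)
        self._counts[key] = self._counts.get(key, 0) + 1
```

Because the page on which the ad was delivered is never passed in, the stored counts cannot by themselves reconstruct an activity trail.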

* Financial reporting and auditing: To the extent required by law, third parties may engage in tracking as is reasonably necessary for financial reporting and auditing. Billing and auditing ad impressions and interactions is within the context of delivering and displaying an ad.

To the extent required by law, third parties may engage in tracking as is reasonably necessary for financial reporting and auditing. Data necessary for recording unique ad impressions, positions and interactions may be retained for this permitted use.

Changes from the editors' draft: Replace existing text with the above.

Add reference to the previously agreed upon legal compliance text, likely in the general requirements section.
("Adherence to laws, legal and judicial process, and regulations take precedence over this standard when applicable, but contractual obligations do not.")

Non-normative explanation: This permitted use is intended to provide for recording, reporting and verifying delivery of and interaction with online advertising. It is not intended to cover tracking a user's ad impression to combine with later browsing activity, subsequent conversion, frequency capping or ad sequencing. Note that when a user meaningfully interacts with an ad or widget, such interaction is in a first-party context (and therefore not restricted by these third-party requirements).

Question: is the group comfortable with retaining identifiable data for auditing services not commonly required by law?

* Security and fraud: To the extent reasonably necessary for protection of computers and networks and to detect ad or other fraud, third parties may engage in tracking. Use of graduated response is preferred.

Operators MAY retain data related to a communication in a third-party context to use for detecting security risks and fraudulent activity, defending from attacks and fraud, and maintaining integrity of the service. This includes data reasonably necessary for enabling authentication/verification, detecting hostile transactions and attacks, providing fraud prevention, and maintaining system integrity. In this example specifically, this information MAY be used to alter the user's experience in order to reasonably keep a service secure or prevent fraud. Operators SHOULD use graduated or triggered responses where feasible.

Changes from the editors' draft: Add graduated/triggered SHOULD requirement. (Rather than determining within this WG what measures are necessary, we prefer to allow necessary measures and then self-regulatory bodies and regulatory agencies can determine case-by-case.)

Question: is "fraud" (the most relevant case here is ad impression fraud) more appropriate under the "Financial reporting" permitted use? Heather notes this here: http://lists.w3.org/Archives/Public/public-tracking/2012Jun/0636.html

* Debugging: To the extent reasonably necessary for inspection of product bugs and performance, third parties may engage in tracking. Use of graduated response is preferred.

Operators MAY retain data related to a communication in a third-party context to use for identifying and repairing bugs in functionality. As described in the general requirements [reference to Minimization section], services MAY collect and retain data from DNT:1 users ONLY when reasonably necessary to identify and repair errors in functionality. Services SHOULD use graduated responses where feasible.

Changes from the editors' draft: Clarify that this is for short-term investigation and bug fixing rather than an open-ended permission. Add minimization requirement/pointer.

Non-normative explanation: This permitted use is intended for short-term diagnosis and repair of third-party Web functionality, commonly in real time. Long-term retention of all data is not compatible with this permitted use. This permitted use is not intended to cover broad quality assurance measurements.

Notes from Joanne from breakout group on debugging: http://lists.w3.org/Archives/Public/public-tracking/2012Jun/0635.html

* Research: collection and use of identifiable data for market research or other longitudinal aggregation purposes is not generally within the context of a particular request; only unlinkable data may be retained for this purpose. As described above, identifiable data can be stored during short-term logging to generate aggregate reports.

Changes from the editors' draft: Remove "Aggregate Reporting" section. Ensure that unlinkable data is prominently declared out of scope of these requirements earlier in the document. Ensure that the "Short Term" permitted use makes it clear that retaining identifiable data for the short term is allowed for creating aggregate reports.

### General Requirements

Note: these general requirements apply across all the permitted uses. There's a good chance these should actually come before the section of permitted uses.

* Legal Compliance: as previously agreed, legal requirements overrule prohibitions of this standard, though contractual obligations do not.

Adherence to laws, legal and judicial process, and regulations take precedence over this standard when applicable, but contractual obligations do not.

Changes from the editors' draft: Replace "Compliance With Local Laws and Public Purposes" section with previously agreed upon text (5/23/2012).

* Identifiers: flexibility is provided to implementers on how they accomplish permitted uses and minimize data retention and use. Implementers are advised to avoid data collection for DNT:1 users where feasible to enable external confidence.

Placing third-party cookies with unique identifiers (and other techniques for linking data to a user, user agent or device) is permitted where reasonably necessary for a permitted use. Requirements on minimization and secondary use, however, provide limitations on when any collection technique is compatible with a Do Not Track preference and what the implications of that collection are.

To give flexibility to implementers in accomplishing the requirements of this specification and the listed permitted uses, no particular data collection techniques are prescribed or prohibited.

Implementers are advised that collection of user data under a Do Not Track preference (including using unique tracking cookies or browser fingerprinting) may reduce external auditability, monitoring and user confidence and that retention of such data may imply liability in certain jurisdictions in cases of secondary use; for more information, see the Global Considerations.

* Minimization

A third party MUST ONLY retain information for a permitted use for as long as is reasonably necessary for that use. Third parties MUST make reasonable data minimization efforts to ensure that only the data necessary for the permitted use is retained. A third party MUST provide public transparency of its data retention period; third parties may enumerate each period individually if they vary across Permitted Uses. Once the period of time for which a party has declared data retention for a given use has expired, the data must not be used for that permitted use. After there are no remaining Permitted Uses for given data, the data must be deleted or rendered unlinkable.

Where feasible, a third party SHOULD NOT collect linkable data when that data is not reasonably necessary for one of the permitted uses. In particular, data not necessary for a communication (for example, cookie data, URI parameters, unique identifiers inserted by a network intermediary) MUST NOT be retained unless reasonably necessary for a particular permitted use.

Changes from the editors' draft: Add collection limitation requirements.

Note: it may be that this is the only time a requirement/prohibition is necessary regarding "collection". All other requirements would be prohibitions on retention (beyond what is necessary, or beyond a short-term logging period) or sharing. A definition of collection, then, is only needed for this minimization concept. "Tracking" can be defined through "retention", "use" and "share" only.
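The retention rule in the Minimization text can be sketched roughly as follows. The retention periods in `DECLARED_RETENTION` are invented purely for illustration (the proposal sets no such numbers), and the record layout and function names are likewise hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical, publicly declared retention periods per permitted use.
# These values are made up for the example.
DECLARED_RETENTION = {
    "security": timedelta(days=180),
    "financial": timedelta(days=365),
    "debugging": timedelta(days=14),
}


def usable_for(record, now):
    """Return the permitted uses for which the record is still inside
    its declared retention period."""
    age = now - record["collected_at"]
    return {
        use for use in record["permitted_uses"]
        if age <= DECLARED_RETENTION[use]
    }


def must_delete_or_unlink(record, now):
    """True once no permitted use remains for the record."""
    return not usable_for(record, now)
```

A record retained for both security and debugging would, under these made-up periods, stop being usable for debugging after 14 days while remaining usable for security.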

* Secondary Use

A third party MUST NOT use data retained for a particular permitted use for any other purpose.

Changes from the editors' draft:
Clarify that data retained for one purpose cannot be re-purposed (even if the second purpose might be related to another permitted use).

Note: This does not require keeping separate copies of data for different permitted uses (agreement in Seattle that a single copy is allowable), but does require that data retained for one stated purpose cannot be repurposed, even in aggregate form. (See resolution at the end of: http://www.w3.org/2012/06/21-dnt-minutes#item08)
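One way an implementer might enforce this, sketched under assumptions: a single stored copy is tagged with the permitted uses it was retained for, and every read must name one of those uses. The class `PurposeBoundStore`, the exception `SecondaryUseError` and the purpose labels are hypothetical names, not part of any draft text.

```python
class SecondaryUseError(Exception):
    """Raised when data is requested for a purpose it was not retained for."""


class PurposeBoundStore:
    """Holds one copy of each record, tagged with the permitted uses
    declared at retention time."""

    def __init__(self):
        self._records = {}  # key -> (data, frozenset of declared purposes)

    def retain(self, key, data, purposes):
        self._records[key] = (data, frozenset(purposes))

    def read(self, key, purpose):
        data, purposes = self._records[key]
        if purpose not in purposes:
            # Data retained for one purpose may not be repurposed.
            raise SecondaryUseError(f"{key!r} was not retained for {purpose!r}")
        return data
```

For example, a record retained for "financial" and "security" can be read for either of those, but a read for "research" raises `SecondaryUseError`.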

* No Personalization

Outside of security and frequency capping, data retained for permitted uses must not be used to alter a specific user's online experience.

Changes from the editors' draft:
Remove "based on multi-site activity" from the "No Personalization" section.

### Appendices

Non-normative text, in a separate appendix, will provide suggestions for privacy-preserving techniques and data minimization approaches. Implementers are not required to use any particular technique to fulfill the requirements in this specification, but a listing of techniques might provide useful guidelines.

Implementers are advised to consider proportionality, user expectations and user consent in development and deployment. As described in the Global Considerations document, proportionality of data collection to purpose may be a legal requirement in some jurisdictions.

Received on Wednesday, 12 September 2012 04:36:26 UTC