Permissions and Qualifiers, redux for the end of the year from David Singer on 2012-12-20 (public-tracking@w3.org from December 2012)

From: David Singer <singer@apple.com>
Date: Wed, 19 Dec 2012 16:50:04 -0800
To: "public-tracking@w3.org Working Group" <public-tracking@w3.org>
Message-id: <C7407A5A-83F0-4BDE-B56E-F9001B2FBFC1@apple.com>

In the hope of cleaning up and making progress towards a simple and comprehensible pair of specifications, I looked at the current Permissions (compliance) and Qualifiers (Expression). I apologize for the length of the message, but, if we do this, it'll make the specs themselves simpler and shorter.

We have the following permissions:
• 6.1.2.1 Short Term Collection and Use
• 6.1.2.2 Contextual Content or Ad Delivery
• 6.1.2.3 Content or Ad Delivery Based on First Party Data
• 6.1.2.4 Frequency Capping
• 6.1.2.5 Financial Logging and Auditing
• 6.1.2.6 Security and Fraud Prevention
• 6.1.2.7 Debugging
• 6.1.2.8 Compliance With Local Laws and Public Purposes

status values:

N None: The designated resource does not perform tracking of any kind, not even for a permitted use, and does not make use of any
data collected from tracking.
1 First party: The designated resource is designed for use within a first-party context and conforms to the requirements on a first party.
If the designated resource is operated by an outsourced service provider, the service provider claims that it conforms to the
requirements on a third party acting as a first party.
3 Third party: The designated resource is designed for use within a third-party context and conforms to the requirements on a third party.
X Dynamic: The designated resource is designed for use in both first and third-party contexts and dynamically adjusts tracking
status accordingly. If X is present in the site-wide tracking status, more information must be provided via the Tk response header
field when accessing a designated resource. If X is present in the Tk header field, more information will be provided in a request-specific
tracking status resource referred to by the status-id. An origin server must not send X as the tracking status value in the
representation of a request-specific tracking status resource.
C Consent: The designated resource believes it has received prior consent for tracking this user, user agent, or device, perhaps
via some mechanism not defined by this specification, and that prior consent overrides the tracking preference expressed by this protocol.
U Updated: The request resulted in a potential change to the tracking status applicable to this user, user agent, or device. A user agent
that relies on a cached tracking status should update the cache entry with the current status by making a new request on the
applicable tracking status resource. An origin server must not send U as a tracking status value anywhere other than a Tk header field
that is in response to a state-changing request.

and qualifiers:

a Audit: Tracking is limited to that necessary for an external audit of the service context and the data collected is minimized accordingly.
c Ad frequency capping: Tracking is limited to frequency capping and the data collected is minimized accordingly.
f Fraud prevention: Tracking is limited to that necessary for preventing or investigating fraudulent behavior and security violations;
the data collected is minimized accordingly.
l Local constraints: Tracking is limited to what is required by local law, rule, or regulation and the data collected is minimized accordingly.
r Referrals: Tracking is limited to collecting referral information and the data collected is minimized accordingly.
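
As a rough illustration of how a client might read these values, here is a minimal Python sketch. It assumes, purely for illustration, that a status arrives as one status character followed by zero or more qualifier characters (e.g. "3fc"); the real wire syntax is whatever the Expression draft defines. The helper name parse_status is hypothetical.

```python
# Status values and qualifiers as listed in the current draft text above.
STATUS_VALUES = {
    "N": "None",
    "1": "First party",
    "3": "Third party",
    "X": "Dynamic",
    "C": "Consent",
    "U": "Updated",
}
QUALIFIERS = {
    "a": "Audit",
    "c": "Ad frequency capping",
    "f": "Fraud prevention",
    "l": "Local constraints",
    "r": "Referrals",
}


def parse_status(value):
    """Split a status string into (status name, [qualifier names]).

    Assumption: one status character followed by qualifier characters.
    """
    if not value or value[0] not in STATUS_VALUES:
        raise ValueError("unknown tracking status: %r" % value)
    unknown = [q for q in value[1:] if q not in QUALIFIERS]
    if unknown:
        raise ValueError("unknown qualifier(s): %s" % "".join(unknown))
    return STATUS_VALUES[value[0]], [QUALIFIERS[q] for q in value[1:]]
```

Rejecting unknown characters outright is a design choice of this sketch; a real client might prefer to ignore qualifiers it does not recognize.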

And here are my thoughts on the Permissions.

* Short-term collection and use.

This started as a 'raw data permission' and seems to have morphed from there somewhat. The original thinking was that processing collected data to condense into databases takes time.
==> if narrowly scoped and better worded, this doesn't need a matching status qualifier; everyone keeps raw logs for some short period, and a permission everyone needs differentiates nobody

* Contextual Content or Ad Delivery

This started from a conversation about using real-time data, present in the transaction, to serve an appropriate ad. Under that guise, I don't even see why we have anything to say about it - one is neither using accumulated data, nor accumulating more data. It should be an example of something out of scope. However, the current option has "retained and used" in it, which puzzles me: "information may be collected, retained and used for the display of contextual content or advertisements".
==> make it solely about 'use', and move to a section on out-of-scope activities

* Content or Ad Delivery Based on First Party Data

The title is slightly misleading; the delivery is based not on data from the current first party, but on data collected by the third party when it had first-party status. Since we don't place much limitation on either collection of data by first parties or their use of data (the major exception being that they shouldn't pass it to a third party who would not have been able to collect it themselves), I see this too as out of scope, and it should be noted in a section on 'activities out of the scope of this document'.
==> move to a section on out-of-scope activities

* Frequency Capping

I think we have general agreement to do frequency capping without building user profiles, and again, I see this as a candidate for the section on 'activities out of scope' -- provided it's done in such a way that you don't build a profile. (e.g. "this user has seen the Vax vacuum cleaner ad three times").
==> clarify it can't involve building user profiles, and then move to a section on out-of-scope activities

* Financial Logging and Auditing

seems settled;

* Security and Fraud Prevention
* Debugging

seem settled, and simply need a definition of 'graduated response'. I suggest "the amount of data collected varies over time according to need; a constant level of collection is not a graduated response unless it is the minimum needed for detection of issues".

This allows both 'scale up': collect a small amount of data, and collect more when concerns arise; and 'scale down': collect a lot of data after introducing a feature, and scale down as confidence in it rises; and 'sampling': collect random data snapshots to 'sample' the production; and so on.
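
To make the suggested definition concrete, here is a minimal Python sketch that tests a series of collection levels against it. The function name, the idea of representing collection as one number per period, and the minimum_needed parameter are all assumptions for illustration. The sketch only checks that collection varies; it cannot tell whether the variation actually follows need.

```python
def is_graduated_response(levels, minimum_needed):
    """Return True if a series of collection levels fits the definition.

    levels: amount of data collected in successive periods (any unit).
    minimum_needed: the least collection that still detects issues.
    """
    if not levels:
        return False  # nothing was observed, so nothing can be claimed
    if len(set(levels)) > 1:
        return True  # collection varies over time (scale up, down, sampling)
    # A constant level only qualifies when it is no more than the minimum.
    return levels[0] <= minimum_needed
```

Under this reading, a 'scale up' history such as [1, 1, 5, 1] qualifies, while a constant [5, 5, 5] against a minimum of 1 does not.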

* Compliance With Local Laws and Public Purposes

seems settled, though the language needs to make clear what the notes say

My perception is that we're not missing any permitted uses. Maybe I am wrong :-(

So, on to the qualifiers. If we
* don't bother to flag raw data collection (everyone would set it, which tells the client nothing)
* move contextual, collected as first party, and anonymous frequency capping to 'out of scope'

we then need the qualifiers (I changed the characters and aligned the names):
f Financial Logging and Auditing: Tracking is limited to that necessary for an external audit of the service context and the data
collected is minimized accordingly.
s Security and Fraud Prevention: Tracking is limited to that necessary for preventing or investigating fraudulent behavior and security violations;
the data collected is minimized accordingly.
d Debugging; data is collected solely for the purposes of ensuring service functionality
l Compliance With Local Laws and Public Purposes: Tracking is limited to what is required by local law, rule, or regulation and the
data collected is minimized accordingly.

I don't know what this one means or where it came from, but if we want it, it needs documenting in Compliance, I think:
r Referrals: Tracking is limited to collecting referral information and the data collected is minimized accordingly.

We also need, in my opinion, and as previously discussed:
! Construction in process; response headers, the well-known-resource, and compliance may be correct, or may not be; compliance is not (yet)
claimed for this site
n Not Listening; for some reason, this site is not listening to some or all users' expressed preferences. The explanation can be found by
loading the page referenced by the notlistening part of the well-known resource, which must exist.
The compliance of responses with this qualifier is unknown.
p Service Provider; may be used by a site to indicate that its status is the result of a service provision relationship with another party.

Then we add notlistening to the well-known-resource.
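
The dependency between the 'n' qualifier and the well-known resource can be sketched as a check. This is a minimal Python sketch under stated assumptions: the well-known resource is a JSON object, the new member is literally named "notlistening", and the "tracking" member and the example path used in the usage note are illustrative only. The helper name check_not_listening is hypothetical. The 'r' qualifier is left out of the table because its status is an open question.

```python
import json

# Qualifiers proposed in this message.
PROPOSED_QUALIFIERS = {
    "f": "Financial Logging and Auditing",
    "s": "Security and Fraud Prevention",
    "d": "Debugging",
    "l": "Compliance With Local Laws and Public Purposes",
    "!": "Construction in process",
    "n": "Not Listening",
    "p": "Service Provider",
}


def check_not_listening(qualifiers, resource_text):
    """Return True if an 'n' qualifier is backed by a notlistening link.

    qualifiers: string of qualifier characters, e.g. "ns".
    resource_text: JSON text of the well-known resource (assumed format).
    """
    unknown = set(qualifiers) - set(PROPOSED_QUALIFIERS)
    if unknown:
        raise ValueError("unknown qualifier(s): %s" % "".join(sorted(unknown)))
    if "n" not in qualifiers:
        return True  # nothing to check
    resource = json.loads(resource_text)
    # The explanation page "must exist", so require a non-empty link.
    return bool(resource.get("notlistening"))
```

For example, check_not_listening("n", '{"tracking": "3", "notlistening": "/why.html"}') passes, while the same call without the notlistening member fails. The sketch does not fetch the page, so it cannot confirm that the explanation actually exists.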

Service Provider

There are already some pointers for what service providers can do to avoid being mistaken for sites claiming a false status (e.g. 1st party or consent, when the user thinks they have neither).
1. If the hostname of the service provision site is unique to the party to whom service is provided, then simply put that name in the same-party array.
(Example: lytics.com operates lytics.example.com solely for example.com)
2. When the service-provision qualifier is set, then the 'policy' link of the well-known-resource must refer to the site to which service is provided (that
site may further re-direct, after that, of course). This identifies the site to whom service is provided and closes the loop. This is already in the spec:

"If the tracking status value is 1 and the designated resource is being operated by an outsourced service provider on behalf of a first party, the origin server must identify the responsible first party via the domain of the policy URI, if present, or by the domain owner of the origin server."

Though it needs to say "is 1 or C" as service may be provided to 3rd parties receiving consent, as Rigo pointed out.
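
The two pointers can be sketched as a client-side check. This is a minimal Python sketch under stated assumptions: each well-known resource is already parsed into a dict with a "same-party" list of hostnames and a "policy" URL, pointer 1 is read as the served site listing the provider's hostname, and the helper name identifies_served_site is hypothetical. Relative policy links are not resolved and redirects are not followed.

```python
from urllib.parse import urlparse


def identifies_served_site(provider_host, served_host,
                           served_resource, provider_resource):
    """Return True if the resources tie provider_host to served_host.

    served_resource: parsed well-known resource of the site being served.
    provider_resource: parsed well-known resource of the service provider.
    """
    # Pointer 1: the served site lists the provider's hostname as same-party.
    if provider_host in served_resource.get("same-party", []):
        return True
    # Pointer 2: the provider's policy link points at the served site's host.
    policy = provider_resource.get("policy", "")
    return urlparse(policy).hostname == served_host
```

With the example above, lytics.example.com appearing in example.com's same-party array is enough; otherwise the provider's policy link has to carry example.com as its host.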

I think that sums it up, and aligns everything, and leaves me to wish everyone happy reading and holidays...

David Singer
Multimedia and Software Standards, Apple Inc.

Received on Thursday, 20 December 2012 00:50:31 UTC