Re: June Change Proposal, raw data, contextual ads

On Jul 2, 2013, at 10:13 , David Wainberg <dwainberg@appnexus.com> wrote:

> 
> On 7/1/13 6:58 PM, David Singer wrote:
>> On Jul 1, 2013, at 14:56 , David Wainberg <david@appnexus.com> wrote:
>> 
>>> Hi David-
>>> 
>>> Is this also related to ISSUE-142?
>> Yes, though that's marked closed.
> I don't know why or when. I think I'm not alone in feeling I've lost track of the issue management. I don't recall this issue being particularly controversial, but others can speak up if I'm wrong about that.

I get lost a bit too…but if there is something to learn from this issue, I don't really care what its state is.

>> 
>>> I thought there'd been some agreement previously on a grace period for data retained only for a short period, and that such grace period would include more than merely processing it into an unregulated state.
>> What other uses do you envisage?  I thought the rationale for the raw data exception was (a) that no-one can process the data in real time, there needs to be some holding period and (b) some people hold the raw data for a while 'just in case' (e.g. of a debugging need) but can process into what they need at that time and discard the rest.
>> 
>> In neither case do I see a need to allow 'other use' than the ability to process, but if there is a (closed, non-leaky) one, it would be good to hear.
> Does it depend on the problem we're trying to solve? If the primary concern is, as some have said, to limit the accumulation of histories of users' online activity, then deleting or de-identifying the data within a short timeframe would satisfy that, right?

Yes, but not if the raw data can be 'used' or 'shared' before it's been processed.  That would be a 'leak'.  At the moment, I see the pipeline being:

1) A transaction happens.  Use what you need that's present in that transaction, to satisfy it (reply, serve an ad, whatever).
1b) there might be some debate (I can't work out if this is what Jonathan is asking) over whether you can collect *extra* information about a DNT:1 user.  I think we may be able to be silent on this.

2) You keep raw log data from the transaction, for later processing.  For permitted uses (this included), you get these three rules
  i) only what you need:  for raw data, this doesn't tell us much
  ii) for as long as you need it:  yes, until you can process it.  once you show you can process it, you have to delete the raw data
  iii) not used for any other purpose.

   2a) As a result of (iii) nothing is done with this data until processing happens (no sharing, no use, no inspection).  We allow the retention of raw data solely because it's impossible to do real-time processing.

3) At the time of processing, you sift the data into four buckets:
  a) data that is out of our scope; it's not 'tracking' data as defined.  e.g. you remember that you had one more view of a given ad -- nothing controversial or tracking about remembering that; or the data is de-identified, and it's not possible to track back to the actual user any more, and so on.
  b) data that is in scope, is tracking data, but that you retain under a claimed permitted use (with the restrictions that go with that permitted use and permitted uses in general)
  c) data that is in scope, is tracking data, but that you retain because you have consent
  d) the trash bucket, data that you throw away.

4) For permitted use data, you only retain what you need for as long as you need it, and it's then thrown away also.

So, what remains indefinitely is either not tracking data, or data for which you have consent.
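
As a rough illustration of steps 2 through 4, here is a minimal Python sketch.  Nothing in it comes from the spec text: the Record fields, the function names and the retain_days limit are all hypothetical, and checking consent before a permitted-use claim is an assumption made only so that each record lands in exactly one bucket.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List, Optional


class Bucket(Enum):
    OUT_OF_SCOPE = "a"    # not tracking data as defined, or de-identified
    PERMITTED_USE = "b"   # tracking data kept under a claimed permitted use
    CONSENT = "c"         # tracking data kept because there is consent
    TRASH = "d"           # everything else is thrown away


@dataclass
class Record:
    # Hypothetical fields standing in for whatever a server actually logs.
    is_tracking_data: bool
    has_consent: bool = False
    permitted_use: Optional[str] = None   # e.g. "debugging"; None if no claim
    retain_days: int = 0                  # assumed retention limit for bucket (b)


def sift(record: Record) -> Bucket:
    """Step 3: decide which of the four buckets one raw record belongs to."""
    if not record.is_tracking_data:
        return Bucket.OUT_OF_SCOPE
    if record.has_consent:                # assumption: consent is checked first
        return Bucket.CONSENT
    if record.permitted_use is not None:
        return Bucket.PERMITTED_USE
    return Bucket.TRASH


def process_raw_log(raw_log: List[Record]) -> Dict[Bucket, List[Record]]:
    """Steps 2-3: the raw log is touched only here, then deleted (rule ii)."""
    kept: Dict[Bucket, List[Record]] = {
        b: [] for b in Bucket if b is not Bucket.TRASH
    }
    for record in raw_log:
        bucket = sift(record)
        if bucket is not Bucket.TRASH:
            kept[bucket].append(record)
    raw_log.clear()                       # raw data does not outlive processing
    return kept


def expire_permitted_use(kept: Dict[Bucket, List[Record]], age_days: int) -> None:
    """Step 4: permitted-use data is dropped once its retention limit passes."""
    kept[Bucket.PERMITTED_USE] = [
        r for r in kept[Bucket.PERMITTED_USE] if r.retain_days > age_days
    ]
```

In this sketch the only operation ever applied to the raw log is process_raw_log, which mirrors 2a: no sharing, no use, no inspection before processing.  After expire_permitted_use has run, whatever is left is in bucket (a) or bucket (c).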

The question I had was that you seemed to be proposing some relaxation of 2a, allowing some use/sharing/etc., and I couldn't work out what.

>> 
>>> But why do we need to say that raw data can be processed?
>> I think I am saying that that is the 'purpose' of the raw data for the 'permission' -- the only 'purpose' one has in it, is to process it.
>> 
>>> Isn't permission to process data into an unregulated state inherent in the spec?
>> Not that I see, and I think it's worth making explicit.
>> 
>>> I'm not seeing the need for this explicit statement about "raw data". In any case, don't we mean to say something like "under DNT:1 tracking information may be used but must be deidentified/delinked within N days" as in ISSUE-142?
>> Do you happen to know what text resulted from 142?  It might help me to review it.
> Is there a way to access prior drafts of doc?
> 

Yes, but you need to know their names.  There are snapshots with the following names (and dates) in the directory:

Nov 13  2011 tracking-compliance-20111114.html
Mar 13  2012 tracking-compliance-20120313.html
Aug  6  2012 tracking-compliance-20120523.html
Oct  2  2012 tracking-compliance-20121002.html
Mar  6 15:02 tracking-compliance-20121030.html

and of course
tracking-compliance-june.html
tracking-compliance.html

Hope this helps.


David Singer
Multimedia and Software Standards, Apple Inc.

Received on Tuesday, 2 July 2013 17:29:20 UTC