Re: ACTION 105: Privacy Rulesets

On Apr 15, 2010, at 5:31 PM, Alissa Cooper wrote:

> Frederick,
>
> Some answers inline...
>
> On Apr 14, 2010, at 9:44 AM, Frederick Hirsch wrote:

<snip>

>> 3. Retention - is the baseline of 35 days an industry best  
>> practice, a regulation or law, or an arbitrary time for this  
>> document? Likewise for 90 days for short.
>>
>
> The 35 days is mostly arbitrary. John will follow up with an  
> explanation of that one.

Well, it is arbitrary in that we made it up -- it does not flow from a  
particular practice, law, etc., but it does have a theory behind it.   
As privacy advocates, we would prefer to have the option, and even the  
default, to have retention = zero days.  Ideally, info should be  
discarded right after the primary purpose is completed.

But the sysadmins say "this stuff shows up in our server logs and we  
have always kept those forever" -- to which we say "dump them -- you  
shouldn't keep them forever."  But then the engineers say "we need the  
logs to troubleshoot problems."  That is not unreasonable -- if  
retention defaults to zero, it might be hard to diagnose some problems.

So how best to accommodate the privacy advocates ("zero days"),  
sysadmins ("forever"), and engineers ("in case we need to  
troubleshoot")?  My own answer is "30 days" -- it is reasonably  
privacy protective, it provides ample time to troubleshoot in the vast  
majority of cases, and the sysadmins only have to dump the logs once a  
month (and there was never a really good reason to keep the logs  
forever anyway).

But what about 31-day-long months, and those pesky three-day-long  
weekends?  So if we say 35 days, then you can always dump your logs on  
the first business day of the month (or some similar schedule) and you  
will be fine.
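
To make that arithmetic concrete, here is a rough sketch in Python. It is only an illustration, not part of the ruleset proposal. It assumes a hypothetical schedule in which each purge runs on the first business day of the month and discards the whole previous calendar month of logs; the function names and the holiday set are made up for the example.

```python
from datetime import date, timedelta


def first_business_day(year, month, holidays=frozenset()):
    """Return the first day of the month that is not a weekend or a listed holiday."""
    day = date(year, month, 1)
    # weekday() is 5 for Saturday and 6 for Sunday
    while day.weekday() >= 5 or day in holidays:
        day += timedelta(days=1)
    return day


def oldest_log_age_at_purge(year, month, holidays=frozenset()):
    """Age in days of the oldest log entry when the purge in (year, month) runs.

    Assumption: the oldest surviving entry dates from the 1st of the
    previous calendar month.
    """
    purge_day = first_business_day(year, month, holidays)
    last_of_previous_month = date(year, month, 1) - timedelta(days=1)
    oldest_entry = last_of_previous_month.replace(day=1)
    return (purge_day - oldest_entry).days


# Worst case over a span of years, counting weekends only (no holidays).
worst_case = max(
    oldest_log_age_at_purge(year, month)
    for year in range(2000, 2031)
    for month in range(1, 13)
)
print(worst_case <= 35)
```

Under those assumptions, a 31-day month followed by a three-day weekend puts the oldest entry at 34 days old when the purge runs, which is inside the 35-day baseline but would miss a 30-day one.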

So we think that a 35-day baseline for retention is something many,  
many implementors should be able to live with just fine, unless they  
have a really good reason -- that the user wants -- to keep the data  
longer.

John

> The short attribute isn't meant to be constrained to 90 days -- 90  
> days was just what I picked to illustrate the example. So no=35  
> days, short=longer than 35 but limited somehow, and long=unlimited.  
> The short time frame could be anything, as long as the time frame  
> exists.
>
> Retention policies vary significantly across industries/ 
> applications, where they exist at all. For example, the major search  
> engines retain individualized search logs for periods ranging from 3  
> months to 18 months. Some ad networks delete data after 18 months  
> and some retain it forever. Some ISPs retain usage logs for a year  
> or two years. There is ongoing legal wrangling around all of these  
> time limits.
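
One way to read the three retention values described above, sketched in Python. This is an illustration under stated assumptions, not the ruleset's actual syntax: the `Retention` enum, the `classify` function, and the convention of using `None` for "no limit" are all hypothetical.

```python
from enum import Enum
from typing import Optional

BASELINE_DAYS = 35  # the baseline retention period discussed in this thread


class Retention(Enum):
    NO = "no"        # kept no longer than the baseline
    SHORT = "short"  # kept longer than the baseline, but for a limited time
    LONG = "long"    # kept with no time limit


def classify(limit_days: Optional[int]) -> Retention:
    """Map a retention limit in days (None meaning unlimited) to a retention value."""
    if limit_days is None:
        return Retention.LONG
    if limit_days <= BASELINE_DAYS:
        return Retention.NO
    return Retention.SHORT


print(classify(90).value)
```

The 90-day figure is only the illustrative value from the example; any finite limit above 35 days would classify the same way.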

<snip>

Received on Friday, 16 April 2010 02:13:33 UTC