RE: issue-199

Rigo,

I believe you've grossly misinterpreted the industry proposal on this point.  Yellow data is not "off the hook" - it can only be used for analytical purposes.  No one has felt that this analysis would allow for behavioral fingerprinting of specific users, so I'm happy to take that off the table - please provide proposed language.  The goal is to make yellow data (de-identified but event-linkable) usable only for "aggregate analysis".  To be clear, this could result in analytics that say that, for a specific web page, users generally click on ads placed in position B over position A, so begin showing all ads in position B.  The results may be applied generally, but not specifically to a single individual.

I hope this clears up the situation.

- Shane

-----Original Message-----
From: Rigo Wenning [mailto:rigo@w3.org] 
Sent: Wednesday, July 10, 2013 11:25 AM
To: Shane Wiley
Cc: public-tracking@w3.org; Mike O'Neill; 'achapell'; npdoty@w3.org; tlr@w3.org; jeff@democraticmedia.org
Subject: Re: issue-199

Shane, 

I think you don't address my point. 

I do not question that your suggestion is a useful privacy enhancement. 
I applaud the effort. I'm glad Yahoo sees privacy as a competitive advantage in the market. Yes, we should grant greater permissions to those making such privacy enhancements and allow for a well-functioning system. 

My point is: do you have to call that "de-identification"? What if I suggested calling it "pseudo-tracking"? You would be shocked. :)

The thing you do is right. The label is wrong, utterly wrong. Because it suggests that yellow is actually green. Name it "enhanced pseudonymization" and get enhanced permissions in a permitted use. That would be fine IMHO.

But you can't say: Yellow is off the hook, a plain permitted use. 
Because you still single out in the analytics. If you don't want to single out, create bigger buckets. But you don't propose creating those bigger buckets. And that's not off the hook, not green, and still somewhat "identified", not "de-identified". The target of the analysis is still a single person. The name/address/birthday triple is becoming almost irrelevant for identity. 

BTW, if you can single out in the profile, you can get back to the user by watching traffic for the identified behavioral pattern (fingerprint). 
But you don't want to link back to the user, you want to predict your next user. And the user doesn't want to be predictable, thus DNT:1. If I want you to predict me, I'll give you a DNT:0.

 --Rigo

On Wednesday 10 July 2013 07:47:28 Shane Wiley wrote:
> The element that is missing in your analysis is the "operational 
> nature" of the resulting identifier.  Pseudonyms can still be used in 
> a production setting - meaning that I can alter a user's experience in 
> real-time leveraging historical activity associated with the 
> pseudonym.  In our proposal, the result of de-identification does NOT 
> allow for this, and the data's only utility is for analytical 
> purposes.  To repeat, data in the Yellow Zone CANNOT be used to link 
> back to the real user in any way.
> 
> I believe this is the significant disconnect in attempting to leverage 
> pseudonyms in the context of the DNT standard as they would clearly 
> live in the Red zone - NOT the Yellow zone.

Received on Wednesday, 10 July 2013 10:48:12 UTC