- From: Shane Wiley <wileys@yahoo-inc.com>
- Date: Sat, 11 May 2013 21:02:04 +0000
- To: Walter van Holst <walter.van.holst@xs4all.nl>
- CC: Kevin Kiley <kevin.kiley@3pmobile.com>, "public-tracking@w3.org" <public-tracking@w3.org>, Brad Kulick <kulick@yahoo-inc.com>
Walter,

Much of this was discussed in the room:

Phase 1: Raw Data
Permitted Uses: Security/Fraud, Frequency Capping, Debugging, (some) Financial/Audit
<Areas where an operational ID is needed. The goal here is to define shorter retention timeframes where possible>

Phase 2: De-Identified
Permitted Uses: Financial/Audit, Product Improvement, Market Research
<One-way secret hash, operational/administrative controls. Data is not able to be used for production purposes - unable to alter a specific user's online experience.>

Phase 3: Unlinked Data: Any use
<Re-one-way secret, further data minimization and/or aggregation, key is destroyed>

- Shane

-----Original Message-----
From: Walter van Holst [mailto:walter.van.holst@xs4all.nl]
Sent: Saturday, May 11, 2013 1:56 PM
To: Shane Wiley
Cc: Kevin Kiley; public-tracking@w3.org; Brad Kulick
Subject: RE: Proposal from Big Basin break out

On 2013-05-11 22:49, Shane Wiley wrote:
> Kevin,
>
> While the tri-state de-identification scheme does not dictate
> specific IP Address replacement guiderails, I believe the "reasonable"
> tenet is the one to focus on here. For example, if IP Address is
> replaced with Postal Code (5 digit, not 9 digit) then I believe most
> record sets would continue to be deemed de-identified. But let's say
> another team is looking only at a hyper-location data subset and the
> record set contains only the de-identified ID (separate key from other
> systems) and the lat/long for that ID. With only these data points, a
> team can look at the frequency of events and geospatial clusters
> over time, but would not have the means to reverse identify the data
> set as no side facts/data exist. It's this type of balance that is
> difficult to prescriptively outline upfront and why standards focus on
> principles and allow innovation to occur within those boundaries.

Dear Shane,

Before we go deeply into the details, I personally believe that the hashings both at the beginning and at the end of the de-identification process are much more important than any postal codes (even the four-digit two-character ones of my country of origin). What kind of hashes would be part of the proposal?

Moreover, I feel that the proposed scheme is lacking in any prescriptive power for the permitted uses. For the permitted uses I would feel much more comfortable with some guidance on both pseudonymisation and de-identification. The latter is easily achieved if we get to a consensus on de-identification in general.

Regards,

Walter
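
To make the hashing question concrete, the three phases described in this thread could be sketched roughly as follows. This is only an illustration under stated assumptions: the thread does not say which hash is meant, so HMAC-SHA-256 is assumed here as one possible "one-way secret hash", and the function names (`keyed_hash`, `de_identify`, `unlink`) and record fields (`user_id`, `event`) are hypothetical. The operational and administrative controls and the further minimization or aggregation mentioned for Phases 2 and 3 are not modeled.

```python
import hashlib
import hmac
import secrets


def keyed_hash(secret_key: bytes, identifier: str) -> str:
    """One-way keyed hash of an identifier (HMAC-SHA-256 is an assumption)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def de_identify(raw_records, phase2_key: bytes):
    """Phase 1 -> Phase 2: replace the operational ID with a keyed hash.

    The key is held separately from production systems, so the hashed ID
    cannot be mapped back to a user without it.
    """
    return [
        {"id": keyed_hash(phase2_key, r["user_id"]), "event": r["event"]}
        for r in raw_records
    ]


def unlink(phase2_records):
    """Phase 2 -> Phase 3: re-hash under a fresh key that is never stored."""
    throwaway_key = secrets.token_bytes(32)
    unlinked = [
        {"id": keyed_hash(throwaway_key, r["id"]), "event": r["event"]}
        for r in phase2_records
    ]
    # This only drops the local reference; a real deployment would need
    # stronger guarantees that the key is unrecoverable.
    del throwaway_key
    return unlinked
```

Within one batch, records for the same user still share an ID after each step, so frequency analysis remains possible. Because the Phase 3 key is discarded, two separately unlinked batches cannot be joined on ID, and the Phase 3 IDs cannot be tied back to the Phase 2 IDs.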
Received on Saturday, 11 May 2013 21:02:44 UTC