Re: TPWG agenda for Wednesday, January 16; background reading on de-identification

From: Joseph Lorenzo Hall <joe@cdt.org>
Date: Fri, 11 Jan 2013 12:26:30 -0500
Message-ID: <50F04B46.5040707@cdt.org>
To: Peter Swire <peter@peterswire.net>
CC: Deven McGraw <deven@cdt.org>, "public-tracking@w3.org" <public-tracking@w3.org>

Hi Peter!

I read the UK ICO paper recently -- which in no way means I'm prepared 
to brief the group! The ICO paper is pretty broad and focuses on 
publishing anonymized data.

One of the big differences I found, as a techie, was in the appendices 
of the ICO report; there they discuss more de-ID strategies, and more 
technical ones, than the HHS document, which focuses on suppression 
(redaction), generalization (abbreviation, aggregation) and 
perturbation.  For example, the ICO document talks about resampling, 
swapping, etc.
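
For anyone on the list less familiar with these terms, here is a minimal 
Python sketch of four of the techniques named above: suppression, 
generalization, perturbation and swapping. The function names, record 
fields and parameters are hypothetical illustrations and are not taken 
from the ICO or HHS documents; real de-identification needs a risk 
assessment well beyond this.

```python
import random


def suppress(record, fields):
    """Suppression (redaction): drop the named fields entirely."""
    return {k: v for k, v in record.items() if k not in fields}


def generalize_age(age, width=10):
    """Generalization: replace an exact age with a range such as '30-39'."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def perturb(value, scale, rng):
    """Perturbation: add uniform random noise in [-scale, scale]."""
    return value + rng.uniform(-scale, scale)


def swap_column(records, field, rng):
    """Swapping: shuffle one field's values across records.

    The set of values in the column is preserved, but the link between
    a value and the rest of its original record is broken.
    """
    values = [r[field] for r in records]
    rng.shuffle(values)
    return [{**r, field: v} for r, v in zip(records, values)]


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the sketch is repeatable
    people = [
        {"name": "A", "age": 34, "zip": "10001", "visits": 12},
        {"name": "B", "age": 58, "zip": "20002", "visits": 3},
        {"name": "C", "age": 41, "zip": "30003", "visits": 7},
    ]
    cleaned = [suppress(p, {"name"}) for p in people]
    cleaned = [{**p, "age": generalize_age(p["age"])} for p in cleaned]
    cleaned = [{**p, "visits": perturb(p["visits"], 2.0, rng)} for p in cleaned]
    cleaned = swap_column(cleaned, "zip", rng)
    for row in cleaned:
        print(row)
```

Resampling, which the ICO appendices also cover, is not shown here.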

I also found its broader treatment of the ethics of publishing data 
at all, as well as its suggestion to do "pen-testing"-like 
re-identification exercises, to be interesting (there's a flow chart on 
p. 37 that incorporates the guidance from the document rather well).

best, Joe

On Fri Jan 11 11:48:46 2013, Peter Swire wrote:
> Hello DNT folks:
>
>
>
> In response to a question, yes there will be the usual Working Group
> call on Wednesday, January 16.
>
>
>
> The call will include a presentation on the de-identification
> guidelines issued by the U.S. Department of Health and Human Services
> in November, 2012.  Deven McGraw of CDT was deeply involved in that
> process, and has agreed to present on that subject.
>
>
>
> Another major 2012 document on de-identification was a report of the UK
> Information Commissioner's Office, with guidelines for anonymisation
> under UK and EU law.  Is there someone in the group, or known to the
> group, who has materials prepared on these guidelines and would be
> able to brief the group on them?  If someone is able to do that for
> this Wednesday, we could do roughly half the call on each one.
>
>
>
> Discussion below on why these documents provide good background for
> our discussion of delinking/de-identification.
>
>
>
> Best,
>
>
>
> Peter
>
> ======
>
>
> Background reading on de-identification:
>
>
>             (1) United Kingdom, Information Commissioner’s Office,
> “Anonymisation: Managing Data Protection Risk Code of Practice.”
> (2012).  This is the first code of practice on anonymisation published
> by an EU data protection authority.
>
>
>
> http://www.ico.gov.uk/for_organisations/data_protection/topic_guides/~/media/documents/library/Data_Protection/Practical_application/anonymisation_code.ashx
>
>
>
>             (2) U.S. Department of Health and Human Services,
> “Guidance Regarding Methods of De-Identification of Protected Health
> Information in Accordance with the HIPAA Privacy Rule.” (2012).
>
>
>
> http://www.hhs.gov/ocr/privacy/hipaa/understanding/coveredentities/De-identification/hhs_deid_guidance.pdf
>
>
>
>
>
>             Here is an explanation for why I have selected these two
> documents to assist in our examination of de-identification issues.
> Both of them are written by established government agencies that have
> years of experience with de-identification.  Both agencies sought and
> received public comments in the preparation of the reports, from a
> range of stakeholders.
>
>
>
>             Selection of these documents is not intended to endorse
> the reports or claim that their recommendations should be applied
> directly to Do Not Track.  For the HHS report, one might assert that
> it is stricter than should apply to DNT, because medical data is
> usually considered more sensitive than advertising data.  On the other
> hand, perhaps the HHS report is less strict than appropriate for DNT,
> because entities covered by the HIPAA rules have comprehensive privacy
> obligations that do not apply to other U.S. firms.  Similarly, for the
> ICO report, one might argue that it is stricter than appropriate for
> DNT, because many entities covered by DNT are not subject to the
> comprehensive legal regime of the EU Data Protection Directive.  By
> contrast, one might argue that the ICO report is not as strict as
> appropriate. I have been told, for instance, that the Dutch approach
> is stricter than the ICO report, although I have not seen any document
> that explains the Dutch approach.  If someone in the Working Group is
> aware of such a document, that could be helpful.
>
>
>
>             Here are two other governmental reports that provide
> additional background for those who wish to dig deeper:
>
>
>
>             1.  Health System Use Technical Advisory Committee, “Best
> Practice Guidelines for Managing the Disclosure of De-Identified
> Health Information.”  2010.  This document was drafted by a
> multi-stakeholder group led by Canadian federal/provincial/territorial
> ministries of health.
>
>
>
> http://www.ehealthinformation.ca/documents/de-idguidelines.pdf
>
>
>
>             2.  Federal Committee on Statistical Methodology,
> “Statistical Policy Working Paper 22, Report on Statistical Disclosure
> Limitation Methodology.”  2005.  The U.S. government for decades has
> released statistical information while seeking to prevent
> re-identification, such as for Census results.  This paper is the
> current inter-agency policy document for how to manage the risks of
> re-identification.
>
>
>
> http://www.fcsm.gov/working-papers/SPWP22_rev.pdf
>
>
>
>             I welcome others on the WG to suggest background reading
> on delinking/de-identification, as we lead up to face-to-face
> discussion on the topic in Boston in February.
>
>
>
>             Peter
>
>
>
>
>
>
> Professor Peter P. Swire
> C. William O'Neill Professor of Law
>     Ohio State University
> 240.994.4142
> www.peterswire.net

--
Joseph Lorenzo Hall
Senior Staff Technologist
Center for Democracy & Technology
1634 I ST NW STE 1100
Washington DC 20006-4011
(p) 202-407-8825
(f) 202-637-0968
joe@cdt.org
PGP: https://josephhall.org/gpg-key
Received on Friday, 11 January 2013 17:26:59 UTC
