
Re: how SHRINE appeased IRBs (HIPAA rules)

From: David Booth <david@dbooth.org>
Date: Mon, 06 Feb 2012 10:56:27 -0500
To: Eric Prud'hommeaux <eric@w3.org>
Cc: public-semweb-lifesci@w3.org
Message-ID: <1328543787.2250.61290.camel@dbooth-laptop>

This is an excellent illustration of how damaging HIPAA is to research
efforts, and how important it will be to develop *standard* ways to
conform to HIPAA requirements and make research data more usable.  

For example, instead of every hospital or research project creating its
own custom consent form (for authorizing the collection of personal
health data), we need to *standardize* these consent forms in exactly
the way that Creative Commons began standardizing licenses, so that data
can be used more broadly and tools can automatically determine what data
can be used for what purposes.  John Wilbanks is working on this with
the Consent to Research project:


On Mon, 2012-02-06 at 10:10 -0500, Eric Prud'hommeaux wrote:
> It's challenging to get IRB approval to poke at patient data; here's how I2B2's SHRINE project managed to get IRB approval for their early demo:
> [[
>     1 There would be no central database. Each hospital would own and manage its data locally and have a local principal investigator responsible for the database.
>     2 The prototype would only be available for a limited time, after which all data would be destroyed.
>     3 The local databases at each hospital would include only old data from 2006. After a one-time load, the data would not be refreshed.
>     4 All patients whose data would be used in the prototype received a HIPAA privacy notice that allows their personal health information to be used for research that has been reviewed and approved by an IRB.
>     5 The prototype would only allow queries that return aggregate counts of clinical data, such as the total number of patients with diabetes at each health center. No identified data or data collected as part of a research study would be included in this demo.
>     6 The prototype would obfuscate the aggregate counts by adding a small random number. Thus, the user would see an approximate count of the number of matching patients, not the exact count. To make it more difficult for the user to guess the actual number, the prototype would “lock” the user's account if the same query was run multiple times in the same day.
>     7 If a hospital returned less than ten patients in a query, then “less than 10” would be presented rather than the actual count.
>     8 An audit of all queries would be logged.
>     9 In addition to an overall principal investigator for the SHRINE prototype, each hospital would have a local PI who would be responsible for his or her hospital's patient data.
> and
>     1 Individual hospitals could remove their databases from the prototype at any time.
>     2 Hospitals would not be identified by name in the demo. Instead, the labels “hospital 1”, “hospital 2”, “hospital 3” would be used.
>     3 For each query, the aggregate counts would be displayed in a random order so that “hospital 1”, for example, would refer to a different institution each time.
>     4 The aggregate counts would be multiplied by a scale-factor that is inversely proportional to the number of patients at the hospital. Otherwise, PHS, which includes both BWH and MGH would return aggregate counts that were roughly twice as big on average as the other two hospitals.
>     5 The counts from the three health centers would be displayed simultaneously instead of one at a time in the order in which they are returned by the hospitals. Otherwise, the speed of a local hospital's database, which is dependent on many factors such as the amount of data and types of servers, could be used to identify the health center from which an aggregate count came.
> ]] — <http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2744712/?tool=pubmed>
> pretty much all of that could apply to our federation demos. (or we could write SPARQL wrappers around SHRINE query endpoints.)
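
For anyone wanting to try these safeguards in a federation demo, here is a
minimal Python sketch of items 6 and 7 of the first list (noisy counts,
small-count masking, lockout on repeated queries) and items 2 and 3 of the
second list (anonymous labels in random order). It is not SHRINE's code. The
noise range, the repeat limit, whether masking is applied before or after the
noise, and every function and class name are assumptions made for
illustration; the quoted text only says "a small random number" and
"multiple times".

```python
import random
from collections import Counter

SMALL_COUNT_THRESHOLD = 10  # item 7: counts below ten are shown as "less than 10"
MAX_NOISE = 3               # assumption: the source does not give the noise range


def obfuscate_count(true_count, rng=random):
    """Return the text a user would see for one hospital's aggregate count."""
    noisy = true_count + rng.randint(-MAX_NOISE, MAX_NOISE)
    # Assumption: mask if either the true or the noisy count is below the threshold.
    if true_count < SMALL_COUNT_THRESHOLD or noisy < SMALL_COUNT_THRESHOLD:
        return "less than 10"
    return str(noisy)


def anonymized_display(counts_by_hospital, rng=random):
    """Replace hospital names with 'hospital N' labels in a fresh random order."""
    shown = [obfuscate_count(count, rng) for count in counts_by_hospital.values()]
    rng.shuffle(shown)  # a new order for every query
    return {f"hospital {i}": value for i, value in enumerate(shown, start=1)}


class RepeatQueryGuard:
    """Lock a user who runs the same query too many times in one day."""

    def __init__(self, max_runs_per_day=3):  # assumption: limit is not in the source
        self.max_runs_per_day = max_runs_per_day
        self.runs = Counter()
        self.locked = set()

    def allow(self, user, query, day):
        """Record one run; return False if the user is (or just became) locked."""
        if user in self.locked:
            return False
        self.runs[(user, query, day)] += 1
        if self.runs[(user, query, day)] > self.max_runs_per_day:
            self.locked.add(user)
            return False
        return True
```

A caller would check `guard.allow(user, query, day)` first and only then
return `anonymized_display(...)` for the per-hospital counts. The audit log
and the scale factor from the quoted lists are left out of this sketch.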

David Booth, Ph.D.

Opinions expressed herein are those of the author and do not necessarily
reflect those of his employer.
Received on Monday, 6 February 2012 16:03:34 UTC
