Poster child for well-defined data sharing practices

From: M. Scott Marshall <mscottmarshall@gmail.com>
Date: Mon, 10 Oct 2011 15:55:09 +0200
Message-ID: <CACHzV2Nt7Ua6yP5GyPMmayUUED-xY4vYCnPRxi1486kBQ-Nw9w@mail.gmail.com>
To: HCLS <public-semweb-lifesci@w3.org>
Many have probably already heard about this debacle in cancer research,
where scientists proceeded too quickly from publication to three clinical
trials. It's a poster child for data sharing. Of course, there are several
social and political factors involved, but transparency and clear markup of the
data could have prevented at least some of the following problems. With
accessible data behind a SPARQL endpoint, colleagues would have been able to
share their data without introducing errors and misinterpretations. External
reviewers would have been able to more easily verify claims.

"For example, they saw that in one of their papers Dr Potti and his
colleagues had mislabelled the cell lines they used to derive their
chemotherapy prediction model, describing those that were sensitive as
resistant, and *vice versa*."

"Another alleged error the researchers at the Anderson centre discovered was
a mismatch in a table that compared genes to gene-expression data. The list
of genes was shifted with respect to the expression data, so that the one
did not correspond with the other."

"The review committee, however, had access only to material supplied by the
researchers themselves, and was not presented with either the NCI's exact
concerns or the problems discovered by the team at the Anderson centre."

"He noted that in addition to a lack of unfettered access to the computer
code and consistent raw data on which the work was based, journals that had
readily published Dr Potti's papers were reluctant to publish his letters
critical of the work."

Here's the article:
http://www.economist.com/node/21528593

Of course, a system for attribution (e.g. of data contributions) and clear terms
of data use attached to the data would also help avoid a scenario like the one above.

Cheers,
Scott

P.S. There was a NY Times article about the same problem a few months ago
that went too far in its conclusions, essentially calling personalized
medicine into question (!) on the basis of the above events.

-- 
M. Scott Marshall
http://staff.science.uva.nl/~marshall
Received on Monday, 10 October 2011 13:55:37 GMT
