
Re: From strings to things: ClinicalTrials.gov

From: Alan Ruttenberg <alanruttenberg@gmail.com>
Date: Sun, 17 Feb 2013 13:33:15 -0500
Message-ID: <CAFKQJ8=S+cS-nVZEi=sk__0HdWZ=dAziX0gWEG2YB9j0nSQeBQ@mail.gmail.com>
To: Kerstin Forsberg <kerstin.l.forsberg@gmail.com>
Cc: public-semweb-lifesci@w3.org, em@zepheira.com, cdsouthan@hotmail.com, brendan.kelleher@karmadata.com, Adam.Jacobs@dianthus.co.uk
Have you looked at https://www.clinicaltrialsregister.eu as well? We
did a little work in our group regarding eligibility criteria, and had
the impression that the curation was often substantially more detailed
at the EU site.

In our limited review, however, both sites suffered from an inability
to share the original protocols, even redacted; from incompleteness of
the recorded criteria as compared to the protocol documents; and from
errors introduced by OCR, such as 10^9 being translated to 109.

The licensing and sharing for these resources are also somewhat
borked. It's very unclear what can be republished from
clinicaltrials.gov, and on top of this we have the licenses added
(almost certainly without standing, but adding to the confusion) such
as the noncommercial sharealike license added by linkedct.org.

I think you are right that the publication of such resources for
consumption on the semantic web should be supported by the federal
institutions, and in limited cases there is some support. However, I
think it may be difficult if they can't negotiate the licenses so that
users can clearly use (republish, create derivative works, etc.) what
they publish in innovative (computational) ways, and so that they can
release more of the primary materials so that we can have
crowd-sourced improvements to the content of the published records.


On Sat, Feb 16, 2013 at 7:58 AM, Kerstin Forsberg
<kerstin.l.forsberg@gmail.com> wrote:
> Hi,
> a couple of tweets, blog post comments 1) and email exchanges during the
> week on moving ClinicalTrials.gov "from strings to things" made me think
> this could be a topic for discussion at the upcoming CSHALS. As I'll not be
> able to be there in person I'm using this email list to hear your thoughts.
> Background:
> We see many nice examples of curated/standardized feeds of CT.gov data, such
> as http://linkedct.org, http://www.patientslikeme.com/clinical_trials and
> http://www.clinicalcollections.org/trials/ etc. Most of them do a good job
> in turning “strings into things” and a few of them apply the Linked Data
> principles. However, I don’t think any of them use http-based URIs to
> identify things such as sponsor organization, clinical sites, clinical
> investigators, geography, disease, drug, and time.
> I argue that we as a community caring for clinical trials data should push
> back to FDA and NLM to get an official, standardized, linked data interface
> directly to the CT.gov at source. And yes, also for FDA and NLM to push back
> to pharma companies to provide standardized data about our trials with URIs
> to identify things instead of all these text strings. And also if pharma
> company websites such as http://www.gsk-clinicalstudyregister.com/ and
> http://www.astrazenecaclinicaltrials.com/ did the same.
> Given the current movement for clinical trial data transparency 2) I may
> think the timing is good. But, potentially challenging both for FDA, NLM and
> for the pharma companies. They (we) will all look for practical advice on
> what URIs to use for things such as drugs and organizations.
> Thoughts?
> Kerstin
> 1)
> http://blog.karmadata.com/2013/02/11/loading-clinical-trials-data-in-ten-minutes-flat/comment-page-1/#comment-20
> 2)
> http://www.placebocontrol.com/2013/02/our-new-glass-house-gsks-commitment-to.html
Received on Sunday, 17 February 2013 18:34:13 UTC
