
SDTM mapping Pt1 data - [COI] Clinical Observations Interoperability

From: Rachel Richesson <Rachel.Richesson@epi.usf.edu>
Date: Sat, 23 Feb 2008 22:13:36 -0500
Message-ID: <518E964FB13BF249AFFD4965D8895CC50FFD47@ex2.epi.usf.edu>
To: <public-hcls-coi@w3.org>
Cc: <public-semweb-lifesci@w3.org>
A first attempt at mapping the free-text narrative mock data into the SDTM model is posted at: http://esw.w3.org/topic/HCLS/ClinicalObservationsInteroperability?action=AttachFile&do=get&target=SDTM-example-Patient1.xls


A big thanks to Jennifer Fostel for doing the bulk of this work.


Caveat -- This CDISC (SDTM) model is for FDA reporting of data from a completed clinical trial - with planned visits, exposures, etc.  Therefore we are not using it as intended. I can't think of any logical reason that narrative data such as this would ever be put into SDTM, but I do hope that this exercise will give everyone a flavor of the model. I give a little background on the model below, and then make some suggestions for our next discussion.


Background and explanation of posting:

The SDTM standard specifies a series of data tables by domains (e.g., lab test, adverse events, physical exam, demographics). The specification includes a list of data fields and value sets (~"controlled vocabulary") for each domain. The controlled vocabulary (determined by CDISC teams) for each domain field is maintained by the NCI's EVS system. 


Controlled vocabulary can be downloaded from 

http://www.cancer.gov/cancertopics/terminologyresources/page6

for the following domains: ECG Test Results (EG), Concomitant Medications (CM), Exposure (EX), Substance Use (SU), and some lab tests (LB). Controlled vocabulary for the Adverse Events (AE), Laboratory Test Results (LB), Physical Examinations (PE), Subject Characteristics (SC) and Vital Signs (VS) domains is still under CDISC review and is available at: SDTM Package 2B & Labtest Package 2 <http://cdisc.org/downloads/SDTMPackage2B_LabtestPackage2.zip>


The attached spreadsheet pulls key concepts from the Patient 1 narrative, organizes them by domain, and maps them to data fields (in yellow). The controlled terminology for the data fields that I could verify against the CDISC vocabulary postings is in green. Since much of the terminology is not finalized yet, and because I got lazy, only a few terms are in green, but you can see that the format of each preferred term is an 8-character label. Collectively, the data would be in a table or spreadsheet, one record per observation. If we are going to use this data for the demo somehow, then I would recommend that I run this by some others who are more familiar with the SDTM, but I think that this certainly illustrates the spirit of the model.
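
To illustrate the "one record per observation" layout described above, here is a minimal Python sketch. It is not taken from the spreadsheet: the field names, the subject identifier and the sample values are all hypothetical, and the coded label is treated as at most 8 characters, which is an assumption about how the length limit is applied.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    """One row of a domain table: a single observation for one subject."""
    domain: str      # two-letter domain code, e.g. "VS" (Vital Signs)
    subject_id: str  # hypothetical subject identifier
    test_code: str   # short coded label, assumed to be at most 8 characters
    test_name: str   # human-readable name for the coded label
    result: str      # result as originally recorded
    unit: str        # unit of the result, empty string if none

    def __post_init__(self):
        # Reject coded labels that are empty or longer than 8 characters.
        if not 1 <= len(self.test_code) <= 8:
            raise ValueError(f"test code must be 1-8 characters: {self.test_code!r}")


def group_by_domain(observations):
    """Split a flat list of observations into one table (list of rows) per domain."""
    tables = defaultdict(list)
    for obs in observations:
        tables[obs.domain].append(obs)
    return dict(tables)


# Illustrative values only; they are not copied from the Patient 1 spreadsheet.
rows = [
    Observation("VS", "PT-001", "SYSBP", "Systolic Blood Pressure", "130", "mmHg"),
    Observation("VS", "PT-001", "DIABP", "Diastolic Blood Pressure", "85", "mmHg"),
    Observation("LB", "PT-001", "GLUC", "Glucose", "110", "mg/dL"),
]
tables = group_by_domain(rows)
```

Each domain ends up as its own list of rows, which mirrors the separate tabs or tables that a domain-by-domain layout implies.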


Discussion for COI group:

I hope that we can look at this to get an understanding of the model, and think of next logical steps. CDISC is a standards organization which has several models. While SDTM is the most developed, I am not sure it is useful in the scenario of our COI demonstration. There are other models, such as the Protocol Representation (PR) model [sponsored by HL7 RCRIM TC with representation from CDISC], that seem more appropriate to me. Although the PR model is not complete (last I checked, the Eligibility Criteria are all one free-text field), to me it seems the most logical place for Eligibility Criteria to be represented. Rather than try to push our data into an ill-suited current CDISC model, I would suggest we consider a more direct representation of Eligibility Criteria for now, and perhaps the results of our work could help define the eligibility portion of the PR model in the future.


There have been several pointers on our threads to good work on Eligibility Criteria and rule representation that should be reviewed. Additionally, both Vipul's requirements documents and my early work on the use case clearly show that Eligibility Criteria fall into several vocabulary domains (medications, diagnoses, procedures, etc.). Perhaps, to simplify this project for demonstration purposes, we could consider using these domains + vocabulary, and not a formal information model, for the Eligibility Criteria?
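
As a rough illustration of the "domains + vocabulary" idea, the following Python sketch represents each criterion as a (domain, code) pair flagged as inclusion or exclusion, and checks it against a patient's coded data. This is only one possible reading of the suggestion: the names `Criterion` and `is_eligible` are hypothetical, and the codes are placeholders rather than terms from any real vocabulary.

```python
from typing import Dict, List, NamedTuple, Set


class Criterion(NamedTuple):
    domain: str     # vocabulary domain, e.g. "diagnoses" or "medications"
    code: str       # placeholder code; a real vocabulary term would go here
    required: bool  # True = inclusion criterion, False = exclusion criterion


def is_eligible(criteria: List[Criterion], patient: Dict[str, Set[str]]) -> bool:
    """Return True when every inclusion code is present and no exclusion code is."""
    for criterion in criteria:
        present = criterion.code in patient.get(criterion.domain, set())
        if present != criterion.required:
            return False
    return True


# Hypothetical protocol: must have one diagnosis, must not be on one medication.
criteria = [
    Criterion("diagnoses", "DX-EXAMPLE-1", True),
    Criterion("medications", "RX-EXAMPLE-1", False),
]

# Hypothetical patient data, already coded and grouped by domain.
patient = {
    "diagnoses": {"DX-EXAMPLE-1"},
    "medications": {"RX-EXAMPLE-2"},
}

eligible = is_eligible(criteria, patient)
```

A flat structure like this cannot express numeric thresholds, time windows or nested logic, which is where a fuller information model or rule representation would be needed.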


I hear from the calls that there is a strong desire to map between CDISC and HL7 models. I admit that I do not have a complete understanding of our strategy for this demonstration, but I would like to see some discussion of the high-level steps that ensure that all of our efforts are directly related to the use case. As I understand it, ultimately there will be a matching of Eligibility Criteria from research protocols to clinical patient data. I am not sure that CDISC's SDTM model practically fits into this scenario. If it does, then it would seem that we would map the DCM elements directly to the SDTM, as I believe is specified in step 6.2 of the Project Plan at: http://esw.w3.org/topic/HCLS/ClinicalObservationsInteroperability/ProjectPlan.html


Also, Helen Chen's proposed ontology for Eligibility Criteria shows promise (rather than SDTM??) to simplify our task: http://esw.w3.org/topic/HCLS/ClinicalObservationsInteroperability?action=AttachFile&do=get&target=CTEligibility.owl


Certainly, there are others on this listserv more familiar with the problem space than I am, so I look forward to hearing other opinions. I also look forward to some good discussion on the next call, and would suggest that we all review the use case and project plan before then to help ground our discussion.




Rachel Richesson, PhD, MPH
Assistant Professor
USF College of Medicine, Department of Pediatrics
3650 Spectrum Blvd., Suite 100
Tampa, FL 33612
Office: (813) 396-9522
Fax: (813) 396-9601
Email: richesrl@epi.usf.edu
Received on Sunday, 24 February 2008 03:13:53 UTC
