- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Fri, 10 Sep 2010 12:53:22 -0400
- To: Michel_Dumontier <Michel_Dumontier@carleton.ca>
- Cc: "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
* Michel_Dumontier <Michel_Dumontier@carleton.ca> [2010-09-09 14:52-0400]
> Hi,
> I think the model where we separate the unit from the value is
> preferred (as per CPR) because it is highly flexible. You are right,
> however, that people could then refer to any unit, although an ontology
> could specify the unit involved (universal restriction). The
> alternative, which Bijan worked on, was a stylesheet to convert values
> in different units to a common unit.

At W3, standardization includes detecting and eliminating redundant
flexibility. If someone says "<img src='X'/> == <img href='X'/>", we say
"pick exactly one or there will be bugs and inefficiency". To that end,
I'd like the TMO task force to have exactly one format for the tests
worth standardizing, e.g. blood pressure. Further, I'd like users of the
TMO to benefit from this stake in the ground; specifically, I don't want
them to query data that's half in MPa and half in mmHg. Voilà my desire
for one inflexible representation.

This draconian measure could be enforced by OWL restrictions, if folks
chose to run consistency checkers on their data, but some folks won't,
and some folks won't in time, and some folks will resent us for imposing
on their pipeline, and some folks won't even descry this imposition.

Normalization can also be enforced in the choice of predicate; we can
say that the object of cpr:systolicBpMPa¹ is in MPa. We can write this
down in the schema, and also as an OWL restriction. This moves the
burden of inference from users of the standard to those who are mixing
with data which has other units (a shrinking group when standardization
is successful).

I believe the principal counterargument to normalization is that this
would be an obstacle to adoption; that e.g. clinics or pharmas who would
otherwise be tempted to express their clinical data in CPR would be
discouraged by the requirement of input normalization.
I think that group is vanishingly small, especially if they face
heterogeneous data and have to normalize anyway. It's possible that the
arguments for homogeneous data (no query/inference-time normalization,
trivial federation, etc.) are too subtle to persuade the above group,
but I think the clinical web will be much better off if we can eliminate
redundant flexibility.

¹ Chimezie, what do you think of this imposition on CPR?

> m.
>
> > -----Original Message-----
> > From: public-semweb-lifesci-request@w3.org
> > [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Eric Prud'hommeaux
> > Sent: Thursday, September 09, 2010 1:06 PM
> > To: public-semweb-lifesci@w3.org
> > Subject: [TMO] patient record normalization
> >
> > We have choices about how to model units. Per the first TMO RDF
> > patient data, we can keep the units as datatypes:
> >
> > :X trans:bloodPressure
> >     [ trans:systolic "120"^^u:mmHg ;
> >       trans:diastolic "80"^^u:mmHg ] .
> >
> > Per CPR, as a pair of value and datatype:
> >
> > … [ trans:systolic [ muo:measuredIn trans1:mmHg ; muo:numericalValue "120" ] ;
> >     trans:diastolic [ muo:measuredIn trans1:mmHg ; muo:numericalValue "80" ] ] .
> >
> > Another, potentially more attractive, option is to model units in the
> > predicate:
> >
> > :X trans:bloodPressure
> >     [ trans:systolicMmHg "120" ;
> >       trans:diastolicMmHg "80" ] .
> >
> > This greatly simplifies our life as we are otherwise likely to have a
> > variety of e.g. BP data in the database: 120/80 mmHg, 12/8 cmHg,
> > 16000/10667 Pa, 16/11 kPa, 13 (PAM)
> >
> > which would lead to ridiculous queries when we want to use the data:
> >
> > SELECT ?sysM ?diaM {
> >   ?x trans:bloodPressure [ trans:systolic ?sys ;
> >                            trans:diastolic ?dia ]
> >   FILTER (datatype(?sys) = u:mmHg && datatype(?dia) = u:mmHg)
> > }
> > UNION SELECT (?sys*10 as ?sysM) (?dia*10 as ?diaM) {
> >   ?x trans:bloodPressure [ trans:systolic ?sys ;
> >                            trans:diastolic ?dia ]
> >   FILTER (datatype(?sys) = u:cmHg && datatype(?dia) = u:cmHg)
> > }
> > UNION SELECT (?sys*7.5 as ?sysM) (?dia*7.5 as ?diaM) {
> >   ?x trans:bloodPressure [ trans:systolic ?sys ;
> >                            trans:diastolic ?dia ]
> >   FILTER (datatype(?sys) = u:kPa && datatype(?dia) = u:kPa)
> > }
> > …
> > }
> >
> > --
> > -ericP
>
--
-ericP
Received on Friday, 10 September 2010 16:54:05 UTC
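
The input normalization argued for in this message can be illustrated with a short sketch. This is a minimal Python example, not part of the TMO or CPR work: the names TO_MMHG, to_mmhg and bp_turtle are hypothetical, as are the unit labels used as dictionary keys. It assumes mmHg as the single standard unit and reuses the trans:systolicMmHg / trans:diastolicMmHg predicates from the quoted message. The conversion factors follow from 1 mmHg ≈ 133.322 Pa.

```python
# Hypothetical ingest-time normalizer: every reading is converted to mmHg
# once, on input, so that queries never have to branch on units.

PA_PER_MMHG = 133.322  # 1 mmHg is approximately 133.322 Pa

# Multiplicative factors from each accepted input unit to mmHg.
# The unit labels are assumptions made for this sketch.
TO_MMHG = {
    "mmHg": 1.0,
    "cmHg": 10.0,
    "Pa": 1.0 / PA_PER_MMHG,
    "kPa": 1000.0 / PA_PER_MMHG,
}


def to_mmhg(value: float, unit: str) -> float:
    """Convert a pressure reading in a known unit to mmHg."""
    try:
        return value * TO_MMHG[unit]
    except KeyError:
        raise ValueError(f"unsupported pressure unit: {unit!r}") from None


def bp_turtle(subject: str, systolic: float, diastolic: float, unit: str) -> str:
    """Render one blood-pressure reading as Turtle with the unit in the predicate."""
    sys_mmhg = round(to_mmhg(systolic, unit))
    dia_mmhg = round(to_mmhg(diastolic, unit))
    return (
        f"{subject} trans:bloodPressure\n"
        f'    [ trans:systolicMmHg "{sys_mmhg}" ;\n'
        f'      trans:diastolicMmHg "{dia_mmhg}" ] .'
    )


if __name__ == "__main__":
    # A reading recorded in pascals is stored as a 120/80 mmHg reading.
    print(bp_turtle(":X", 16000, 10667, "Pa"))
```

With data stored this way, the consuming query reduces to a single graph pattern over trans:systolicMmHg and trans:diastolicMmHg, with no per-unit UNION branches. Rejecting unknown units at ingest, rather than passing them through, is a design choice of this sketch.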