RE: [TMO] patient record normalization

* Michel_Dumontier <Michel_Dumontier@carleton.ca> [2010-09-11 11:31-0400]
> Hi Lee!
> 
> > -----Original Message-----
> > From: Lee Feigenbaum [mailto:figtree@gmail.com] On Behalf Of Lee
> > Feigenbaum
> > Sent: Saturday, September 11, 2010 2:54 AM
> > To: Michel_Dumontier
> > Cc: Eric Prud'hommeaux; Chimezie Ogbuji; public-semweb-lifesci@w3.org
> > Subject: Re: [TMO] patient record normalization
> > 
> > On 9/11/2010 2:04 AM, Michel_Dumontier wrote:
> > >>> It's not a restriction on the predicates - it's a restriction on
> > >> instances of a certain class - like that of blood pressure
> > >> measurements. Checking consistency would tell you whether your data
> > >> conforms to the specification described by the ontology document.
> > >>
> > >> Right, but tells whom, and when? including :measuredInUnits
> > advertises
> > >> a flexibility which you do not intend to honor.
> > >
> > > The predicate would only advertise that the domain would be a
> > quantity and the range a unit.
> > 
> > Speaking as someone just browsing this discussion (so take my comments
> > for what they're worth, which isn't much), I'd tend to agree with Eric
> > here. If I (as a human) saw this in an ontology, I'd expect that I can
> > freely mix and match units in my data and that any software processing
> > the data will cope with it or raise a reasonable error.
> 
> You are most certainly able to use a variety of units - but if an ontology specifies the unit, and a dataset imports the ontology, then a valid dataset would conform to this specification.

As several have pointed out, that "dataset imports the ontology"
implies an OWL reasoner. This imposition on authoring or consumption
is an expensive presumption and likely to lead to non-conformant data,
which will impair our ability to query in the bazaar.

> > >> If I dereference
> > >> :systolicMPa, I learn that the units are exactly MPa. If I
> > dereference
> > >> muo:numericalValue and muo:measuredUnits, I learn that I can use any
> > >> units (misleading).
> > >
> > > It isn't misleading, it's exactly as advertised.
> > 
> > Would you expect my above assumption to be accurate? It sounded from
> > some other messages in the thread that there's a thought that even with
> > the "generic" approach that systems would in general handle data in
> > homogeneous units?
> 
> The requirement here is that any and all units can be specified using the relation, but an ontology can restrict the number, *kinds* of units, or specific units applicable.

> > >> If I wade through the OWL for TMO, I learn that
> > >> there's a restriction for say:
> > >>
> > >>    Class: tmo:BloodSystolicPressureReading EquivalentTo:
> > >>          (:value exactly 1)
> > >>           and (:measuredInUnits exactly u:mmHg)
> > >>
> > > 		and (:measuredInUnits only u:mmHg)
> > >
> > >> which, if I think hard, tells me that I must normalize my data, but
> > >> this is pretty far from follow-your-nose semantics.
> > >
> > > There's no thinking required - the semantics are clearly spelled out
> > in the axioms. Instances of this class refer to mmHg as the unit.  Any
> > instance that refers to a different unit is not a member of this class.
> > 
> > There's no thinking required if you have an OWL reasoner as an integral
> > part of your tool chain. 
> 
> I think, given that the TMO *is* an OWL2 ontology, that use of the toolchain *is* a requirement.

I don't see any benefit to imposing that requirement on the use of
what we'd like to be an adopted ontology. We can describe it in OWL,
but to require OWL to use it will alienate most of the world.

> > Otherwise, there is thinking required. And
> > even
> > if you have an OWL reasoner in your tool chain, you'd probably have to
> > be doing something clever with integrity constraints a la Clark &
> > Parsia
> > to catch errors this way, rather than just to end up asserting bogus
> > data.
> 
> No, I don't believe that is the case.
> 
> m.

Regardless, you'd have to have it and you'd have to be motivated to use it.

> > Again, apologies if my comments are off-base as I'm mainly just passing
> > through here!
> > 
> > Lee
> > 
> > >> I think I have described why authoring is less fault-prone if the
> > >> normalized data in TMO uses precise predicates. Do you have other
> > use
> > >> cases which override that one?

Let's keep the concrete propositions around so we can test these
theses:

single-unit predicate:
:X trans:bloodPressure
  [ trans:systolicMPa 120 ;
    trans:diastolicMPa 80 ] .

generic-unit predicate:
:X trans:bloodPressure
  [ trans:systolic [ muo:measuredIn trans1:MPa ; muo:numericalValue 120 ] ;
    trans:diastolic [ muo:measuredIn trans1:MPa ; muo:numericalValue 80 ] ] .


> > > The counter argument to using a specialized predicate is that
> > > 1) we cannot describe a unit

I'm not sure what the use case is, but we can say that the set of
things with a trans:systolicMPa->X is equivalent to the set of things
with muo:measuredIn->trans1:MPa, muo:numericalValue X . I don't think
generic-unit predicates buy us any more than that.
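
To make that equivalence concrete, here is a minimal sketch in Python,
with plain tuples standing in for triples. The mapping table and the
function names expand/collapse are hypothetical, invented for this
illustration; they are not part of TMO or MUO. It assumes blank-node
labels of the form _:qN are not already in use in the input.

```python
from itertools import count

# Hypothetical mapping: single-unit predicate -> (generic predicate, unit).
# The predicate names come from the examples in this thread; the table
# itself is an assumption made for illustration.
SINGLE_TO_GENERIC = {
    "trans:systolicMPa": ("trans:systolic", "trans1:MPa"),
    "trans:diastolicMPa": ("trans:diastolic", "trans1:MPa"),
}
GENERIC_TO_SINGLE = {v: k for k, v in SINGLE_TO_GENERIC.items()}


def expand(triples):
    """Rewrite single-unit triples into the generic-unit form."""
    fresh = count()
    out = []
    for s, p, o in triples:
        if p in SINGLE_TO_GENERIC:
            generic, unit = SINGLE_TO_GENERIC[p]
            node = f"_:q{next(fresh)}"  # new blank node for the quantity
            out.append((s, generic, node))
            out.append((node, "muo:measuredIn", unit))
            out.append((node, "muo:numericalValue", o))
        else:
            out.append((s, p, o))
    return out


def collapse(triples):
    """Rewrite generic-unit triples back into the single-unit form."""
    unit = {s: o for s, p, o in triples if p == "muo:measuredIn"}
    value = {s: o for s, p, o in triples if p == "muo:numericalValue"}
    # Quantity nodes that have a known (predicate, unit) pair and a value.
    absorbed = {
        o for s, p, o in triples
        if (p, unit.get(o)) in GENERIC_TO_SINGLE and o in value
    }
    out = []
    for s, p, o in triples:
        if o in absorbed and (p, unit.get(o)) in GENERIC_TO_SINGLE:
            out.append((s, GENERIC_TO_SINGLE[(p, unit[o])], value[o]))
        elif s in absorbed and p in ("muo:measuredIn", "muo:numericalValue"):
            continue  # folded into the single-unit triple above
        else:
            out.append((s, p, o))
    return out
```

Collapsing only works for (predicate, unit) pairs that appear in the
table; a reading in any other unit is left in the generic form rather
than silently converted.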

> > > 2) there is a proliferation of relations as there are countless
> > quantities multiplied by each of their respective units.

I see 3 in either case.

> > > 3) relations can only be weakly described (they do not have the class
> > constructors available to describe them)

Sorry, I don't follow this one. Can you describe in terms of the
proposed vocabulary?

> > > 4) requires one to query the labels instead of the semantics to find
> > the appropriate relation.

Can you give an example here as well?

> > > 5) requires one to parse the label for the intended unit.

I'm not sure of the practicality of querying for everything in the
database which is in MPa, but if you're motivated to do inference,
it's in the OWL.
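
For comparison, a rough sketch in Python of what "everything in MPa"
looks like against each form, again with tuples standing in for
triples. PREDICATE_UNIT is a hypothetical lookup table of the kind one
might derive from the OWL description; it and both function names are
assumptions for illustration only.

```python
# Hypothetical lookup: the unit implied by each single-unit predicate.
PREDICATE_UNIT = {
    "trans:systolicMPa": "trans1:MPa",
    "trans:diastolicMPa": "trans1:MPa",
}


def in_unit_single(triples, unit):
    """Single-unit form: select triples whose predicate implies `unit`."""
    return [(s, p, o) for s, p, o in triples if PREDICATE_UNIT.get(p) == unit]


def in_unit_generic(triples, unit):
    """Generic-unit form: select values of nodes measured in `unit`."""
    nodes = {s for s, p, o in triples if p == "muo:measuredIn" and o == unit}
    return [
        (s, o) for s, p, o in triples
        if p == "muo:numericalValue" and s in nodes
    ]


single = [
    ("_:bp", "trans:systolicMPa", 120),
    ("_:bp", "trans:diastolicMPa", 80),
]
generic = [
    ("_:bp", "trans:systolic", "_:s"),
    ("_:s", "muo:measuredIn", "trans1:MPa"),
    ("_:s", "muo:numericalValue", 120),
    ("_:bp", "trans:diastolic", "_:d"),
    ("_:d", "muo:measuredIn", "trans1:MPa"),
    ("_:d", "muo:numericalValue", 80),
]
```

The single-unit version needs the table (the information is in the
ontology, not in the data), while the generic version reads the unit
straight from the data.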

> > > It's a shortcut that makes linked data prettier, but weakens formal
> > knowledge representation.
> > 
> > 
> > > m.
> > >
> > >
> > >

-- 
-ericP

Received on Saturday, 11 September 2010 21:13:18 UTC