RE: [TMO] patient record normalization from Michel_Dumontier on 2010-09-11 (public-semweb-lifesci@w3.org from September 2010)

From: Michel_Dumontier <Michel_Dumontier@carleton.ca>
Date: Sat, 11 Sep 2010 11:31:55 -0400
To: Lee Feigenbaum <lee@thefigtrees.net>, Michel_Dumontier <Michel_Dumontier@carleton.ca>
CC: "Eric Prud'hommeaux" <eric@w3.org>, Chimezie Ogbuji <ogbujic@ccf.org>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
Message-ID: <E1784B0107E5634C8997868083EDE7805EB8115A1B@CCSMBX10.CUNET.CARLETON.CA>

Hi Lee!

> -----Original Message-----
> From: Lee Feigenbaum [mailto:figtree@gmail.com] On Behalf Of Lee
> Feigenbaum
> Sent: Saturday, September 11, 2010 2:54 AM
> To: Michel_Dumontier
> Cc: Eric Prud'hommeaux; Chimezie Ogbuji; public-semweb-lifesci@w3.org
> Subject: Re: [TMO] patient record normalization
> 
> On 9/11/2010 2:04 AM, Michel_Dumontier wrote:
> >>> It's not a restriction on the predicates - it's a restriction on
> >>> instances of a certain class - like that of blood pressure
> >>> measurements. Checking consistency would tell you whether your data
> >>> conforms to the specification described by the ontology document.
> >>
> >> Right, but tells whom, and when? including :measuredInUnits advertises
> >> a flexibility which you do not intend to honor.
> >
> > The predicate would only advertise that the domain would be a
> > quantity and the range a unit.
> 
> Speaking as someone just browsing this discussion (so take my comments
> for what they're worth, which isn't much), I'd tend to agree with Eric
> here. If I (as a human) saw this in an ontology, I'd expect that I can
> freely mix and match units in my data and that any software processing
> the data will cope with it or raise a reasonable error.

You are most certainly able to use a variety of units - but if an ontology specifies the unit, and a dataset imports the ontology, then a valid dataset would conform to this specification.

 
> >> If I dereference
> >> :systolicMPa, I learn that the units are exactly MPa. If I dereference
> >> muo:numericalValue and muo:measuredUnits, I learn that I can use any
> >> units (misleading).
> >
> > It isn't misleading, it's exactly as advertised.
> 
> Would you expect my above assumption to be accurate? It sounded from
> some other messages in the thread that there's a thought that even with
> the "generic" approach that systems would in general handle data in
> homogeneous units?

The requirement here is that any and all units can be specified using the relation, but an ontology can restrict the number of units, the *kinds* of units, or the specific units that are applicable.
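
As a rough illustration of that split between a generic relation and a class-level restriction, here is a minimal Python sketch. It is a plain closed-world check over explicit data, not OWL reasoning, and nothing in it comes from TMO or from any OWL library: the unit catalogue, the UnitRestriction class and the two example restrictions are all hypothetical names invented for this example.

```python
from dataclasses import dataclass

# Hypothetical unit catalogue: unit name -> kind of quantity it measures.
UNIT_KINDS = {"mmHg": "pressure", "MPa": "pressure", "kg": "mass"}


@dataclass(frozen=True)
class UnitRestriction:
    """Hypothetical stand-in for a class-level restriction on units."""
    max_units: int = 1                      # restrict the number of units
    allowed_kinds: frozenset = frozenset()  # restrict the kinds of units (empty = any)
    allowed_units: frozenset = frozenset()  # restrict to specific units (empty = any)

    def violations(self, units):
        """Return a list of human-readable problems; empty means conforming."""
        problems = []
        distinct = sorted(set(units))
        if len(distinct) > self.max_units:
            problems.append(
                f"expected at most {self.max_units} unit(s), got {distinct}"
            )
        for unit in distinct:
            kind = UNIT_KINDS.get(unit)
            if self.allowed_kinds and kind not in self.allowed_kinds:
                problems.append(f"{unit}: kind {kind!r} is not allowed")
            if self.allowed_units and unit not in self.allowed_units:
                problems.append(f"{unit}: not one of {sorted(self.allowed_units)}")
        return problems


# The generic relation accepts any unit; each class narrows it differently.
SYSTOLIC = UnitRestriction(allowed_units=frozenset({"mmHg"}))          # one specific unit
ANY_PRESSURE = UnitRestriction(allowed_kinds=frozenset({"pressure"}))  # one kind of unit
```

Here SYSTOLIC.violations(["MPa"]) reports a problem while ANY_PRESSURE.violations(["MPa"]) does not, even though both readings use the same generic unit relation.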

> >> If I wade through the OWL for TMO, I learn that
> >> there's a restriction for say:
> >>
> >>    Class: tmo:BloodSystolicPressureReading EquivalentTo:
> >>          (:value exactly 1)
> >>           and (:measuredInUnits exactly u:mmHg)
> >>
> > 		and (:measuredInUnits only u:mmHg)
> >
> >> which, if I think hard, tells me that I must normalize my data, but
> >> this is pretty far from follow-your-nose semantics.
> >
> > There's no thinking required - the semantics are clearly spelled out
> > in the axioms. Instances of this class refer to mmHg as the unit.  Any
> > instance that refers to a different unit is not a member of this class.
> 
> There's no thinking required if you have an OWL reasoner as an integral
> part of your tool chain. 

I think, given that the TMO *is* an OWL2 ontology, that using an OWL reasoner in the toolchain *is* a requirement.

> Otherwise, there is thinking required. And even
> if you have an OWL reasoner in your tool chain, you'd probably have to
> be doing something clever with integrity constraints a la Clark & Parsia
> to catch errors this way, rather than just to end up asserting bogus
> data.

No, I don't believe that is the case.

m.

> Again, apologies if my comments are off-base as I'm mainly just passing
> through here!
> 
> Lee
> 
> >> I think I have described why authoring is less fault-prone if the
> >> normalized data in TMO uses precise predicates. Do you have other use
> >> cases which override that one?
> >
> > The counter argument to using a specialized predicate is that
> > 1) we cannot describe a unit
> > 2) there is a proliferation of relations as there are countless
> >    quantities multiplied by each of their respective units.
> > 3) relations can only be weakly described (they do not have the class
> >    constructors available to describe them)
> > 4) requires one to query the labels instead of the semantics to find
> >    the appropriate relation.
> > 5) requires one to parse the label for the intended unit.
> >
> > It's a shortcut that makes linked data prettier, but weakens formal
> > knowledge representation.
> 
> 
> > m.
> >
> >
> >

Received on Saturday, 11 September 2010 15:32:27 UTC