RE: [TMO] patient record normalization

> -----Original Message-----
> From: Eric Prud'hommeaux [mailto:ericw3c@gmail.com] On Behalf Of Eric
> Prud'hommeaux
> Sent: Saturday, September 11, 2010 5:13 PM
> To: Michel_Dumontier
> Cc: Lee Feigenbaum; Chimezie Ogbuji; public-semweb-lifesci@w3.org
> Subject: RE: [TMO] patient record normalization
> 
> * Michel_Dumontier <Michel_Dumontier@carleton.ca> [2010-09-11 11:31-
> 0400]
> > Hi Lee!
> >
> > > -----Original Message-----
> > > From: Lee Feigenbaum [mailto:figtree@gmail.com] On Behalf Of Lee
> > > Feigenbaum
> > > Sent: Saturday, September 11, 2010 2:54 AM
> > > To: Michel_Dumontier
> > > Cc: Eric Prud'hommeaux; Chimezie Ogbuji; public-semweb-
> lifesci@w3.org
> > > Subject: Re: [TMO] patient record normalization
> > >
> > > On 9/11/2010 2:04 AM, Michel_Dumontier wrote:
> > > >>> It's not a restriction on the predicates - it's a restriction
> on
> > > >> instances of a certain class - like that of blood pressure
> > > >> measurements. Checking consistency would tell you whether your
> data
> > > >> conforms to the specification described by the ontology
> document.
> > > >>
> > > >> Right, but tells whom, and when? including :measuredInUnits
> > > advertises
> > > >> a flexibility which you do not intend to honor.
> > > >
> > > > The predicate would only advertise that the domain would be a
> > > quantity and the range a unit.
> > >
> > > Speaking as someone just browsing this discussion (so take my
> comments
> > > for what they're worth, which isn't much), I'd tend to agree with
> Eric
> > > here. If I (as a human) saw this in an ontology, I'd expect that I
> can
> > > freely mix and match units in my data and that any software
> processing
> > > the data will cope with it or raise a reasonable error.
> >
> > You are most certainly able to use a variety of units - but if an
> ontology specifies the unit, and a dataset imports the ontology, then a
> valid dataset would conform to this specification.
> 
> As several have pointed out, that "dataset imports the ontology"
> implies an OWL reasoner. This imposition on authoring or consumption
> is an expensive presumption and likely to lead to non-conformant data,
> which will impair our ability to query in the bazaar.

Non-conforming data will always occur. The point is that if you want to make your data compatible with the data model, you have to follow the spec. This holds regardless of our specific discussion here about specialized versus generic predicates.

 
> > > >> If I dereference
> > > >> :systolicMPa, I learn that the units are exactly MPa. If I
> > > dereference
> > > >> muo:numericalValue and muo:measuredUnits, I learn that I can use
> any
> > > >> units (misleading).
> > > >
> > > > It isn't misleading, it's exactly as advertised.
> > >
> > > Would you expect my above assumption to be accurate? It sounded
> from
> > > some other messages in the thread that there's a thought that even
> with
> > > the "generic" approach that systems would in general handle data in
> > > homogeneous units?
> >
> > The requirement here is that any and all units can be specified using
> the relation, but an ontology can restrict the number, *kinds* of
> units, or specific units applicable.
> 
> > > >> If I wade through the OWL for TMO, I learn that
> > > >> there's a restriction for say:
> > > >>
> > > >>    Class: tmo:BloodSystolicPressureReading EquivalentTo:
> > > >>          (:value exactly 1)
> > > >>           and (:measuredInUnits exactly u:mmHg)
> > > >>
> > > > 		and (:measuredInUnits only u:mmHg)
> > > >
> > > >> which, if I think hard, tells me that I must normalize my data,
> but
> > > >> this is pretty far from follow-your-nose semantics.
> > > >
> > > > There's no thinking required - the semantics are clearly spelled
> out
> > > in the axioms. Instances of this class refer to mmHg as the unit.
> Any
> > > instance that refers to a different unit is not a member of this
> class.
> > >
> > > There's no thinking required if you have an OWL reasoner as an
> integral
> > > part of your tool chain.
> >
> > I think, given that the TMO *is* an OWL2 ontology, that use of the
> toolchain *is* a requirement.
> 
> I don't see any benefit to imposing that requirement on the use of
> what we'd like to be an adopted ontology. We can describe it in OWL,
> but to require OWL to use it will alienate most of the world.

Well, this is kind of like saying: I'm going to make an XML schema, but you can put whatever you want in the XML without validating it. I'm having a hard time believing that this is your position.

 
> > > Otherwise, there is thinking required. And
> > > even
> > > if you have an OWL reasoner in your tool chain, you'd probably have
> to
> > > be doing something clever with integrity constraints a la Clark &
> > > Parsia
> > > to catch errors this way, rather than just to end up asserting
> bogus
> > > data.
> >
> > No, I don't believe that is the case.
> >
> > m.
> 
> Regardless, you'd have to have it and you'd have to be motivated to use
> it.

Integrity constraints? Or the tool chain? 

 
> > > Again, apologies if my comments are off-base as I'm mainly just
> passing
> > > through here!
> > >
> > > Lee
> > >
> > > >> I think I have described why authoring is less fault-prone if
> the
> > > >> normalized data in TMO uses precise predicates. Do you have
> other
> > > use
> > > >> cases which override that one?
> 
> Let's keep the concrete propositions around so we can test these
> theses:
> 
> single-unit predicate:
> :X trans:bloodPressure
>   [ trans:systolicMPa 120 ;
>     trans:diastolicMPa 80 ] .
> 
> generic-unit predicate:
> :X trans:bloodPressure
>   [ trans:systolic [ muo:measuredIn trans1:MPa ; muo:numericalValue 120
> ] ;
>     trans:diastolic [ muo:measuredIn trans1:MPa ; muo:numericalValue 80
> ] ] .

or

:x :has-attribute
  [ a :systolic-blood-pressure;  :has-value 120; :has-unit unit:mPa ] ,
  [ a :diastolic-blood-pressure; :has-value 80; :has-unit unit:mPa ] .

So we have three generic predicates (:has-attribute, :has-value, :has-unit), and as a general design pattern all we do is specify the kind of measurement value, of which there are thousands. Each of those types can be further described in terms of the qualities or dispositions they measure, the material parts they enumerate, or whatever.

In contrast, the specialized-predicate approach means that every value in a test panel would require a predicate between the individual and the test value, and then a predicate for each of the components of a test value.
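
A minimal sketch of how the two patterns scale, in plain Python. The measurement kinds, units, and predicate naming below are hypothetical and are not taken from TMO; the only point is that the specialized pattern needs one predicate per (measurement, unit) pair, while the generic pattern keeps three predicates and adds one class per kind of measurement.

```python
# Hypothetical measurement kinds and the units each might be reported in.
# These names are illustrative only; they are not taken from TMO.
UNITS_BY_MEASUREMENT = {
    "systolic-blood-pressure": ["mmHg", "kPa"],
    "diastolic-blood-pressure": ["mmHg", "kPa"],
    "body-temperature": ["degC", "degF"],
}


def specialized_predicates(units_by_measurement):
    """One predicate per (measurement, unit) pair, e.g. 'body-temperature_degC'."""
    return [
        f"{measurement}_{unit}"
        for measurement, units in units_by_measurement.items()
        for unit in units
    ]


def generic_predicates():
    """The generic pattern always uses the same three predicates."""
    return ["has-attribute", "has-value", "has-unit"]


def generic_classes(units_by_measurement):
    """The generic pattern adds one class per kind of measurement instead."""
    return list(units_by_measurement)


print(len(specialized_predicates(UNITS_BY_MEASUREMENT)))  # 6
print(len(generic_predicates()))                          # 3
print(len(generic_classes(UNITS_BY_MEASUREMENT)))         # 3
```

Adding a unit to any measurement adds a predicate in the specialized pattern but only a unit individual in the generic one.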


An ontology that means to specify which unit to use with a given measurement value can do so by adding axioms such as

rdfs:subClassOf :has-unit only unit:mPa;
rdfs:subClassOf :has-value only xsd:int; (or whatever)
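
What conformance to such a unit restriction might look like outside a reasoner can be sketched in plain Python. This is a closed-world check over already-parsed records, not OWL inference; the record layout, the `ALLOWED_UNIT` table, and the `nonconforming` helper are assumptions made for illustration, and only the unit and class names echo the examples in this message.

```python
# Hypothetical unit restrictions, one per measurement class, mirroring an
# axiom of the form ":has-unit only <unit>". The table is an assumption.
ALLOWED_UNIT = {
    "systolic-blood-pressure": "unit:mPa",
    "diastolic-blood-pressure": "unit:mPa",
}


def nonconforming(measurements, allowed_unit=ALLOWED_UNIT):
    """Return the measurements whose unit differs from the one required for their class.

    Measurements whose class has no restriction are accepted with any unit.
    """
    bad = []
    for m in measurements:
        required = allowed_unit.get(m["type"])
        if required is not None and m["unit"] != required:
            bad.append(m)
    return bad


records = [
    {"type": "systolic-blood-pressure", "value": 120, "unit": "unit:mPa"},
    {"type": "diastolic-blood-pressure", "value": 80, "unit": "u:mmHg"},
]

# Only the second record violates its class's unit restriction.
print(nonconforming(records))
```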


> > > > The counter argument to using a specialized predicate is that
> > > > 1) we cannot describe a unit
> 
> I'm not sure what the use case is, but we can say that the set of
> things with a trans:systolicMPa->X is equivalent to the set of things
> with muo:measuredIn->trans1:MPa, muo:numericalValue X . I don't think
> generic-unit predicates buy us any more than that.

Rather, I mean to develop an ontology of units: one that states that a unit is a unit for a certain kind of quality, and how the units are related to one another.

> > > > 2) there is a proliferation of relations as there are countless
> > > quantities multiplied by each of their respective units.
> 
> I see 3 in either case.
> 
> > > > 3) relations can only be weakly described (they do not have the
> class
> > > constructors available to describe them)
> 
> Sorry, I don't follow this one. Can you describe in terms of the
> proposed vocabulary?

In OWL2, object properties can be said to be functional, inverse functional, transitive, symmetric, asymmetric, reflexive, irreflexive, disjoint, inverse, equivalent to another relation or composed (role chain). What we can't say is that a relation is equivalent to the composition of a relation and a type. That said, what you write 

> things with a trans:systolicMPa->X 
> is equivalent to 
>   things with muo:measuredIn->trans1:MPa, muo:numericalValue X . 

would, using OWL, only get you to the class equivalence, but not the transfer of the type and the "X" value. We would need at least 

 things with a trans:systolicMPa some xsd:int
  is equivalent to
      systolic-blood-pressure 
         and muo:measuredIn some trans1:MPa 
         and muo:numericalValue some xsd:int

and then to transfer the value
  trans:systolicMPa rdfs:subPropertyOf muo:numericalValue 
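
Outside OWL, the same transfer of type, unit, and value can be sketched as an explicit rewrite in plain Python. The `SPECIALIZED` table and the `to_generic` helper are hypothetical; the predicate and unit names follow the examples in this thread, and the dictionary stands in for the blank node of the generic form.

```python
# Hypothetical mapping from each single-unit predicate to the measurement
# class and unit it implies. The table itself is an assumption.
SPECIALIZED = {
    "trans:systolicMPa": ("systolic-blood-pressure", "trans1:MPa"),
    "trans:diastolicMPa": ("diastolic-blood-pressure", "trans1:MPa"),
}


def to_generic(predicate, value):
    """Expand a single-unit predicate and its value into the generic form.

    Raises KeyError for a predicate that is not in the mapping table.
    """
    measurement_class, unit = SPECIALIZED[predicate]
    return {
        "rdf:type": measurement_class,
        "muo:measuredIn": unit,
        "muo:numericalValue": value,
    }


print(to_generic("trans:systolicMPa", 120))
```

The mapping has to be stated once per specialized predicate, which is the information the class equivalence and the subproperty axiom carry between them.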



> > > > 4) requires one to query the labels instead of the semantics to
> find
> > > the appropriate relation.
> 
> Can you give an example here as well?

- e.g. how do I report systolic blood pressure?


> > > > 5) requires one to parse the label for the intended unit.
> 
> I'm not sure the practicality of querying for everything in the
> database which is in MPa, but if you're motivated to do inference,
> it's in the OWL.

- e.g. in what unit should I report blood pressure?
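
A rough sketch of the difference between the two answers, in plain Python. With an explicit description the unit is a lookup; with a specialized predicate it has to be guessed from the name. The `UNIT_OF` table and the naming convention assumed by `unit_from_label` (a lower-case quantity name followed by a unit that starts with an upper-case letter) are hypothetical.

```python
import re

# Hypothetical generic description: each measurement class names its unit.
UNIT_OF = {"systolic-blood-pressure": "unit:mPa"}


def unit_from_semantics(measurement_class):
    """Look the unit up from the explicit description."""
    return UNIT_OF[measurement_class]


def unit_from_label(predicate):
    """Guess the unit from a predicate's local name, e.g. 'trans:systolicMPa'.

    Returns None when the name does not fit the assumed convention.
    """
    local_name = predicate.split(":", 1)[-1]
    match = re.fullmatch(r"([a-z]+)([A-Z]\w*)", local_name)
    return match.group(2) if match else None


print(unit_from_semantics("systolic-blood-pressure"))  # unit:mPa
print(unit_from_label("trans:systolicMPa"))            # MPa
print(unit_from_label("trans:systolicmmHg"))           # Hg (a wrong guess)
```

The last call shows how fragile label parsing is: a unit that begins with lower-case letters is silently truncated.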

> 
> > > > It's a shortcut that makes linked data prettier, but weakens
> formal
> > > knowledge representation.
> > >
> > >
> > >  > m.
> > > >
> > > >
> > > >
> 
> --
> -ericP

Received on Sunday, 12 September 2010 16:53:49 UTC