- From: Eric Prud'hommeaux <eric@w3.org>
- Date: Sun, 12 Sep 2010 14:55:20 -0400
- To: Chimezie Ogbuji <ogbujic@ccf.org>
- Cc: Michel_Dumontier <Michel_Dumontier@carleton.ca>, "public-semweb-lifesci@w3.org" <public-semweb-lifesci@w3.org>
* Chimezie Ogbuji <ogbujic@ccf.org> [2010-09-11 13:33-0400]
> Just wanted to pick up on the follow-your-nose argument.

You can pick your friends, and you can ...

I should say that I'm not a devout follow-your-nosist, but I do see it
as the default behavior on the SemWeb (until we get some nifty
ontology repos which are easier to use than conventional dereference).

> On 9/10/10 10:06 PM, "Eric Prud'hommeaux" <eric@w3.org> wrote:
> > Right, but tells whom, and when? including muo:measuredIn advertises
> > a flexibility which you do not intend to honor. If I dereference
> > :systolicMPa, I learn that the units are exactly MPa.
>
> How would you learn this? It seems like you are suggesting a person would do
> the dereference and *see* a web page stating in a form that is not
> machine-readable that the units are Mpa. Otherwise, if the dereference is
> meant to inform a machine, then wouldn't some machine readable format be
> needed for this (i.e., an ontology or set of rules)?

Enumerating some ways someone can glean the semantics of something:
  1. read the predicate name in English.
  2a. dereference it and read the HTML definition.
   b. RDFS definition.
   c. OWL definition.
  3abc. google for it and read the HTML, RDFS, OWL.
  4abc. dig through contracts and search for the HTML, RDFS, OWL.
  5. laws

The person who sees "tmo:systolicBP" and says "ok, I know what that
is" (i.e. 1.) will get the impression that they can use whatever units
they choose (regardless of whether they research muo:numericalValue
and muo:measuredIn for specifics). This won't be everyone, but it will
be some folks, especially during the initial phases of standard
adoption. The proposed approach of recording units and values, but
restricting muo:measuredIn, depends on someone comprehensively reading
the OWL (or presumably, HTML) definitions for tmo:systolicBP. An
intuitive predicate label and no units reduces that risk.
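To make the two shapes concrete, here is a rough Turtle sketch of the
same reading written both ways. The namespace IRIs, the ex:patient1
subject and the 0.016 value are placeholders made up for illustration;
only the term names (tmo:systolicBP, tmo:SystolicBP,
muo:numericalValue, muo:measuredIn, u:MPa, :systolicMPa) come from
this thread.

```turtle
# Hypothetical namespace IRIs: placeholders, not the real TMO/MUO ones.
@prefix tmo: <http://example.org/tmo#> .
@prefix muo: <http://example.org/muo#> .
@prefix u:   <http://example.org/units#> .
@prefix ex:  <http://example.org/data#> .
@prefix :    <http://example.org/precise#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

# (a) Generic value + unit. The shape itself suggests that any unit
#     could appear as the object of muo:measuredIn; only the OWL
#     restriction says otherwise.
ex:patient1 tmo:systolicBP [
    a tmo:SystolicBP ;
    muo:numericalValue "0.016"^^xsd:decimal ;   # illustrative value
    muo:measuredIn u:MPa
] .

# (b) Unit-specific predicate. The unit is fixed by the predicate
#     name, so an author has nothing to vary.
ex:patient1 :systolicMPa "0.016"^^xsd:decimal .
```

Someone who only reads the predicate names in (a) could plausibly swap
u:MPa for u:mmHg and still believe the data is valid; in (b) there is
no slot in which to make that mistake.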
> > If I dereference
> > muo:numericalValue and muo:measuredUnits, I learn that I can use any
> > units (misleading).
>
> No need to dereference. muo-vocab.owl tells the machine all it needs to
> know about measuredIn, for example:
>
[I'm using "muo:" instead of "uomvocab:" below - ericP]
>
> uomvocab:measuredIn
>     a owl:FunctionalProperty,
>       owl:ObjectProperty;
>     rdfs:domain uomvocab:QualityValue;
>     rdfs:range uomvocab:UnitOfMeasurement.
>
> In particular, it tells the machine that only one unit can be associated
> with the domain and that the relationship holds between quality values and a
> unit of measurement. In what way is this misleading?

It states that any muo:UnitOfMeasurement is a valid object of
muo:measuredIn, which *implies* (barring further restrictions) that a
tmo:SystolicBP may have a muo:measuredIn of any muo:UnitOfMeasurement
(which precludes, say, "green", but not "mmHg"). The crux of my
argument lies in the "barring further restrictions" condition; TMO.owl
may say that the object of a tmo:systolicBP arc is a tmo:SystolicBP,
and that a tmo:SystolicBP must have one muo:measuredIn equal to u:MPa,
but that means we only want TMO to be used or considered by folks who
comprehensively read TMO.owl. TMO.html may also contain this, but just
reading the names of the predicates (1.), which many people do,
implies a freedom of units.

> > If I wade through the OWL for TMO, I learn that
> > there's a restriction for say:
> >
> > Class: tmo:BloodSystolicPressureReading EquivalentTo:
> >   (:value exactly 1)
> >   and (muo:measuredIn exactly u:mmHg)
> >
> > which, if I think hard, tells me that I must normalize my data, but
> > this is pretty far from follow-your-nose semantics.
>
> Yes, it is far from follow-your-nose semantics because it requires logical
> deduction to interpret whereas follow-your-nose only requires the lowest
> common denominator: (opportunistic) network lookup.
> Is there a need to follow-your-nose if the (raw) data was meant for
> machine consumption, is rich in meaning, and clearly indicates the
> artifact to use in interpreting it logically?

I think there is, particularly during the early adoption phase. I'd
like to position us to be a viable standard for clinical data
presentation and exchange. I'd like hackers from Google Health, MS
Health Vault, Indivo, MLHIM, I2B2, etc. to be able to glance at a TMO
example and start writing code to use it. I'd specifically not like to
restrict that pool of collaborators to those able to implement (either
intellectually, or by tool use) A-box OWL consistency constraints. I
understand that this flies in the face of some solid ontology
practices, but I don't think they're germane to single-unit standard
representations.

> > I think I have described why authoring is less fault-prone if the
> > normalized data in TMO uses precise predicates. Do you have other use
> > cases which override that one?
> >
> >> m.
> >>
> >>>>>> Also, having domain-independent predicates makes it easier to
> >>>>>> render a view of the data (for human consumption) that includes
> >>>>>> visual cues regarding the units of measures associated with
> >>>>>> values directly from the data since such tools will always
> >>>>>> expect the same set of terms to capture a value and its unit of
> >>>>>> measurement.
> >>>>>
> >>>>> If you've bought the argument for early normalization, isn't it
> >>>>> needlessly dangerous to offer the freedom to express BP in mmHg
> >>>>> in an ontology that's required to have BP in MPa? It does put
> >>>>> more burden on the use of generic data browsers (they'd have to
> >>>>> read the OWL in order to present the user with units), but I
> >>>>> think that use case is small compared to the cost of data
> >>>>> consumption.
> >>>>
> >>>> I don't think we should tailor our data model to generic data
> >>>> browsers - they are far too simple for the complex knowledge that
> >>>> we have to represent.
> >>>>
> >>>> m.
> >>>
> >>> --
> >>> -ericP

-- 
-ericP
Received on Sunday, 12 September 2010 18:55:56 UTC