- From: Alan Rector <rector@cs.man.ac.uk>
- Date: Fri, 20 Aug 2004 17:37:35 +0100
- To: Natasha Noy <noy@SMI.Stanford.EDU>
- CC: Samson Tu <tu@SMI.Stanford.EDU>, public-swbp-wg@w3.org
Samson, Natasha

I think there is more to be bottomed out here.

I don't think it is generally the case that you can derive the 'key' in the database sense from the OWL (or other logical) relationship, although you can identify potential keys, i.e. sets of entity types that uniquely determine the rest. In the example "At 2004/07/31 10:00, Christine's temperature is 39 degrees" I could - and probably would - argue that the key is Christine + date/time and the value is 39 degrees. The choice of 'key' from the 'potential keys' is an implementation and usage issue rather than a 'logical' issue.

If I get the cardinalities right: Christine has many Temperatures, each of which has exactly one time and one value. Each temperature is the temperature for exactly one person. Each time is the time for exactly one temperature for one person.

We can express all this in FoL, but in DLs or OWL it gets more complicated. Roughly and briefly, and assuming we get the details of data types sorted, one might try:

Feature
    QuantitativeFeature
        TemperatureFeature

hasTemperatureFeature
    object_property
    multi-valued
    domain: Person; range: TemperatureFeature

has_magnitude
    data_type_property
    functional
    domain: QuantitativeFeature; range: number

has_time_value
    data_type_property
    functional & inverse_functional
    domain: Feature; range: time_data_type

I don't have my language lawyer papers with me so I am not sure if you can have inverse functional data type properties. If you can't, then the above becomes even more complicated and you can't quite express all the constraints in OWL.

However, the above doesn't work because we have a general has_time_value property which is inverse functional, meaning one time can only be the time of one thing. We would need a separate subproperty for the time stamp for each feature, feasible but very clumsy.
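To make the 'potential keys' point and the has_time_value problem concrete, here is a minimal sketch over a handful of made-up temperature readings. The data, the second person, and the function names are all assumptions for illustration only. Note also that instance data can only suggest potential keys for that data set; it cannot prove the constraint that the model is meant to state.

```python
from collections import Counter
from itertools import combinations

# Hypothetical readings as (person, time, value) rows; invented for illustration.
observations = [
    ("Christine", "2004-07-31T10:00", 39.0),
    ("Christine", "2004-07-31T14:00", 38.2),
    ("Christine", "2004-07-31T18:00", 39.0),
    ("Samson",    "2004-07-31T10:00", 36.8),
    ("Samson",    "2004-07-31T14:00", 38.2),
]

def determines(rows, key_idx):
    """True if the columns in key_idx take a distinct combination in every row."""
    seen = set()
    for row in rows:
        key = tuple(row[i] for i in key_idx)
        if key in seen:
            return False
        seen.add(key)
    return True

def potential_keys(rows, n_fields):
    """Minimal column sets that uniquely identify each row of this data."""
    keys = []
    for size in range(1, n_fields + 1):
        for idx in combinations(range(n_fields), size):
            # Skip supersets of a key already found, so only minimal keys remain.
            if any(set(k) <= set(idx) for k in keys):
                continue
            if determines(rows, idx):
                keys.append(idx)
    return keys

def shared_times(rows):
    """Times attached to more than one reading, i.e. the cases that a single
    globally inverse-functional has_time_value property would rule out."""
    counts = Counter(time for _, time, _ in rows)
    return sorted(t for t, n in counts.items() if n > 1)

print(potential_keys(observations, 3))  # column indices: 0=person, 1=time, 2=value
print(shared_times(observations))
```

On this sample the only minimal key is person + time, and two of the times are shared between the two people, which is exactly the situation a general inverse-functional has_time_value cannot accommodate.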
A smoother solution - and the source of my intuition that it is the Person and Time that are the natural key - is the following, in more general versions:

Person
    Person_as_observed_at_some_time = Person & restriction(observed_at some Time)

hasTemperatureFeature
    object_property
    functional
    domain: Person_as_observed_at_some_time; range: TemperatureFeature

etc.

(These constraints are rather more easily and clearly expressed using QCRs, but that's another story.)

If you don't like the notion of an entity Person_as_observed_at_some_time, the same effect can be had by using a wrapper such as "Situation" and then saying that it consists of a subject and a time. But this is perhaps more awkward because this is not exactly what the usual use of "situation" is in situation calculus and the like, although related to it.

I'd prefer the person_at_some_time construct myself.

and see below

The key point is that has_time_value is inverse

Natasha Noy wrote:

> Samson,
>
> > However, there is a possible semantic distinction between the two that
> > should be modeled explicitly. "Christine has breast tumor" uniquely
> > determines "high probability." If you think of
> > diagnostic-relation/person/disease/probability as a relation, then the
> > first 3 forms a key.

or the first two form a key for a value that is a probability-disease pair, which is closer to the version in the paper. The encapsulation can go either way. Or they can be in a neutral environment where the query can be done either way and what matters is the uniqueness.

> >
> All the more reason to replace this example. Indeed -- and this has
> come up before -- there is a question of what we are modeling here.
> Does Christine have the tumor or does she just have the high (or low)
> probability of having it (that is, we are not saying anything about her
> actually having this tumor).
> You can validly say either one of those
> things (and it was the latter that we had in mind for the example),
> but, perhaps, to avoid confusion we should just change the example?

Perhaps, but it is an important enough example to want to nail it at some point. Also I don't understand your distinction above, and I am not sure that a unique interpretation can be put on the formal representation. I can't think of any interpretation in which her having a tumour, and then having the probability, on the relation, but then having the tumour definitely makes sense.

Maybe this is a question of labelling. If we called the relation "has_diagnosis_with_probability" that would clarify the reading. But I think this is a labelling issue.

> >
> > A use case "At 2004/07/31 10:00, Christine's temperature is 39
> > degrees" illustrates the distinction better than the "Steve" use
> > case. In the temperature/person/value/time-stamp relation,
> > temperature/person/value triplet does not determine the time stamp. In
> > that sense, the "value/time-stamp" pair is the real "value" of the
> > binary temperature relation.
>
> You nailed the issue here (better than we have!) The idea of two
> different use cases was that in one we wanted to have some additional
> information that is ancillary to the relation and in the other we want
> to have all the components to form a "key". Something like severity of
> a particular symptom can be a better example for the former. I would
> like to avoid getting into timestamps though as this is another one of
> those issues that people have strong opinions about and will only cloud
> the problem we are discussing in this note.
>
> Natasha

--
Alan L Rector
Professor of Medical Informatics
Department of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL: +44-161-275-6188/6149/7183
FAX: +44-161-275-6236/6204
Room: 2.88a, Kilburn Building
email: rector@cs.man.ac.uk
web: www.cs.man.ac.uk/mig
     www.opengalen.org
     www.clinical-escience.org
     www.co-ode.org
Received on Friday, 20 August 2004 17:59:05 UTC