- From: Guus Schreiber <schreiber@cs.vu.nl>
- Date: Mon, 08 Aug 2005 13:39:30 +0200
- To: Natasha Noy <noy@SMI.Stanford.EDU>
- CC: "Ralph R. Swick" <swick@w3.org>, swbp <public-swbp-wg@w3.org>
Natasha, Alan,

Here is my review. Sorry for the delay. The review is a bit biased by my use of this note in an ontology-engineering course, which mainly focused on issues wrt real-world modeling (and not on RDF/OWL details).

Guus

PS. My spelling checker wanted me to replace "reification" with "deification" :-).

Defining N-ary Relations on the Semantic Web
Editor's Draft 20 June 2005
http://smi-web.stanford.edu/people/noy/nAryRelations/n-aryRelations-2nd-WD.html

[[
Issue 1: If property instances can link only two individuals, how do we deal with cases where we need to describe the instances of relations, such as its certainty, strength, etc?

Issue 2: If instances of properties can link only two individuals, how do we represent relations among more than two individuals? ("n-ary relations")

Issue 3: If properties can link only two individuals, how do we represent relations in which one of the participants is an ordered list of individuals rather than a single individual?
]]

One could say that issue 1 is not really an n-ary relation problem, but the "how to make statements about statements" problem, i.e. an alternative for RDF reification. I propose to make this explicit in the text, and to move the issue to be the second issue.

Vocabulary (issues 1 & 2): some readers might not grasp "property instances" directly. I suggest either adding "cf. tuples" in parentheses or dropping "instances" (as done in the description of issue 3).

[[
Use case examples
]]

Again, example 3 is the prototypical n-ary relation, so maybe this should be the first example. The point is that for people from relational databases the first two examples are not "real" n-ary relations: e.g. in example 1 the probability value is functionally dependent on the person and the disease. In example 3 there is no such dependency (the primary key is the combination of all three arguments). So, reification would work with examples 1 and 2, but not with example 3 (because the instances are not unique).

[[
4. United Airlines flight 3177 visits the following airports: LAX, DFW, and JFK. There is a relation between the individual flight and the three cities that it visits, LAX, DFW, JFK. Note that the order of the airports is important and indicates the order in which the flight visits these airports.
]]

UML users may not recognize this as an n-ary relation. UML has the notion of "ordered" associations, which would handle this situation. It is in fact a binary relation where one of the arguments is not a simple individual but an ordered list of individuals. I suggest adding a UML note.

Reflecting on this, we might just want to say:
- issue 2 / example 3 describe the "real" n-ary relation issue
- issues 1 and 3 / examples 1+2 and 4 describe related but different problems that can be modeled using the same patterns.
But maybe I'm making it too complicated now.

[[
Sec. Representation patterns
...
Examples 1, 2, and 3 above correspond to this pattern. For instance, in the example 1 the instance of a new class Diagnosis_Relation would represent the fact that Christine has been diagnosed with a breast tumor with high probability.
]]

"correspond to" is too strong. I suggest rephrasing it as: "Examples 1, 2, and 3 above can be modeled with this pattern." Maybe this is a good place to indicate that examples 1 and 2 could alternatively have been represented with RDF reification.

I suggest including example 3 here, also because a name such as "Purchase" would seem to come less out of the blue than "Diagnosis_Relation".

I suggest including a UML note, indicating that pattern 1 is close to what is called an "association class" in UML.

[[
Pattern 1
]]

In line with the previous comments, I suggest changing the order of the use cases. The current use case 3 should be the first one.
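As a rough illustration of why example 3 needs a relation individual of its own, here is a minimal Python sketch of pattern 1. It is only a sketch under assumptions: triples are plain tuples rather than real RDF, and the property names (hasBuyer, hasSeller, hasObject), the participant names and the helper relation_instance are hypothetical, not taken from the draft.

```python
from itertools import count

_ids = count(1)  # counter used to mint a fresh node name per relation instance


def relation_instance(relation_class, **participants):
    """Create one individual of an n-ary relation class; return (node, triples)."""
    node = f"_:{relation_class}_{next(_ids)}"
    triples = [(node, "rdf:type", relation_class)]
    # One binary property per participant, all hanging off the new individual.
    triples += [(node, prop, value) for prop, value in participants.items()]
    return node, triples


# Hypothetical participants: the same buyer, seller and object, twice.
roles = {"hasBuyer": "Buyer_A", "hasSeller": "Seller_B", "hasObject": "Book_C"}
purchase_1, triples_1 = relation_instance("Purchase", **roles)
purchase_2, triples_2 = relation_instance("Purchase", **roles)

# The two purchases have identical participants but remain distinct individuals.
same_participants = [t[1:] for t in triples_1] == [t[1:] for t in triples_2]
print(purchase_1 != purchase_2, same_participants)
```

Because every Purchase gets its own node, two purchases with the same participants do not collapse into one, which is the situation where the participants alone do not identify the relation instance.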
[[
Use Case 1: additional attributes describing a relation
]]

I've tried to explain the modeling solution in my ontology-engineering class and observed the following:

- It requires "breast tumor" to be treated as an instance, where it will usually be a class (one could see it as a use case for the "classes as values" note). I suggest considering the use of an instance of BreastTumor as the value. This also has the advantages described in the value-partition note (it is easy to add later the statement that MyBreastTumor is an instance of a subclass of "BreastTumor").

- There are two other solutions which are worth discussing as alternatives:

  1. Person -> hasDiagnosis -> Disease -> hasProbability -> Number
     This would work if the instance of disease is not "BreastTumor" but a unique instance of BreastTumor. By the way, I do not think this solution would work in practice, as a statement about a diagnosis with a certain probability is always time dependent (which we cannot easily add).

  2. Representing Diagnosis in a similar way as Purchase. My students found this solution easier to understand (for whatever it is worth). They found the juxtaposition of BreastTumor and Probability weird, as the second is clearly dependent on the first. The only real difference of course is the direction of the hasDiagnosis property.

[[
Use Case 2: different aspects of the same relation
]]

This use case is a better example than use case 1 of how to use the pattern for avoiding the use of RDF reification. A drastic solution could be to drop use case 1 altogether and keep this one in.

Adding time information to this example would make it more realistic. "TemperatureObservation" would be a good name for this relation. I think this use case is close to the Observation pattern in Fowler's book on Analysis Patterns (I tried to verify this, but I cannot find my copy of the book).
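The condition attached to alternative 1 for use case 1 (a unique instance of BreastTumor per diagnosis) can be made concrete with a small Python sketch. It rests on assumptions: triples are plain tuples, the second patient "Anna", the node names BreastTumor_1 and BreastTumor_2, and the helpers diagnose and probabilities_for are hypothetical and only serve the illustration.

```python
def diagnose(triples, person, tumor_node, probability):
    """Record Person -> hasDiagnosis -> Disease -> hasProbability -> value."""
    triples.append((person, "hasDiagnosis", tumor_node))
    triples.append((tumor_node, "hasProbability", probability))


def probabilities_for(triples, person):
    """Follow the hasDiagnosis / hasProbability chain starting at one person."""
    tumors = [o for s, p, o in triples if s == person and p == "hasDiagnosis"]
    return sorted(o for s, p, o in triples if s in tumors and p == "hasProbability")


# One shared "BreastTumor" node: the probabilities of both patients get mixed up.
shared = []
diagnose(shared, "Christine", "BreastTumor", "high")
diagnose(shared, "Anna", "BreastTumor", "low")
print(probabilities_for(shared, "Christine"))  # ['high', 'low']

# A unique tumor individual per diagnosis keeps the probabilities apart.
unique = []
diagnose(unique, "Christine", "BreastTumor_1", "high")
diagnose(unique, "Anna", "BreastTumor_2", "low")
print(probabilities_for(unique, "Christine"))  # ['high']
```

The sketch does not address the time dependence of a diagnosis mentioned above; that would still require an additional individual or property.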
[[
Use Case 3: N-ary relation with no distinguished participant
]]

I think it is worthwhile to point out that in use case 3 the domain actually provides a natural name for the relation as a whole, namely "Purchase". There are many such nouns that represent static aspects of an activity and thus are candidates for this pattern: "transaction", "enrollment", "subscription". This makes it different from use cases 1 and 2 (but see also my remarks there).

[[
Pattern 2: Using lists for arguments in a relation
]]

Alternatives which avoid the use of RDF lists would be worth mentioning:

1. A Flight is linked to a number of FlightPorts. FlightPort is a class representing the relation between a port and its sequence number in the Flight. I find this rather ugly, but it is in a sense close to the way use case 1 is represented.

2. A Flight is linked to a number of FlightMovement instances. Each FlightMovement represents a relation between from/to airports. This would probably be my preferred solution.

--
Free University Amsterdam, Computer Science
De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
Tel: +31 20 598 7739/7718; E-mail: schreiber@cs.vu.nl
Home page: http://www.cs.vu.nl/~guus/
Received on Monday, 8 August 2005 11:39:40 UTC