RE: [OEP] The n-ary relations draft is ready for outside review

Excellent comments, Guus!  A few remarks are given below.

============================================
Mike Uschold
Tel: 425 865-3605              Fax: 425 865-2965
============================================



>  -----Original Message-----
>  From: Guus Schreiber [mailto:schreiber@cs.vu.nl] 
>  Sent: Monday, August 08, 2005 4:40 AM
>  To: Natasha Noy
>  Cc: Ralph R. Swick; swbp
>  Subject: Re: [OEP] The n-ary relations draft is ready for 
>  outside review
>  
>  
>  
>  Natasha, Alan,
>  
>  Here is my review. Sorry for the delay. The review is a bit biased
>  by my use of this note in an ontology-engineering course, which
>  mainly focused on issues wrt real-world modeling (and not on
>  RDF/OWL details).
>  
>  Guus
>  
>  PS. My spelling checker wanted me to replace "reification" with 
>  "deification" :-).
>  
>  
>  Defining N-ary Relations on the Semantic Web
>  Editor's Draft 20 June 2005 
>  http://smi-web.stanford.edu/people/noy/nAryRelations/n-aryRelations-2nd-WD.html
>  
>  [[
>     Issue 1: If property instances can link only two individuals,
>     how do we deal with cases where we need to describe the instances
>     of relations, such as its certainty, strength, etc?
>  
>     Issue 2: If instances of properties can link only two individuals,
>     how do we represent relations among more than two individuals?
>     ("n-ary relations")
>  
>     Issue 3: If properties can link only two individuals, how do we
>     represent relations in which one of the participants is an ordered
>     list of individuals rather than a single individual?
>  ]]
>  
>  One could say this is not really an n-ary relation problem,
>  but the "how to make statements about statements" problem,
>  i.e. an alternative for RDF reification. I propose to make
>  this explicit in the text, and move the issue to be the second issue.


Excellent point, see also my comment below.

>  
>  Vocabulary (issue 1 & 2): some readers might not grasp 
>  "property instances" directly. Suggest to either add in 
>  parentheses "cf. tuples" or drop "instances" (as done in the 
>  description of issue 3).


I agree with this.

>  
>  [[
>     Use case examples
>  ]]
>  
>  Again, example 3 is the prototypical n-ary relation, so
>  maybe this should be the first example. The point is that
>  for people from relational databases the first two examples
>  are not "real" n-ary relations: e.g. in example 1 the
>  probability value is functionally dependent on the person and
>  the disease. In example 3 there is no such dependency (the
>  primary key is the combination of all three arguments). So,
>  reification would work with examples 1 and 2, but not with
>  example 3 (because the instances are not unique).

Excellent point. I don't think any of us would want to try to precisely
define what a 'real' n-ary relation is.  Maybe it is safest to say
something like: Here are three cases where it is natural and/or
convenient to use n-ary relations.   
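
To make the functional-dependency distinction a bit more concrete, here
is a minimal Python sketch. The tuples and the helper name
is_functionally_dependent are made up for illustration (only the name
Christine comes from the draft); the helper merely checks whether the
last column of each tuple is determined by the columns before it.

```python
from collections import defaultdict

def is_functionally_dependent(rows):
    """Return True if the last column of every row is determined by the other columns."""
    seen = defaultdict(set)
    for *key, value in rows:
        seen[tuple(key)].add(value)
    return all(len(values) == 1 for values in seen.values())

# Hypothetical diagnosis tuples: (person, disease, probability).
diagnoses = [
    ("Christine", "BreastTumor", "high"),
    ("Anna", "BreastTumor", "low"),
]

# Hypothetical purchase tuples: (buyer, seller, item).
purchases = [
    ("John", "ShopX", "BookA"),
    ("John", "ShopX", "BookB"),
]

print(is_functionally_dependent(diagnoses))  # True
print(is_functionally_dependent(purchases))  # False
```

In the first table the pair (person, disease) acts as a key, so the
probability reads as a statement about that pair. In the second table
only the full tuple identifies a row, which is the situation Guus
describes for example 3.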

>  
>  [[
>     4. United Airlines flight 3177 visits the following airports: LAX,
>     DFW, and JFK. There is a relation between the individual flight
>     and the three cities that it visits, LAX, DFW, JFK. Note that the
>     order of the airports is important and indicates the order in
>     which the flight visits these airports.
>  ]]
>  
>  UML users may not recognize this as an n-ary relation. UML 
>  has the notion of "ordered" associations, which would handle 
>  this situation. It is in fact a binary relation where one of 
>  the arguments is not a simple individual but an ordered list 
>  of individuals. I suggest to add a UML note.
>  
>  Reflecting on this, we might just want to say:
>  - issue 2 / example 3 describe the "real" n-ary relation issue
>  - issue 1 and 3 / example 1+2 and 4 describe related but 
>  different problems that can be modeled using the same 
>  patterns. But maybe I'm making it too complicated now.

Yes, this is too complicated; see my comment above for how we could
handle this.

>  
>  [[
>  Sec. Representation patterns
>  
>     ... Examples 1, 2, and 3 above correspond to this pattern. For
>     instance, in the example 1 the instance of a new class
>     Diagnosis_Relation would represent the fact that Christine has
>     been diagnosed with a breast tumor with high probability.
>  ]]
>  
>  "correspond to" is too strong. Suggest to rephrase as: 
>  "Examples 1, 2, and 3 above can be modeled with this pattern.".

Right, this aligns with my suggestion for all the examples: they can
naturally and/or conveniently be modeled using n-ary relations.
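
As a rough illustration of what "can be modeled with this pattern"
means, the sketch below writes example 1 as plain (subject, property,
object) triples in Python. Christine and Diagnosis_Relation come from
the draft text quoted above; the individual Diagnosis_1, the property
names and the helper describe are assumptions of mine, not the
vocabulary of the note.

```python
# Each statement is a plain (subject, property, object) triple.
# Identifiers other than Christine and Diagnosis_Relation are hypothetical.
triples = [
    ("Christine", "has_diagnosis", "Diagnosis_1"),
    ("Diagnosis_1", "rdf:type", "Diagnosis_Relation"),
    ("Diagnosis_1", "diagnosis_value", "Breast_Tumor_Christine"),
    ("Diagnosis_1", "diagnosis_probability", "HIGH"),
]

def describe(node, triples):
    """Collect the properties attached to one individual as a dict."""
    return {p: o for s, p, o in triples if s == node}

diagnosis = describe("Diagnosis_1", triples)
print(diagnosis["diagnosis_value"], diagnosis["diagnosis_probability"])
```

The point of the intermediate individual is that every extra argument
of the relation becomes one more binary property on Diagnosis_1, so no
single property has to link more than two things.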

>  
>  Maybe it is a good place here to indicate that example 1 and 
>  2 could alternatively have been represented with RDF reification.
>  
>  I suggest to include example 3 here, also because a name 
>  such as "Purchase" would seem to come less out of the blue 
>  than "Diagnosis_Relation".
>  
>  I suggest to include a UML note, indicating that pattern 1 
>  is close to what is called an "association class" in UML.
>  
>  [[
>     Pattern 1
>  ]]
>  
>  In line with the previous comments, I suggest to change the 
>  order of the use cases. The current use case 3 should be the 
>  first one.
>  
>  [[
>     Use Case 1: additional attributes describing a relation
>  ]]
>  
>  I've tried to explain the modeling solution in my
>  ontology-engineering class and observed the following:
>  
>  - it requires "breast tumor" to be treated as an instance, 
>  where it will usually be a class (one could see it as a use 
>  case for the "classes as values" note).
>  
>     I suggest to consider using an instance of BreastTumor as the
>     value. This also has the advantages described in the
>     value-partition note (easy to add later the statement that
>     MyBreastTumor is an instance of a subclass of "BreastTumor").
>  
>  - there are two other solutions which are worth discussing as
>  alternatives:
>  
>     1. Person -> hasDiagnosis -> Disease -> hasProbability -> Number
>     This would work if the instance of disease is not "BreastTumor" but
>     a unique instance of BreastTumor.  By the way, I do not think this
>     solution would work in practice, as a statement about a diagnosis
>     with a certain probability is always time dependent (which we
>     cannot easily add).
>  
>     2. Representing Diagnosis in a similar way as Purchase.
>     My students found this solution easier to understand (for whatever
>     it is worth). They found the juxtaposition of BreastTumor and
>     Probability weird, as the second is clearly despondent on the
>     first. The only real difference of course is the direction of the
>     hasDiagnosis property.

I guess you mean 'dependent', not 'despondent'. The latter is what
Natasha might be feeling with yet another round of challenging technical
points to weave into the note :-).
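
On alternative 1 above, the condition about a unique instance of
BreastTumor can be seen in a small Python sketch. The second patient
(Anna), the numeric probabilities and the helper probabilities_for are
invented for illustration; the property names follow Guus's chain
Person -> hasDiagnosis -> Disease -> hasProbability -> Number.

```python
def probabilities_for(person, triples):
    """Follow person -> hasDiagnosis -> disease -> hasProbability and collect the values."""
    diseases = [o for s, p, o in triples if s == person and p == "hasDiagnosis"]
    return sorted(o for s, p, o in triples if s in diseases and p == "hasProbability")

# Both persons point at the same shared node: the probabilities get mixed up.
shared = [
    ("Christine", "hasDiagnosis", "BreastTumor"),
    ("Anna", "hasDiagnosis", "BreastTumor"),
    ("BreastTumor", "hasProbability", 0.9),
    ("BreastTumor", "hasProbability", 0.2),
]

# Each person gets her own instance of BreastTumor: the values stay apart.
unique = [
    ("Christine", "hasDiagnosis", "BreastTumor_1"),
    ("Anna", "hasDiagnosis", "BreastTumor_2"),
    ("BreastTumor_1", "hasProbability", 0.9),
    ("BreastTumor_2", "hasProbability", 0.2),
]

print(probabilities_for("Christine", shared))  # [0.2, 0.9]
print(probabilities_for("Christine", unique))  # [0.9]
```

With the shared node there is no way to tell which probability belongs
to which person. The sketch does not address the time-dependence
problem Guus raises for this alternative.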

>  [[
>     Use Case 2: different aspects of the same relation
>  ]]
>  
>  This use case is a better example than use case 1 of how to 
>  use the pattern for avoiding the use of RDF reification.
>  
>  A drastic solution could be to drop use case 1 altogether 
>  and keep this one in. Adding time information to this 
>  example would make it more realistic.
>  
>  "TemperatureObservation" would be a good name for this 
>  relation. I think this use case is close to the Observation 
>  pattern in Fowler's book on Analysis Patterns (I tried to 
>  verify this, but I cannot find my copy of the book).
>  
>  [[
>     Use Case 3: N-ary relation with no distinguished participant
>  ]]
>  
>  I think it is worthwhile to point out that in use case 3 the 
>  domain actually provides a natural name for the relation as 
>  a whole, namely "Purchase". There are many of these nouns 
>  that represent static aspects of an activity and thus are 
>  candidates for this pattern: "transaction", "enrollment", 
>  "subscription". This makes it different from use cases 1 and 
>  2 (but see also my remarks there).
>  
>  [[
>     Pattern 2: Using lists for arguments in a relation
>  ]]
>  
>  Alternatives which avoid the use of RDF lists would be worth
>  mentioning:
>  
>  1. A Flight is linked to a number of FlightPorts. Each
>  FlightPort is a class, representing the relation between a 
>  port and its sequence number in the Flight. I find this 
>  rather ugly, but it is in a sense close to the way use case 
>  1 is represented.
>  
>  2. A Flight is linked to a number of FlightMovement 
>  instances. Each FlightMovement represents a relation
>  between from/to airports. This would probably be my 
>  preferred solution.
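
The two list-free alternatives above can be compared in a small Python
sketch, given here only as an illustration under some assumptions: the
flight identifier UA3177, the record layouts and both helper functions
are hypothetical, and the second helper assumes a simple route in which
no airport is visited twice.

```python
# Alternative 1: (flight, airport, sequence number) records, FlightPort-like.
flight_ports = [
    ("UA3177", "DFW", 2),
    ("UA3177", "LAX", 1),
    ("UA3177", "JFK", 3),
]

# Alternative 2: (flight, from, to) records, FlightMovement-like.
flight_movements = [
    ("UA3177", "DFW", "JFK"),
    ("UA3177", "LAX", "DFW"),
]

def route_from_ports(ports):
    """Order the airports by their sequence number."""
    return [airport for _, airport, _ in sorted(ports, key=lambda r: r[2])]

def route_from_movements(movements):
    """Chain the from/to legs, starting at the airport that is never a destination."""
    next_stop = {src: dst for _, src, dst in movements}
    start = (set(next_stop) - set(next_stop.values())).pop()
    route = [start]
    while route[-1] in next_stop:
        route.append(next_stop[route[-1]])
    return route

print(route_from_ports(flight_ports))          # ['LAX', 'DFW', 'JFK']
print(route_from_movements(flight_movements))  # ['LAX', 'DFW', 'JFK']
```

Both representations recover the same ordered route. The first needs an
explicit sequence number on each record, while the second carries the
order implicitly in the from/to legs.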
>  
>  -- 
>  Free University Amsterdam, Computer Science
>  De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
>  Tel: +31 20 598 7739/7718; E-mail: schreiber@cs.vu.nl
>  Home page: http://www.cs.vu.nl/~guus/
>  
>  

Received on Monday, 8 August 2005 17:13:48 UTC