Re: [OEP] The n-ary relations draft is ready for outside review from Guus Schreiber on 2005-09-15 (public-swbp-wg@w3.org from September 2005)

From: Guus Schreiber <schreiber@cs.vu.nl>
Date: Thu, 15 Sep 2005 13:57:35 +0200
To: Natasha Noy <noy@stanford.edu>
CC: swbp <public-swbp-wg@w3.org>
Message-ID: <432961AF.9010800@cs.vu.nl>
Natasha Noy wrote:
> Guus,
> 
> Thanks a lot again for your review. At the OEP teleconference today, we
> had a long discussion about the points you made in your review that
> were not addressed by the new draft published last week [1].
> 
> The feeling of the task force is that it is ok to leave the note as it
> is. It really seems to be in the eye of the beholder what is a "real"
> n-ary relation and what is not. And there was general agreement that
> RDF reification doesn't really address the issue brought out in the
> examples in the note. Thus, the treatment that the note already gives
> RDF reification (explaining that it addresses a different issue) seems
> sufficient.
> 
> Would you be comfortable moving forward with the note in the state it
> is now?

Natasha,

In general OK. This weekend I will reply to the points you raised in 
your response to my comments. I expect that will lead to some smaller 
changes.

Guus



> 
> Thanks a lot,
> 
> Natasha
> 
> [1] http://lists.w3.org/Archives/Public/public-swbp-wg/2005Sep/0019.html
> 
> 
> On Aug 8, 2005, at 4:39 AM, Guus Schreiber wrote:
> 
>> Natasha, Alan,
>>
>> Here is my review. Sorry for the delay. The review is a bit biased by
>> my use of this note in an ontology-engineering course, which mainly
>> focused on issues wrt real-world modeling (and not on RDF/OWL details).
>>
>> Guus
>>
>> PS. My spelling checker wanted me to replace "reification" with
>> "deification" :-).
>>
>>
>> Defining N-ary Relations on the Semantic Web
>> Editor's Draft 20 June 2005
>> http://smi-web.stanford.edu/people/noy/nAryRelations/n-aryRelations-2nd-WD.html
>>
>> [[
>>   Issue 1: If property instances can link only two individuals, how do
>>   we deal with cases where we need to describe the instances of
>>   relations, such as its certainty, strength, etc?
>>
>>   Issue 2: If instances of properties can link only two individuals,
>>   how do we represent relations among more than two individuals?
>>   ("n-ary relations")
>>
>>   Issue 3: If properties can link only two individuals, how do we
>>   represent relations in which one of the participants is an ordered
>>   list of individuals rather than a single individual?
>> ]]
>>
>> One could say issue 1 is not really an n-ary relation problem, but the
>> "how to make statements about statements" problem, i.e. an
>> alternative for RDF reification. I propose to make this explicit in
>> the text, and move the issue to be the second issue.
>>
>> Vocabulary (issue 1 & 2): some readers might not grasp "property
>> instances" directly. Suggest to either add in parentheses "cf. tuples"
>> or drop "instances" (as done in the description of issue 3).
>>
>> [[
>>   Use case examples
>> ]]
>>
>> Again, example 3 is the prototypical n-ary relation, so maybe this
>> should be the first example. The point is that for people from
>> relational databases the first two examples are not "real" n-ary
>> relations: e.g. in example 1 the probability value is functionally
>> dependent on the person and the disease. In example 3 there is no such
>> dependency (the primary key is the combination of all three
>> arguments). So, reification would work with examples 1 and 2, but not
>> with example 3 (because the instances are not unique).
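The functional-dependency argument above can be checked mechanically. Below is a minimal Python sketch, not part of the note: the helper `determines` and all row values are made up for illustration. It tests whether a set of columns determines another column in a small table of tuples.

```python
from collections import defaultdict


def determines(rows, key_cols, value_col):
    """Return True if the columns in key_cols functionally determine value_col."""
    seen = defaultdict(set)
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        seen[key].add(row[value_col])
    # A dependency holds when every key maps to exactly one value.
    return all(len(values) == 1 for values in seen.values())


# Example 1 style rows (person, disease, probability); illustrative data only.
diagnoses = [
    {"person": "Christine", "disease": "BreastTumor", "probability": "high"},
    {"person": "Christine", "disease": "Flu", "probability": "low"},
]

# Example 3 style rows (buyer, seller, object); illustrative data only.
purchases = [
    {"buyer": "buyer_a", "seller": "seller_x", "object": "book_1"},
    {"buyer": "buyer_a", "seller": "seller_x", "object": "book_2"},
    {"buyer": "buyer_a", "seller": "seller_y", "object": "book_1"},
    {"buyer": "buyer_b", "seller": "seller_x", "object": "book_1"},
]

# Probability depends on (person, disease) in the example 1 data.
diagnosis_fd = determines(diagnoses, ("person", "disease"), "probability")

# In the example 3 data, no two arguments determine the third.
purchase_fds = [
    determines(purchases, ("buyer", "seller"), "object"),
    determines(purchases, ("buyer", "object"), "seller"),
    determines(purchases, ("seller", "object"), "buyer"),
]
```

With this data, `diagnosis_fd` holds while every entry of `purchase_fds` fails, which is the distinction drawn between examples 1 and 3.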
>>
>> [[
>>   4. United Airlines flight 3177 visits the following airports: LAX,
>>   DFW, and JFK. There is a relation between the individual flight and
>>   the three cities that it visits, LAX, DFW, JFK. Note that the order
>>   of the airports is important and indicates the order in which the
>>   flight visits these airports.
>> ]]
>>
>> UML users may not recognize this as an n-ary relation. UML has the
>> notion of "ordered" associations, which would handle this
>> situation. It is in fact a binary relation where one of the arguments
>> is not a simple individual but an ordered list of individuals. I
>> suggest to add a UML note.
>>
>> Reflecting on this, we might just want to say:
>> - issue 2 / example 3 describe the "real" n-ary relation issue
>> - issue 1 and 3 / example 1+2 and 4 describe related but different
>> problems that can be modeled using the same patterns.
>> But maybe I'm making it too complicated now.
>>
>> [[
>> Sec. Representation patterns
>>
>>   ... Examples 1, 2, and 3 above correspond to this pattern. For instance,
>>   in the example 1 the instance of a new class Diagnosis_Relation
>>   would represent the fact that Christine has been diagnosed with a
>>   breast tumor with high probability.
>> ]]
>>
>> "correspond to" is too strong. Suggest to rephrase as: "Examples 1, 2,
>> and 3 above can be modeled with this pattern.".
>>
>> Maybe it is a good place here to indicate that example 1 and 2
>> could alternatively have been represented with RDF reification.
>>
>> I suggest to include example 3 here, also because a name such
>> as "Purchase" would seem to come less out of the blue than
>> "Diagnosis_Relation".
>>
>> I suggest to include a UML note, indicating that pattern 1 is
>> close to what is called an "association class" in UML.
>>
>> [[
>>   Pattern 1
>> ]]
>>
>> In line with the previous comments, I suggest to change the order of
>> the use cases. The current use case 3 should be the first one.
>>
>> [[
>>   Use Case 1: additional attributes describing a relation
>> ]]
>>
>> I've tried to explain the modeling solution in my
>> ontology-engineering class and observed the following:
>>
>> - it requires "breast tumor" to be treated as an instance, where it
>> will usually be a class (one could see it as a use case for the
>> "classes as values" note).
>>
>>   I suggest to consider using an instance of BreastTumor as the
>>   value. This also has the advantages described in the value-partition
>>   note (easy to add later the statement that MyBreastTumor is an
>>   instance of a subclass of "BreastTumor").
>>
>> - there are two other solutions which are worth discussing as
>> alternatives:
>>
>>   1. Person -> hasDiagnosis -> Disease -> hasProbability -> Number
>>   This would work if the instance of disease is not "BreastTumor" but
>>   a unique instance of BreastTumor.  By the way, I do not think this
>>   solution would work in practice, as a statement about a diagnosis
>>   with a certain probability is always time dependent (which we cannot
>>   easily add).
>>
>>   2. Representing Diagnosis in a similar way as Purchase.
>>   My students found this solution easier to understand (for whatever
>>   it is worth). They found the juxtaposition of BreastTumor and
>>   Probability weird, as the second is clearly dependent on the
>>   first. The only real difference of course is the direction of the
>>   hasDiagnosis property.
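As a rough illustration of alternative 2 combined with the "instance of BreastTumor" suggestion, here is a small Python sketch using plain (subject, property, value) tuples rather than an RDF library. The individual `Diagnosis_1` and the property names `hasPatient`, `hasDisease`, and `type` are assumptions made for this sketch; only `hasProbability`, `MyBreastTumor`, and `BreastTumor` come from the discussion above.

```python
# Triples as plain tuples; Diagnosis_1, hasPatient, hasDisease and "type"
# are hypothetical names chosen for this sketch.
triples = [
    ("Diagnosis_1", "type", "Diagnosis"),
    ("Diagnosis_1", "hasPatient", "Christine"),
    ("Diagnosis_1", "hasDisease", "MyBreastTumor"),
    ("Diagnosis_1", "hasProbability", "high"),
    # The disease value is an individual, not the class itself.
    ("MyBreastTumor", "type", "BreastTumor"),
]


def values(subject, prop):
    """All values of prop for the given subject."""
    return [o for s, p, o in triples if s == subject and p == prop]


def diagnoses_of(person):
    """All Diagnosis individuals that point at the given patient."""
    return [s for s, p, o in triples if p == "hasPatient" and o == person]


diagnosis = diagnoses_of("Christine")[0]
disease = values(diagnosis, "hasDisease")[0]
disease_class = values(disease, "type")
probability = values(diagnosis, "hasProbability")
```

Because the diagnosis is its own individual here, a time stamp could be attached to it as one more triple without touching the person or the disease.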
>>
>> [[
>>   Use Case 2: different aspects of the same relation
>> ]]
>>
>> This use case is a better example than use case 1 of how to use the
>> pattern for avoiding the use of RDF reification.
>>
>> A drastic solution could be to drop use case 1 altogether and keep this
>> one in. Adding time information to this example would make it more
>> realistic.
>>
>> "TemperatureObservation" would be a good name for this relation. I think
>> this use case is close to the Observation pattern in Fowler's book on
>> Analysis Patterns (I tried to verify this, but I cannot find my
>> copy of the book).
>>
>> [[
>>   Use Case 3: N-ary relation with no distinguished participant
>> ]]
>>
>> I think it is worthwhile to point out that in use case 3 the
>> domain actually provides a natural name for the relation as a whole,
>> namely "Purchase". There are many of these nouns that represent static
>> aspects of an activity and thus are candidates for this pattern:
>> "transaction", "enrollment", "subscription". This makes it different
>> from use cases 1 and 2 (but see also my remarks there).
>>
>> [[
>>   Pattern 2: Using lists for arguments in a relation
>> ]]
>>
>> Alternatives which avoid the use of RDF lists would be worth
>> mentioning:
>>
>> 1. A Flight is linked to a number of FlightPorts. Each FlightPort is a
>> class, representing the relation between a port and its sequence number
>> in the Flight. I find this rather ugly, but it is in a sense close to
>> the way use case 1 is represented.
>>
>> 2. A Flight is linked to a number of FlightMovement instances. Each
>> FlightMovement represents a relation between from/to
>> airports. This would probably be my preferred solution.
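To make alternative 2 concrete, here is a minimal Python sketch of the FlightMovement idea, using the LAX, DFW, JFK example from the draft. The class fields, the helper names `movements` and `route`, and the flight label "UA3177" are assumptions for illustration. The sketch also assumes that a flight never visits the same airport twice, which is what lets the order be recovered from unordered from/to pairs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlightMovement:
    """One leg of a flight: a from/to pair of airports (hypothetical fields)."""
    flight: str
    origin: str
    destination: str


def movements(flight, airports):
    """Turn an ordered airport list into one FlightMovement per leg."""
    return [FlightMovement(flight, a, b) for a, b in zip(airports, airports[1:])]


def route(legs):
    """Recover the airport order from legs given in any order.

    Assumes a simple path: no airport is visited twice.
    """
    next_stop = {leg.origin: leg.destination for leg in legs}
    destinations = set(next_stop.values())
    # The starting airport is the only origin that is never a destination.
    start = next(o for o in next_stop if o not in destinations)
    ordered = [start]
    while ordered[-1] in next_stop:
        ordered.append(next_stop[ordered[-1]])
    return ordered


legs = movements("UA3177", ["LAX", "DFW", "JFK"])
recovered = route(list(reversed(legs)))
```

No list construct is needed in this representation: the ordering is implicit in how the from/to pairs chain together.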
>>
>> -- 
>> Free University Amsterdam, Computer Science
>> De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
>> Tel: +31 20 598 7739/7718; E-mail: schreiber@cs.vu.nl
>> Home page: http://www.cs.vu.nl/~guus/
>>
> 

-- 
Free University Amsterdam, Computer Science
De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
Tel: +31 20 598 7739/7718; E-mail: schreiber@cs.vu.nl
Home page: http://www.cs.vu.nl/~guus/
Received on Thursday, 15 September 2005 11:57:51 UTC