Re: [OEP] The n-ary relations draft is ready for outside review from Guus Schreiber on 2005-10-05 (public-swbp-wg@w3.org from October 2005)

From: Guus Schreiber <schreiber@cs.vu.nl>
Date: Wed, 05 Oct 2005 22:28:14 +0200
To: Natasha Noy <noy@smi.stanford.edu>
CC: public-swbp-wg@w3.org
Message-ID: <4344375E.9090609@cs.vu.nl>
Natasha,

Here is finally my reaction to your August response [1]. Only the last 
point needs some more consideration from my point of view.

Sorry to have kept you waiting.

Guus

[1] http://lists.w3.org/Archives/Public/public-swbp-wg/2005Aug/0023.html

> From: Natasha Noy <noy@SMI.Stanford.EDU>
> Date: Thu, 11 Aug 2005 14:34:59 -0700
> Cc: "Ralph R. Swick" <swick@w3.org>, swbp <public-swbp-wg@w3.org>
> To: Guus Schreiber <schreiber@cs.vu.nl>
> 
> 
> Guus,
> 
> Thank you very much for your comments. Some replies/discussion below.  
> I think some points require a bit more discussion (there may be some
> misunderstanding).
> 
>> [[
>>   Issue 1: If property instances can link only two individuals, how do
>>   we deal with cases where we need to describe the instances of
>>   relations, such as its certainty, strength, etc?
>>
>>   Issue 2: If instances of properties can link only two individuals,
>>   how do we represent relations among more than two individuals?
>>   ("n-ary relations")
>>
>>   Issue 3: If properties can link only two individuals, how do we
>>   represent relations in which one of the participants is an ordered
>>   list of individuals rather than a single individual?
>> ]]
>>
>> One could say this is not really an n-ary relation problem, but the
>> "how to make statements about statements" problem, i.e., an
>> alternative to RDF reification. I propose to make this explicit in
>> the text, and move the issue to be the second issue.
> 
> I think there are two issues here. One is what do people mean by the  
> term "n-ary relations". Most of the times I've seen this come up,  
> people were referring to either the first or the second of the issues  
> above. In KR, I think we all agree that the term we think about is  
> "reification", but that has been taken from us by RDF to mean  
> something different. Second, on RDF reification: I am not sure if you  
> are suggesting that the same issue is addressed by RDF reification. I  
> don't believe it is. In RDF, reification is really statements about  
> statements, where it is not even implied that the latter statement
> holds. Here, we are really putting additional information on the  
> relationship that does hold (at least we are trying to make that  
> assertion)

OK, I accept your point.
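
For the record, the distinction can be sketched with plain Python tuples
standing in for triples. This is only an illustration under assumptions:
all ex: names (including ex:stmt1 and ex:Diagnosis_Relation_1) are
made up for this sketch, and only the rdf: terms are the standard
reification vocabulary.

```python
# Triples modelled as plain (subject, predicate, object) tuples.
# All ex: names are illustrative assumptions; the rdf: terms are the
# standard RDF reification vocabulary.

# RDF reification: triples that describe a statement.
reified = {
    ("ex:stmt1", "rdf:type", "rdf:Statement"),
    ("ex:stmt1", "rdf:subject", "ex:Christine"),
    ("ex:stmt1", "rdf:predicate", "ex:hasDiagnosis"),
    ("ex:stmt1", "rdf:object", "ex:BreastTumor"),
    ("ex:stmt1", "ex:probability", "ex:High"),
}

# N-ary pattern: an individual stands for the relation instance itself.
nary = {
    ("ex:Christine", "ex:has_diagnosis", "ex:Diagnosis_Relation_1"),
    ("ex:Diagnosis_Relation_1", "rdf:type", "ex:Diagnosis_Relation"),
    ("ex:Diagnosis_Relation_1", "ex:diagnosis_value", "ex:BreastTumor"),
    ("ex:Diagnosis_Relation_1", "ex:diagnosis_probability", "ex:High"),
}

# The statement that the reified graph talks about.
described = ("ex:Christine", "ex:hasDiagnosis", "ex:BreastTumor")

# The reified graph describes the statement without asserting it.
statement_asserted = described in reified

# In the n-ary graph Christine is directly linked to her diagnosis.
christine_links = {o for (s, p, o) in nary if s == "ex:Christine"}
```

In the first graph nothing is asserted about Christine herself; in the
second graph the link from Christine to the diagnosis individual is an
ordinary asserted triple.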

> 
>> Vocabulary (issue 1 & 2): some readers might not grasp "property
>> instances" directly. Suggest to either add in parentheses "cf. tuples" or
>> drop "instances" (as done in the description of issue 3).
> 
> I think we've put "property instances" to distinguish from properties  
> and in the effort to have consistent terminology throughout the  
> document. Issue 3 should also refer to "property instances". We can  
> add "cf. tuples" to clarify the issue.
> 
>> [[
>>   Use case examples
>> ]]
>>
>> Again, example 3 is the prototypical n-ary relation, so maybe this
>> should be the first example. The point is that for people from
>> relational databases the first two examples are not "real" n-ary
>> relations: e.g. in example 1 the probability value is functionally
>> dependent on the person and the disease. In example 3 there is no such
>> dependency (the primary key is the combination of all three
>> arguments). So, reification would work with examples 1 and 2, but not
>> with example 3 (because the instances are not unique).
> 
> Again, if you have a different term to refer to all of these (that  
> doesn't use the word "reification") instead of "n-ary relation", we  
> can try to change. What the note is really about are these types of  
> relations, and there is probably not a single term that would be  
> accepted in all communities. Again, if we could say "reified  
> relations", we probably would.
> 
> I also like Mike's suggestion on how to summarize these three cases  
> -- we'll weave that in.

OK. Happy with the current version.
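
The relational-database argument quoted above (functional dependency in
example 1, none in example 3) can be checked mechanically. A minimal
sketch, assuming invented rows: the helper name, the column names, and
all data values below are hypothetical and not taken from the draft.

```python
def functionally_determines(rows, key_columns, value_column):
    """Return True if each combination of key values maps to one value."""
    seen = {}
    for row in rows:
        key = tuple(row[c] for c in key_columns)
        value = row[value_column]
        # setdefault stores the first value seen for this key.
        if seen.setdefault(key, value) != value:
            return False
    return True


# Example 1 style: a probability attached to a (person, disease) pair.
diagnoses = [
    {"person": "Christine", "disease": "BreastTumor", "probability": "high"},
    {"person": "Steve", "disease": "Flu", "probability": "low"},
]

# Example 3 style: no argument is determined by the other two.
purchases = [
    {"buyer": "John", "seller": "books.example.com", "item": "book_1"},
    {"buyer": "John", "seller": "books.example.com", "item": "book_2"},
    {"buyer": "John", "seller": "shop.example.org", "item": "book_1"},
    {"buyer": "Mary", "seller": "books.example.com", "item": "book_1"},
]

probability_depends = functionally_determines(
    diagnoses, ["person", "disease"], "probability"
)
item_depends = functionally_determines(purchases, ["buyer", "seller"], "item")
seller_depends = functionally_determines(purchases, ["buyer", "item"], "seller")
buyer_depends = functionally_determines(purchases, ["seller", "item"], "buyer")
```

With this data only the diagnosis table has a dependency; in the purchase
table the key has to be the combination of all three columns.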

> 
>>
>> [[
>>   4. United Airlines flight 3177 visits the following airports: LAX,
>>   DFW, and JFK. There is a relation between the individual flight and
>>   the three cities that it visits, LAX, DFW, JFK. Note that the order
>>   of the airports is important and indicates the order in which the
>>   flight visits these airports.
>> ]]
>>
>> UML users may not recognize this as an n-ary relation. UML has the
>> notion of "ordered" associations, which would handle this
>> situation. It is in fact a binary relation where one of the arguments
>> is not a simple individual but an ordered list of individuals. I
>> suggest to add a UML note.
> 
> I am not sure how to phrase this. Could you perhaps provide some  
> specific text that you would like to see there?

I will come back to this under the last point. Maybe the UML note is not 
required.

> 
>>
>> Reflecting on this, we might just want to say:
>> - issue 2 / example 3 describe the "real" n-ary relation issue
>> - issue 1 and 3 / example 1+2 and 4 describe related but different
>> problems that can be modeled using the same patterns.
>> But maybe I'm making it too complicated now.
> 
> perhaps? (I really think these things would mean different things to  
> different communities and we won't be able to satisfy them all; we  
> just need to explain what *we* mean by these terms, and hopefully the  
> examples do that. If not, we need to do a better job)

OK, leave it as it is.

> 
>> [[
>> Sec. Representation patterns
>>
>>   ... Examples 1, 2, and 3 above correspond to this pattern. For
>>   instance, in the example 1 the instance of a new class
>>   Diagnosis_Relation would represent the fact that Christine has been
>>   diagnosed with a breast tumor with high probability.
>> ]]
>>
>> "correspond to" is too strong. Suggest to rephrase as: "Examples 1, 2,
>> and 3 above can be modeled with this pattern.".
> 
> sure
> 
>> Maybe it is a good place here to indicate that example 1 and 2
>> could alternatively have been represented with RDF reification.
> 
> I don't think this is true though. If we represented this with RDF  
> reification, we wouldn't actually be making a statement that Steve  
> has high temperature for example; rather about the statement about  
> his temperature -- that's a crucial difference.

I'm happy to leave it as it is.

> 
>> I suggest to include example 3 here, also because a name such
>> as "Purchase" would seem to come less out of the blue than
>> "Diagnosis_Relation".
> 
> sure
> 
>> I suggest to include a UML note, indicating that pattern 1 is
>> close to what is called an "association class" in UML.
> 
> Again, I would appreciate some specific text since whatever I say  
> would end up being imprecise since I know very little about UML.

Proposed text:

[[
   UML Note: The "Purchase" use case would in UML typically be modelled
   as an association class, with the object properties represented as
   attributes of the association class.
]]
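
For readers without a UML background, a rough Python analogue of such an
association class is sketched below. It is an illustration only: the
field names and the sample values are assumptions made for this sketch,
not wording proposed for the note.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Person:
    name: str


@dataclass(frozen=True)
class Company:
    name: str


@dataclass(frozen=True)
class Purchase:
    """One purchase; plays the role of the association class."""

    buyer: Person
    seller: Company
    item: str
    amount_usd: int
    purpose: str


john = Person("John")
shop = Company("books.example.com")

# Each Purchase object is one instance of the relation as a whole.
purchase_1 = Purchase(john, shop, "Lenny the Lion", 15, "birthday gift")
purchase_2 = Purchase(john, shop, "another book", 20, "own use")
```

The same buyer and seller can take part in several purchases, each with
its own attribute values, which is what the extra class buys us.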

> 
>> [[
>>   Pattern 1
>> ]]
>>
>> In line with the previous comments, I suggest to change the order of
>> the use cases. The current use case 3 should be the first one.
>>
>> [[
>>   Use Case 1: additional attributes describing a relation
>> ]]
>>
>> I've tried to explain the modeling solution in my
>> "ontology-engineering" class and observed the following:
>>
>> - it requires "breast tumor" to be treated as an instance, where it
>> will usually be a class (one could see it as a use case for the
>> "classes as values" note).
>>
>>   I suggest to consider using an instance of BreastTumor as the
>>   value. This also has the advantages described in the value-partition
>>   note (easy to add later the statement that MyBreastTumor is an
>>   instance of a subclass of "BreastTumor").
> 
> yes, this has come up before. I think your solution of having this as  
> an instance would solve the problem
> 
>> - there are two other solutions which are worth discussing as
>> alternatives:
>>
>>   1. Person -> hasDiagnosis -> Disease -> hasProbability -> Number
>>   This would work if the instance of disease is not "BreastTumor" but
>>   a unique instance of BreastTumor.  By the way, I do not think this
>>   solution would work in practice, as a statement about a diagnosis
>>   with a certain probability is always time dependent (which we cannot
>>   easily add).
> 
> If we do use the instance as you suggested above, how is this  
> solution different? You simply call Disease what the note calls  
> "Diagnosis_Relation", no?

What I mean is: if you introduce an instance as standing for the breast 
tumor *of Christine*, there is no need for the pattern. Chris and I 
discussed the issue here at K-CAP. He suggested that this was similar to 
"slicing" and proposed to leave this to a further note. I'm happy with 
that.
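
A small sketch of why the per-person instance makes the difference,
again with plain tuples as triples. The second patient ("Anna"), the
individual names, and the helper function are hypothetical and serve
only to show the effect.

```python
def probabilities_for(person, triples):
    """Collect the probabilities reachable from a person's diagnoses."""
    diseases = {
        o for (s, p, o) in triples if s == person and p == "hasDiagnosis"
    }
    return {
        o for (s, p, o) in triples if s in diseases and p == "hasProbability"
    }


# One shared individual: the probabilities of both patients get mixed up.
shared = {
    ("Christine", "hasDiagnosis", "BreastTumor"),
    ("Anna", "hasDiagnosis", "BreastTumor"),
    ("BreastTumor", "hasProbability", "high"),
    ("BreastTumor", "hasProbability", "low"),
}

# One individual per patient: each probability stays with its diagnosis.
unique = {
    ("Christine", "hasDiagnosis", "BreastTumor_Christine"),
    ("Anna", "hasDiagnosis", "BreastTumor_Anna"),
    ("BreastTumor_Christine", "hasProbability", "high"),
    ("BreastTumor_Anna", "hasProbability", "low"),
}

mixed = probabilities_for("Christine", shared)
separate = probabilities_for("Christine", unique)
```

With the shared individual one can no longer tell which probability
belongs to Christine; with her own instance the chain
Person -> hasDiagnosis -> Disease -> hasProbability is unambiguous.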

> 
>>   2. Representing Diagnosis in a similar way as Purchase.
>>   My students found this solution easier to understand (for whatever
>>   it is worth). They found the juxtaposition of BreastTumor and
>>   Probability weird, as the second is clearly dependent on the
>>   first. The only real difference of course is the direction of the
>>   hasDiagnosis property.
> 
> Why did they find it weird? That's exactly why we need reification  
> here -- to link the two. I don't see what the problem is.

OK to leave it as it is.

> 
>> [[
>>   Use Case 2: different aspects of the same relation
>> ]]
>>
>> This use case is a better example than use case 1 of how to use the
>> pattern for avoiding the use of RDF reification.
> 
> I am not sure I understand (perhaps with the comments above on RDF  
> reification, this is no longer valid?)

OK to leave it as it is.

> 
>>
>> A drastic solution could be to drop use case 1 altogether and keep
>> this one in. Adding time information to this example would make it more
>> realistic.
> 
> I really don't want to add time -- this would open such a huge can of  
> worms. It would be very hard to explain to people that this has  
> nothing to do with time actually. I would really like to stay out of it.
> 
> I think use case 1 does present a common way that people come across  
> this problem, and lack of n-ary relations in OWL. Thus, even though  
> it is logically equivalent to other cases, I would prefer to keep it in.

OK to leave it as it is.

> 
>> "TemperatureObservation" would be a good name for this relation. I
>> think this use case is close to the Observation pattern in Fowler's
>> book on Analysis Patterns (I tried to verify this, but I cannot find
>> my copy of the book).
> 
> yes, that's better
> 
>> [[
>>   Use Case 3: N-ary relation with no distinguished participant
>> ]]
>>
>> I think it is worthwhile to point out that in use case 3 the
>> domain actually provides a natural name for the relation as a whole,
>> namely "Purchase". There are many of these nouns that represent static
>> aspects of an activity and thus are candidates for this pattern:
>> "transaction", "enrollment", "subscription". This makes it different
>> from use cases 1 and 2 (but see also my remarks there).
> 
> agreed
> 
>> [[
>>   Pattern 2: Using lists for arguments in a relation
>> ]]
>>
>> Alternatives which avoid the use of RDF lists would be worth
>> mentioning:
>>
>> 1. A Flight is linked to a number of FlightPorts. Each FlightPort is a
>> class, representing the relation between a port and its sequence number
>> in the Flight. I find this rather ugly, but it is in a sense close to
>> the way use case 1 is represented.
>>
>> 2. A Flight is linked to a number of FlightMovement instances. Each
>> FlightMovement represents a relation between from/to
>> airports. This would probably be my preferred solution.
> 
> Ok, I'll try to put them in. Would it be OK simply to mention this or
> do you think it needs a fleshed out example, with code, diagrams, etc.?

Chris and I discussed this here as well. We think it requires a bit more
discussion to get the list pattern right, e.g., at the next OEP telecon.
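
As input for that discussion, here is a minimal sketch of the two
list-free alternatives quoted above, applied to the LAX, DFW, JFK
flight. The dictionary keys and function names are assumptions for this
sketch, and the FlightMovement reconstruction assumes a simple route in
which no airport is visited twice.

```python
airports = ["LAX", "DFW", "JFK"]

# Alternative 1: each FlightPort pairs an airport with a sequence number.
flight_ports = [
    {"airport": a, "sequence": i} for i, a in enumerate(airports, start=1)
]

# Alternative 2: each FlightMovement links a from-airport to a to-airport.
flight_movements = [
    {"from": a, "to": b} for a, b in zip(airports, airports[1:])
]


def order_from_ports(ports):
    """Recover the visiting order by sorting on the sequence number."""
    return [p["airport"] for p in sorted(ports, key=lambda p: p["sequence"])]


def order_from_movements(movements):
    """Recover the visiting order by chaining from/to pairs.

    Assumes a simple route without repeated airports.
    """
    next_stop = {m["from"]: m["to"] for m in movements}
    destinations = set(next_stop.values())
    # The first airport is the only departure that is never a destination.
    start = next(a for a in next_stop if a not in destinations)
    route = [start]
    while route[-1] in next_stop:
        route.append(next_stop[route[-1]])
    return route


# Neither representation relies on the storage order of its parts.
ports_order = order_from_ports(list(reversed(flight_ports)))
movements_order = order_from_movements(list(reversed(flight_movements)))
```

Both encodings carry the order without an RDF list: the first through an
explicit number, the second through the chaining of movements.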

> 
> Thanks again for your comments!

Thanks a lot to you for all your work and the patience.

Guus
> 
> Natasha
> 
> Received on Thursday, 11 August 2005 21:34:59 GMT
> 

-- 
Free University Amsterdam, Computer Science
De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
Tel: +31 20 598 7739/7718; e-mail: schreiber@cs.vu.nl
Home page: http://www.cs.vu.nl/~guus/
Received on Wednesday, 5 October 2005 20:28:33 UTC