[OEP] - N-ary Relations Note comments from Uschold, Michael F on 2005-05-23 (public-swbp-wg@w3.org from May 2005)

From: Uschold, Michael F <michael.f.uschold@boeing.com>
Date: Mon, 23 May 2005 10:58:02 -0700
To: <public-swbp-wg@w3.org>
Cc: <rector@cs.man.ac.uk>, <noy@SMI.Stanford.EDU>
Message-ID: <4301AFA5A72736428DA388B73676A3810B4449@XCH-NW-6V1.nw.nos.boeing.com>

OEP - N-ary Relations
1.1 GENERAL
Excellent, good examples, clearly described.

Good job on being clear and consistent with terminology about relations.

Frequently an abstract comment comes before the example. Often I could not understand the abstract remark till I read the example. I suggest putting the examples first in most cases. Then, for readers who may be interested, state the general principle being illustrated, in more abstract terms.

I could not find any reason to choose Pattern 1 over Pattern 2 in terms of really practical consequences. The only one seemed to be that one may be intuitively easier to understand than the other. This is important, but much weaker than other consequences we have discussed in other notes.

More importantly, patterns 1 and 2 are not only logically equivalent; there are also no clear guidelines for applying one instead of the other. It is more a matter of preference. I strongly prefer shifting the emphasis as follows:
* have just one pattern, and two variants - why?
o they are logically equivalent
o one morphs into the other merely by turning around the direction of a relationship arrow (using inverse function makes them identical)
o all of the examples for either P1 or P2 could reasonably be represented using the other pattern, in a way that seems intuitive and natural to SOMEONE.
* clearly state that there are no real practical consequences of choosing one variant instead of another - it is mainly a matter of personal preference, and what seems more intuitively appealing.
* present the guidelines for when to use which variant as being much less firm than comes across now; use language like: if you view it this way, then you may prefer pattern 1, if that way, then P2. The way it reads now sounds more definitive; I think it is much fuzzier than the current draft suggests.
1.2 SPECIFIC section by section comments

ABSTRACT:
Below is a suggested rewording to focus on a requirement and how to meet it, and to avoid general abstractions that may be hard to grasp at the outset.

===
In Semantic Web languages, such as RDF and OWL, a property is a binary relation used to link one individual to a single other individual or value. However, when creating ontologies, the natural and convenient way to represent certain concepts is to use n-ary relations to link an individual to more than just one individual or value. For example, the 'between' relation links an individual to the two individuals that are on either side of it. A 'diagnosis' relation may link an individual to their most likely diagnosis as well as an estimated probability for that diagnosis. However, n-ary properties are not supported in RDF or OWL. This document presents ontology patterns for representing n-ary relations and discusses what users must consider when choosing these patterns.
===

The abstract misses out this sentence, which is put in later on when first introducing pattern 1.

"How do we represent properties of a relation, such as our certainty about it, severity or strength of a relation, relevance of a relation, and so on?"

USE CASE EXAMPLES

These are great, and I love the bit explaining how these situations arise.

I was surprised that you did not use the canonical example of where you need an n-ary relation: between. It is another example of point 3 of how the use cases evolve.

REPRESENTATION PATTERNS
Be consistent about whether and how you distinguish linking individuals (objectProperty) vs. values (dataProperty). In the abstract you make the distinction, here you do not. Specifically, replace "links two individuals" with "links an individual to another individual or value" in the sentence:
"Each instance of a property links two individuals as shown below."

My first thought is that it is best to make the distinction throughout the paper. Alternatively, introduce some term (e.g. 'participant') that is neutral between individual and value, and then distinguish them when you need to. That is probably going to be easier to read.
--

[minor elaboration for clarity] Replace
"There might be other individuals 'D', 'E', and 'F', but we will assume a single additional individual for simplicity."
with
"For simplicity, we will illustrate most of our patterns assuming a single additional individual; more can be handled in exactly the same way."
---

Replace: "re-represent" with "represent". I don't think the re- helps.
---

Oh, I see you are maybe using "re-represent" instead of 'reified'. Even so, I think you can just say 'represent' and it should be clear that it is an alternative representation. 'Reify' means to: make into an object/thing, which is accurate, although it is a geeky word that readers may not relate to. Every pattern in every note 're-represents' things over and over in different patterns. Thus, the term 're-represent' is likely to cause a lot of confusion if you use it to mean a synonym of reify, and not its general meaning of representing again.
---

The term "instance of the relation" should be defined in a [nascent] glossary.
---

In: ['A', 'B', and 'C'] the quotes around b are not uniform.
---

Add some explanation as follows; Replace:

===
One common solution to this problem is to re-represent the relation as a class rather than a property. Individual instances of such classes correspond to instances of the relation. Additional properties provide binary links to each argument of the relation.
===
with
===
One common solution to this problem is to represent the n-ary relation as a class rather than as a property. Individual instances of such classes correspond to instances of the relation. For example, using this technique for the first use case, an instance of the [relation] class would represent the fact that Christine has been diagnosed with a breast tumor with high probability. For each of the n arguments of the relation, a new property is created to link the relation instance to the individual or value for each argument. For example, there would be three such properties in this case: has_diagnosis, diagnosis_value, and diagnosis_probability to link Christine, Breast_tumor and high to the diagnosis relation instance.
===
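
To make the suggested wording concrete, here is a minimal sketch with triples modeled as plain Python tuples (no RDF library is assumed). The property names has_diagnosis, diagnosis_value and diagnosis_probability come from the text above; the node name Diagnosis_Relation_1, the class name Diagnosis_Relation and the helper arguments_of are illustrative assumptions, not taken from the draft.

```python
# Triples are (subject, property, object) tuples; no RDF library is assumed.
# "Diagnosis_Relation_1" is a hypothetical name for the relation instance.
triples = {
    ("Christine", "has_diagnosis", "Diagnosis_Relation_1"),
    ("Diagnosis_Relation_1", "rdf:type", "Diagnosis_Relation"),
    ("Diagnosis_Relation_1", "diagnosis_value", "Breast_tumor"),
    ("Diagnosis_Relation_1", "diagnosis_probability", "high"),
}


def arguments_of(relation_instance, graph):
    """Collect every participant linked to or from the relation instance."""
    args = {}
    for s, p, o in graph:
        if p == "rdf:type":
            continue  # typing is not an argument of the relation
        if s == relation_instance:
            args[p] = o
        elif o == relation_instance:
            args[p] = s
    return args


diagnosis = arguments_of("Diagnosis_Relation_1", triples)
print(diagnosis)
```

The three binary properties together carry what a single ternary diagnosis(Christine, Breast_tumor, high) would have said.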

The comment about reified relations should be in a new paragraph.
---

The sentence with "converge have" does not parse.
---

For a variety of reasons, replace:
==
Depending on the relation among A, B, and C, and how the situation arose, we distinguish three patterns to represent n-ary relations in RDF and OWL: The first two patterns (pattern 1 <file:///C:\UscholdM\aa-PROJECTS\2004%20Projects\SWBPD\OEP\NARY-RELATIONS\n-aryRelations-2nd-WD.html> and pattern 2 <file:///C:\UscholdM\aa-PROJECTS\2004%20Projects\SWBPD\OEP\NARY-RELATIONS\n-aryRelations-2nd-WD.html> ) introduce a new class for an n-ary relation. These two patterns converge have the same representation from the logical point of view, although they are likely to arise in different ways and represent different modeling patterns. In these patterns the labeling of the arguments to the n-ary relation is important, but their order is not. The third pattern(pattern 3 <file:///C:\UscholdM\aa-PROJECTS\2004%20Projects\SWBPD\OEP\NARY-RELATIONS\n-aryRelations-2nd-WD.html> ) uses a list to encapsulate several arguments. This pattern is used when one (or more) of the arguments is an ordered list and the ordering is fundamentally important in the model.
==

with
==
We distinguish three patterns for representing n-ary relations in RDF and OWL, giving examples for each. We pay particular attention to the modeling situations that arise, and how they naturally map onto the different patterns.

The first pattern is useful when we wish to represent properties of a relation, such as our certainty about it, severity or strength of a relation, relevance of a relation, and so on, and where the order of the arguments is not important. Pattern 2 is useful for representing many aspects of a single event (e.g. purchasing a book). Although these two patterns are quite distinct, they are also logically equivalent. Both introduce a new class for an n-ary relation. Thus, subjective preferences about what is more or less intuitive may dictate the choice of pattern, rather than significant practical consequences.

The third pattern (pattern 3 <file:///C:\UscholdM\aa-PROJECTS\2004%20Projects\SWBPD\OEP\NARY-RELATIONS\n-aryRelations-2nd-WD.html> ) is distinguished by the fact that the order of the arguments of the n-ary relation is important in the model. This pattern uses a list to represent the argument ordering.
==
NB put back the links to pattern 1 and pattern 2 somewhere, they were lost in the cut/paste.
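
For pattern 3, a small sketch of how a list can carry the argument ordering, again with triples as plain Python tuples. rdf:first, rdf:rest and rdf:nil are the standard RDF collection vocabulary; the flight example, the property flight_sequence and both helper functions are assumptions made only for illustration.

```python
def to_rdf_list(items, prefix="_:n"):
    """Encode an ordered sequence as rdf:first / rdf:rest triples."""
    if not items:
        return "rdf:nil", set()
    nodes = [f"{prefix}{i}" for i in range(len(items))]
    triples = set()
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.add((nodes[i], "rdf:first", item))
        triples.add((nodes[i], "rdf:rest", rest))
    return nodes[0], triples


def from_rdf_list(head, triples):
    """Walk the rdf:rest chain to recover the items in order."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    items = []
    while head != "rdf:nil":
        items.append(first[head])
        head = rest[head]
    return items


# Hypothetical example: a flight that visits three airports in order.
head, list_triples = to_rdf_list(["LAX", "DFW", "JFK"])
graph = list_triples | {("Flight_1", "flight_sequence", head)}
recovered = from_rdf_list(head, graph)
```

The set of triples is unordered, yet the order of the arguments is recoverable from the chain, which is the point of this pattern.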

Can you mix patterns 1 and 2? Why or why not?

On the Differences or Equivalence of patterns 1 and 2:

If they are equivalent, why are they two different patterns?

In fact, the patterns are exactly isomorphic, by merely using the inverse function of the link from the relationship instance to the 'special' entity. For example, if you use the relation: "patient_that_was_diagnosed" instead of "diagnosis_of" then you merely turn around the arrow in the diagram, move the boxes with Christine and Breast_tumor to below the relationship instance, then PRESTO: you have pattern two.
Similar for the purchase relation. Just use the inverse of buyer, which turns the arrow around and PRESTO: you now have pattern 1. John is the special agent. The price etc are characteristics of the event.
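
The "turn the arrow around" step can be sketched as follows, with triples as plain Python tuples. This assumes has_diagnosis and patient_that_was_diagnosed are declared inverses of each other; the node name Diagnosis_1 and the flip helper are hypothetical.

```python
# Hypothetical inverse pair: has_diagnosis <-> patient_that_was_diagnosed.
INVERSE = {
    "has_diagnosis": "patient_that_was_diagnosed",
    "patient_that_was_diagnosed": "has_diagnosis",
}


def flip(graph, prop):
    """Replace every (s, prop, o) triple by (o, inverse(prop), s)."""
    return {
        (o, INVERSE[p], s) if p == prop else (s, p, o)
        for s, p, o in graph
    }


pattern_1 = {
    ("Christine", "has_diagnosis", "Diagnosis_1"),
    ("Diagnosis_1", "diagnosis_value", "Breast_tumor"),
    ("Diagnosis_1", "diagnosis_probability", "high"),
}

pattern_2 = flip(pattern_1, "has_diagnosis")
# In pattern_2 every link starts at the relation instance.
all_from_instance = all(s == "Diagnosis_1" for s, _, _ in pattern_2)
# Flipping back recovers the original graph, so no information is lost.
round_trip = flip(pattern_2, "patient_that_was_diagnosed")
```

Only one triple changes between the two graphs, which is why I see these as variants of one pattern rather than two patterns.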

In the example it is arguably perfectly reasonable to say that there is a diagnosis event which has three aspects: who the patient was, what the diagnosis is and the probability. This is pattern two.

I can't think of a good way to describe any REAL differences between these patterns, every time I think I have a crisp definition, it turns to mush. It is all gray. At one point I thought the difference might be:
Use P1 when you can identify one single entity with a single relationship that has different aspects.
Use P2 when there are many different relations, rather than a single relation.
---
But I don't think this quite gets it either.

There is a risk of confusing the readers by acting as if these are two significantly different patterns, when if one pushes just a little bit, you can easily transform one into the other. One person might see it more naturally in one way, another would see it the other way.
To make it worse, we SAY they are logically equivalent, and we also don't give any significant practical consequences of either pattern. The only difference is what seems more intuitively appealing. And because this can vary from person to person, it is a very minimal consequence indeed.

I don't think they are different enough to warrant a different pattern.
It makes sense to speak of a [very minor] variant which entails using the inverse function, turning the arrow around. Then, one can make passing remarks about tendencies which may lead one to view one variant as more natural than another, but this should be considerably weakened.

Also, you assert that they are logically equivalent, but do not explain it. It is far from obvious on first look. I had a long conversation with my roommate, and we could not agree for a while - eventually when I saw that it was just a matter of using an inverse function, then it seemed more likely to be logically equivalent.

SECTION: Introducing a new class for a relation
Format: several places here use underline for emphasis, others use italics. Be consistent.

I found this paragraph hard to follow. I could not relate the opening discussion to examples 1 and 2, and it will be much harder for those unfamiliar with OWL. I recommend that the example come first, then the abstract characterization. Abstract first is often good style for an academic paper; for best practice, I think examples first will be better. Indeed, the abstract summary is not even necessary, though it is good to have for those so inclined.
===
In the first case (pattern 1 <file:///C:\UscholdM\aa-PROJECTS\2004%20Projects\SWBPD\OEP\NARY-RELATIONS\n-aryRelations-2nd-WD.html> ), one of the individuals in the relation (say, A) is distinguished from others in that it is the subject of the relation. Just like in the case of a binary relation, where P was a property of A with value B, here the instance of the relation itself is a property value of A. This value is a complex object in itself, relating several values and individuals. Examples 1 and 2 from the list above fall under this category: Christine and Steve in these examples are individuals that the properties [WHICH PROPERTIES?] are describing. These examples commonly arise in the course of the evolution of an ontology when we discover that a simple binary relation is insufficient to represent the complexity required.
===

I disagree that: "John, books.example.com, and the Lenny_The_Lion book seem to be equally important in this purchasing relation".

I think John, as primary actor initiating the purchase, has a more key role. Yet, it is true that he is not the subject of the relation either. In the diagnosis example, arguably, there is a relationship between the subject and the diagnosis. It is much less natural, in English, to think of a purchase as being a 'relationship' between John and a book or a bookseller that has various other aspects like price. It is an event.

This sentence is too long, split it up:
===
If we need to represent an additional attribute describing a relation instance (example 1, Christine has breast tumor with high probability) or represent a relation instance that has different components (example 2, Steve has temperature, which is high, but falling), we can create an individual that includes the relation instance itself, as well as the additional information about this instance:
===
Also, replace the word 'includes' above with 'represents' so it is consistent with the wording for pattern two, which is:
"we create an individual to represent the relation instance with links to all participants"

Note the definitions you give for patterns 1 and 2
P1: "create an individual that includes the relation instance itself, as well as the additional information about this instance"
P2: "create an individual to represent the relation instance with links to all participants"

Changing the words just slightly, leaving the meaning intact, we get:
P1: create an individual that represents the relation instance itself, with links to additional information about this instance.
P2: create an individual that represents the relation instance itself, with links to all participants

So is the crucial difference between P1 and P2 that in one case there are participants and in the other there is 'additional information'? Are you viewing 'falling' as less of a participant in the relationship than you do $15? They seem about the same to me.

If so, we might as well use consistent wording, in which case the definitions for P1 and P2 are virtually identical. e.g.

I think that the difference you want is something like this:
P1: create an individual that represents the relation instance itself, with a link from the subject of the relation to this instance, and with links from the instance to all participants. [i.e. that represent additional information about this instance].
P2: create an individual that represents the relation instance itself, with links to all participants.

This reflects the only real difference: one arrow is reversed in the diagram.

Yet, there is some other discussion about independence/dependence of entities, which is also important - though it does not seem to apply consistently for only one pattern, hence cannot be the distinguisher.
===

In the description of the 2nd example of pattern 1, I suggest rewording to shift emphasis a bit.

Re this sentence:

"many will view the relationship we were representing as in a fact still a binary relation between the individual Christine and the diagnosis breast_tumor that has a probability associated with it."

Instead of saying "many will view it this way..." vaguely implying that you [the reader] probably should too, say instead: If you view it this way, then it will be natural for you to use P1, or if you view it that way, then use P2.

The emphasis here, IMHO, should be:
Here is another case which is slightly different, but for which P1 is also quite appropriate. Both of these use cases that use P1 are similar in that... and they are different in that...

Re, this sentence:
"Rather, it is a relation instance relating the individual Steve and the complex object representing different facts about his temperature."

It seems you are saying that, therefore this is the more natural intuitive way to do it. Yet, creating a special class to represent the instance, is itself a hack. Might be worth mentioning that too. Are you saying one way is a more natural hack than another?

Naming overloaded:
object and purpose are both classes and relations. Use a different naming convention for the relations, like purpose_of (say).

CONSIDERATIONS when introducing a new class for a relation:

Do these considerations apply both for P1 and for P2? Do they help me decide whether to use P1 or P2?

These seem to me more 'facts of interest' than practical considerations that actually have consequences.

I really like the discussion of independent/dependent objects, but after I'm done, I don't see how it actually matters. How is the reader to make use of this information?

Also, I don't think it is even true, that this will be a clear distinguishing feature of P1 vs. P2. At best, you might say, for P1 it is often this way, and for P2 it is often that way. But then, how is it important? Might it cause even more confusion?

I am completely unable to follow the discussion about inverses. It seems like it might be interesting, but it may need a longer example to explain it adequately.

In the sentence: "A more principled approach to the distinction is that the difference is "ontological" rather than logical." say what 'difference' you are referring to. Also, this wording is a bit awkward.

The terms 'makes sense' and 'makes no sense' don't seem good ones. Say instead 'can exist' or 'cannot exist'?

This wording is a bit awkward and hard to follow:
"One practical consequence of this difference is that we are unlikely to be interested in the inverses to the links to the dependent entities in pattern 1, whereas we are to the independent entities in pattern 2 <file:///C:\UscholdM\aa-PROJECTS\2004%20Projects\SWBPD\OEP\NARY-RELATIONS\n-aryRelations-2nd-WD.html> . "

The discussion of meaningful names is also interesting, but how does it matter? Is it just for the interest of the reader? Is it a recommendation of what they should or should not do? Probably this is a matter of best practice: do NOT introduce names when it could be misleading to do so.

Include reference for this sentence:
"Note that a similar approach is taken when reifying statements in RDF."

In the sentence: "Creating a class to represent an n-ary relation limits the use of many OWL constructs and creates a maintenance problem."
* Why is 'constructs' in italics?

In the discussion that follows this sentence,
* relate the problem to how it "limits the use of OWL"
* do the example first, then do the abstract characterization
* replace: "a lattice" with "an explicit lattice"
* is this a problem that helps decide whether to use P1 or P2, or does it apply the same to both?

The discussion on unintended models is too terse. Also, it does not seem to be the same kind of 'unintended models' that I'm familiar with.

In the phrase: "and object in the model" does the word model have the same meaning as model in 'unintended model'? If so say so, if not try to avoid the ambiguity.

In the sentence: "

If you create a relationship called brother(X,Y) and all you say about it is that it is transitive. Then an unintended model of this is that brother represents the relationship taller_than. I cannot relate the discussion here to this sense of unintended model.
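
The sense of 'unintended model' I have in mind can be sketched in a few lines of Python. The individuals and heights below are made up for illustration; is_transitive is a hypothetical helper that checks the single axiom stated about brother(X,Y).

```python
from itertools import product


def is_transitive(pairs):
    """True if (a, b) and (b, c) in the relation always imply (a, c)."""
    return all(
        (a, d) in pairs
        for (a, b), (c, d) in product(pairs, repeat=2)
        if b == c
    )


# Hypothetical extension: heights in cm, read as taller_than.
heights = {"Ann": 180, "Bob": 170, "Cy": 160}
taller_than = {
    (x, y) for x in heights for y in heights if heights[x] > heights[y]
}

# The only axiom stated about brother(X, Y) is transitivity, and this
# interpretation satisfies it, so it is a model the author never intended.
satisfies_axiom = is_transitive(taller_than)
```

Nothing in the single axiom rules out reading brother as taller_than; more axioms would be needed to exclude it.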

Breast is misspelled, missing an s.

It seems an odd example, why on earth would one assert the same triple over and over???

I don't get why it matters if there are new sets of triples? Will I get wrong inferences? If so, give an example.

Suggest: Replace 'should' below with 'must' or 'will need to' :
"Each of the properties participating in the n-ary relation should have its own inverse property"
--

Figure with red & black arrows has no number.
Also the relation called 'object' should probably have a name like 'has_object' to remove meaning ambiguity.

'agent' should be 'buyer'
'recipient' should also be 'buyer'
--

I found this definition hard to understand, English would help.

:Person
    a owl:Class ;
    rdfs:subClassOf
        [ a owl:Restriction ;
          owl:onProperty :is_buyer_for ;
          owl:allValuesFrom :Purchase
        ] .
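
In English, the definition above reads roughly: for every Person, anything it is linked to by is_buyer_for is a Purchase (and a Person need not have any such link at all). The Python sketch below illustrates that reading as a closed-world check over tuple triples. This is an assumption made for illustration only: an OWL reasoner would instead infer that the value is a Purchase. The helper name and the sample data are hypothetical.

```python
def violations_all_values_from(graph, cls, prop, filler):
    """Return (subject, value) pairs where a member of cls has a prop value
    that is not typed as filler. Closed-world check, for illustration only;
    an OWL reasoner would instead infer that the value is a filler."""
    members = {s for s, p, o in graph if p == "rdf:type" and o == cls}
    fillers = {s for s, p, o in graph if p == "rdf:type" and o == filler}
    return {
        (s, o)
        for s, p, o in graph
        if p == prop and s in members and o not in fillers
    }


# Hypothetical data; only Person, is_buyer_for and Purchase are from the text.
graph = {
    ("John", "rdf:type", "Person"),
    ("Mary", "rdf:type", "Person"),  # no purchases: still allowed
    ("Purchase_1", "rdf:type", "Purchase"),
    ("John", "is_buyer_for", "Purchase_1"),
}
ok = violations_all_values_from(graph, "Person", "is_buyer_for", "Purchase")
bad = violations_all_values_from(
    graph | {("John", "is_buyer_for", "Lenny_The_Lion")},
    "Person", "is_buyer_for", "Purchase",
)
```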
--

Pattern 3

'participants' should be singular.
"In cases where all but one participants in a relation"

RDF occurs in a few places where I think you mean RDF Schema.

There should be N3 code here, just like in all other examples.
--

"The down-side of this approach".
"in general is a bad idea"
I like such value judgments, but will others disagree?
In any event, clearly state the consequences of the choice.
Also, in this case, the bad idea is not so much bad, as that it will not serve the desired purpose.
--

The paragraph "N-ary relations and reification in RDF" was unclear to me, an example would help.
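
As one possible reading of what such an example could look like, here is a sketch of RDF reification with triples as Python tuples. rdf:Statement, rdf:subject, rdf:predicate and rdf:object are the standard reification vocabulary; the direct use of has_diagnosis as a binary property, the probability property and the node _:stmt1 are assumptions for illustration.

```python
def reify(triple, statement_id):
    """Describe one triple with the RDF reification vocabulary."""
    s, p, o = triple
    return {
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", s),
        (statement_id, "rdf:predicate", p),
        (statement_id, "rdf:object", o),
    }


# Hypothetical names: has_diagnosis used as a direct binary property,
# plus a "probability" annotation on the statement node "_:stmt1".
fact = ("Christine", "has_diagnosis", "Breast_tumor")
graph = reify(fact, "_:stmt1") | {("_:stmt1", "probability", "high")}

# Reifying a statement describes it but does not assert it.
asserted = fact in graph
```

Here the annotation hangs on a node that talks about a statement, whereas in the n-ary patterns it hangs on a node that stands for the relation instance itself.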
--

This paragraph I did not get on first reading, then I thought I got it, but did not understand why it mattered. Again, an example might help.
"However, formally, we interpret properties as representing relations, i.e. sets of ordered pairs of individuals. Each instance of a relation is just one of those ordered pairs. The "Property" in each triple is fundamentally different from the individuals in the triple. It merely indicates to which relation the ordered pair consisting of the two individuals belongs. We normally name individuals; we do not normally name the ordered pairs."
--
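
Regarding the paragraph on properties as sets of ordered pairs quoted above, this is the kind of example I had in mind. It is only a sketch in Python; the individuals, the Diagnosis_N names and the helper holds are hypothetical.

```python
# A property's extension is a set of ordered pairs; the pairs have no names.
# The individuals and property name below are hypothetical.
has_diagnosis = {("Christine", "Breast_tumor"), ("Steve", "Flu")}

# The pair itself is not a node, so there is nothing to hang a probability on.
# Naming it means introducing a new individual that stands for the pair.
named_instances = {
    f"Diagnosis_{i}": pair
    for i, pair in enumerate(sorted(has_diagnosis), start=1)
}
probability = {"Diagnosis_1": "high"}


def holds(prop_extension, subject, value):
    """The property name only says which set the ordered pair belongs to."""
    return (subject, value) in prop_extension
```

Seen this way, the paragraph matters because it explains why extra information about a relation instance forces us to introduce a new named individual.
--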

'sane' --> 'same'
--

In the paragraph: "Anonymous vs. named instances in these pattern" please explain why you use Bnodes in the prior examples. I don't see the difference that you are getting at as to when to use bnodes and when not.
--

Notes: most of these have useful important germane points. I suggest putting them back in the text, if you can.

"ref to be added" occurs twice, don't forget.
--

Received on Monday, 23 May 2005 17:59:10 UTC