Defining N-ary Relations on the Semantic Web

W3C Working Draft 24 January 2005

This version:: ...
Latest version:: ...
Previous versions:: This is the first public version
Editors:: Natasha Noy, Stanford University; Alan Rector, University of Manchester

Abstract

In Semantic Web languages, such as RDF and OWL, a property is a binary relation: it is used to link two individuals or an individual and a value. How do we represent relations among more than two individuals? How do we represent properties of a relation, such as our certainty about it, severity or strength of a relation, relevance of a relation, and so on? The document presents ontology patterns for representing n-ary relations and discusses what users must consider when choosing these patterns.

Status of this Document

This section describes the status of this document at the time of its publication. Other documents may supersede this document. A list of current W3C publications and the latest revision of this technical report can be found in the W3C technical reports index at http://www.w3.org/TR/.

This document will be a part of a larger document that will provide an introduction and overview of all ontology design patterns produced by the Semantic Web Best Practices and Deployment Working Group.

This document is a W3C Working Draft and is expected to change. The SWBPD WG does not expect this document to become a Recommendation. Rather, after further development, review and refinement, it will be published and maintained as a WG Note.

This document is a Public Working Draft. We encourage public comments. Please send comments to public-swbp-wg@w3.org

Open issues, todo items:

Different namespaces for A-Box and T-Box¹

Publication as a draft does not imply endorsement by the W3C Membership. This document is a draft and may be updated, replaced or made obsolete by other documents at any time. It is inappropriate to cite this document as other than work in progress.

General issues

In Semantic Web languages, such as RDF and OWL, a property is a binary relation: instances of properties link two individuals. Often we refer to the second individual as the "value" or to both individuals as "arguments" [See note on vocabulary].

Issue 1: If property instances can link only two individuals, how do we deal with cases where we need to describe an instance of a relation, for example its certainty or strength?

Issue 2: If instances of properties can link only two individuals, how do we represent relations among more than two individuals? ("n-ary relations")

Issue 3: If properties can link only two individuals, how do we represent relations in which one of the participants is an ordered list of individuals rather than a single individual?

The solutions to the first two problems are closely linked; the third problem is fundamentally different, although its solution can be adapted to address issue 1 in special cases.

Use case examples

Several common use cases fall under the category of n-ary relations. Here are some examples:

Christine has breast tumor with high probability. There is a binary relation between the person Christine and diagnosis breast_tumor and there is a qualitative probability value describing this relation (high).
Steve has temperature, which is high, but falling. The individual Steve has two values for two different aspects of a has_temperature relation: its magnitude is high and its trend is falling.
John buys a "Lenny the Lion" book from books.example.com for $15 as a birthday gift. There is a relation, in which individual John, entity books.example.com and the book Lenny_the_Lion participate. This relation has other components as well such as the purpose (birthday_gift) and the amount ($15).
United Airlines flight 3177 visits the following airports: LAX, DFW, and JFK. There is a relation between the individual flight and the three cities that it visits, LAX, DFW, JFK. Note that the order of the airports is important and indicates the order in which the flight visits these airports.

Another way to think about the use cases is how they might occur in the evolution of an ontology.

We discover that a relation that we thought was binary really needs a further argument - a common origin of use case 1.
We discover that two binary properties always go together and should be represented as one n-ary relation - a common origin of use case 2.
From the beginning, we realise that the relation is really among several things - a common origin of use case 3.
The nature of the relation is such that one or more of the arguments is fundamentally a sequence rather than a single individual - use case 4.

Representation patterns

As we described earlier, in Semantic Web languages, properties are binary relations. Each instance of a property links two individuals as shown below.

Property P relating resources A and B

We would like to have another individual or simple value C to be part of this relation instance:

Property P relating resources A, B, and C

'P' now refers to an instance of a relation among 'A', 'B', and 'C'. (There might be other individuals 'D', 'E', and 'F', but we will assume a single additional individual for simplicity.)

One common solution to this problem is to re-represent the relation as a class rather than a property. Individual instances of such classes correspond to instances of the relation. Additional properties provide binary links to each argument of the relation. Ontologically such classes are often called "reified relations". Reified relations play important roles in many ontologies² (e.g. Ontoclean/DOLCE, Sowa, GALEN). However, note that the RDF and Topic Map communities have each used the word "reify" to mean other things (see the note below). Therefore, to avoid confusion in this document, we usually speak of "re-representation" rather than "reification".
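As a minimal sketch of this re-representation, the fragment below shows one relation instance as an individual with one binary link per argument. The namespace, the class name P_Relation, and the property names arg_a, arg_b, and arg_c are illustrative assumptions and are not defined anywhere in this document.

```
@prefix : <http://example.org/nary#> .   # hypothetical namespace

# The relation P is re-represented as a class (hypothetical name P_Relation).
# One individual of that class stands for one instance of the relation.
_:P_Relation_1
      a       :P_Relation ;
      :arg_a  :A ;      # hypothetical property linking to argument A
      :arg_b  :B ;      # hypothetical property linking to argument B
      :arg_c  :C .      # hypothetical property linking to argument C
```

Each additional argument (D, E, F, and so on) would need one more property of the same kind.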

A second solution is to represent several individuals participating in the relation as a collection or an ordered list.

Depending on the relation among A, B, and C, and how the situation arose, we distinguish three patterns to represent n-ary relations in RDF and OWL. The first two patterns (pattern 1 and pattern 2) introduce a new class for an n-ary relation. These two patterns have the same representation from the logical point of view, although they are likely to arise in different ways and represent different modeling patterns. In these patterns the labelling of the arguments to the n-ary relation is important, but their order is not. The third pattern (pattern 3) uses a list to encapsulate several arguments. This pattern is used when one (or more) of the arguments is an ordered list and the ordering is fundamentally important in the model.

Introducing a new class for a relation

We present two patterns where we create a new class and n new properties to represent an n-ary relation. An instance of the relation linking the n individuals is then an instance of this class. Note that while these two patterns are equivalent logically, they provide two different viewpoints and users may find one or the other more convenient in a given situation (see Considerations when introducing a new class for a relation).

In the first case (pattern 1), one of the individuals in the relation (say, A) is distinguished from others in that it is the subject of the relation. Just like in the case of a binary relation, where P was a property of A with value B, here the instance of the relation itself is a property value of A. This value is a complex object in itself, relating several values and individuals. Examples 1 and 2 from the list above fall under this category: Christine and Steve in these examples are individuals that the properties are describing. These examples commonly arise in the course of the evolution of an ontology when we discover that a simple binary relation is insufficient to represent the complexity required.

In the second case (pattern 2), the n-ary relation represents a network of participants that all play different roles in the relation, but two or more of the participants have equal "importance" in the relation. Example 3 above would usually fall into this category: At least John, books.example.com, and the Lenny_The_Lion book seem to be equally important in this purchasing relation.

Pattern 1:

If we need to represent an additional attribute describing a relation instance (example 1, Christine has breast tumor with high probability) or represent a relation instance that has different components (example 2, Steve has temperature, which is high, but falling), we can create an individual that includes the relation instance itself, as well as the additional information about this instance:

pattern 1

For the example 1 above (Christine has breast tumor with high probability), the individual Christine has a property has_diagnosis that has another object (_:Diagnosis_Relation_1, an instance of the class Diagnosis_Relation) as its value:

Diagnosis example

The individual _:Diagnosis_Relation_1 here represents a single object encapsulating both the diagnosis (breast_tumor) and the probability of the diagnosis (HIGH)³. It contains all the information held in the original 3 arguments: who is being diagnosed, what the diagnosis is, and what the probability is. We use blank nodes in RDF to represent instances of a relation.

:Christine
      a       :Person ;
      :has_diagnosis _:Diagnosis_Relation_1 .

_:Diagnosis_Relation_1
      a       :Diagnosis_Relation ;
      :diagnosis_probability :HIGH;
      :diagnosis_value :Breast_Tumor .

Each of the 3 arguments in the original n-ary relation (who is being diagnosed, what the diagnosis is, and what the probability is) gives rise to a true binary relationship. In this case, there are three: has_diagnosis, diagnosis_value and diagnosis_probability.⁴

The class definitions for the individuals in this pattern look as follows:

Classes in the Diagnosis example

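The additional labels on the links indicate the OWL restrictions on the properties. We define both diagnosis_value and diagnosis_probability as functional properties, so each instance of Diagnosis_Relation has at most one value for each. Together with the someValuesFrom restriction on diagnosis_value, this requires exactly one Disease value. The allValuesFrom restriction on diagnosis_probability constrains the type of the probability value but does not by itself require one to be present.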

In RDFS, which does not have the OWL restrictions or functional properties, the links represent rdfs:range constraints on the properties. For example, the class Diagnosis_Relation is the range of the property has_diagnosis.
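A sketch of what these rdfs:range constraints might look like follows. The namespace and the rdfs:domain statements are assumptions added for illustration. The literal range for diagnosis_probability follows the note that the RDF Schema version represents probability values as strings.

```
@prefix :     <http://example.org/nary#> .   # hypothetical namespace
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:Person             a rdfs:Class .
:Diagnosis_Relation a rdfs:Class .
:Disease            a rdfs:Class .

# The relation class is the range of has_diagnosis.
:has_diagnosis
      a       rdf:Property ;
      rdfs:domain :Person ;               # assumed domain
      rdfs:range  :Diagnosis_Relation .

:diagnosis_value
      a       rdf:Property ;
      rdfs:domain :Diagnosis_Relation ;   # assumed domain
      rdfs:range  :Disease .

# Probability values are plain strings in the RDFS version.
:diagnosis_probability
      a       rdf:Property ;
      rdfs:domain :Diagnosis_Relation ;   # assumed domain
      rdfs:range  rdfs:Literal .
```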

Here is a definition of the class Diagnosis_Relation in OWL, assuming that both properties—diagnosis_value and diagnosis_probability—are defined as functional (we provide full code for the example in OWL and RDFS below):

:Diagnosis_Relation
      a       owl:Class ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:someValuesFrom :Disease ;
                owl:onProperty :diagnosis_value
              ] ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:allValuesFrom :Probability_values ;
                owl:onProperty :diagnosis_probability
              ] .
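The class definition above assumes that both properties are functional. One way to state that assumption explicitly is sketched below. The namespace is hypothetical, and the choice of owl:ObjectProperty reflects that the values in this example (Breast_Tumor, HIGH) are individuals.

```
@prefix :    <http://example.org/nary#> .   # hypothetical namespace
@prefix owl: <http://www.w3.org/2002/07/owl#> .

# A functional property has at most one value per subject.
:diagnosis_value
      a       owl:ObjectProperty , owl:FunctionalProperty .

:diagnosis_probability
      a       owl:ObjectProperty , owl:FunctionalProperty .
```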

In the definition of the Person class (of which the individual Christine is an instance), we specify a property has_diagnosis with the range restriction going to the Diagnosis_Relation class (of which _:Diagnosis_Relation_1 is an instance):

:Person
      a       owl:Class ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:allValuesFrom :Diagnosis_Relation ;
                owl:onProperty :has_diagnosis
              ] .

RDFS code for this example

[RDFS]

OWL code for this example

[N3] [RDF/XML]

Example 2 above (Steve has temperature, which is high, but falling) is a different use case. In the diagnosis example, many will view the relationship we were representing as in fact still a binary relation between the individual Christine and the diagnosis breast_tumor, with a probability associated with it. The relation in the temperature example is between the individual Steve and the object representing different aspects of the temperature he has. In most intended interpretations, this instance of a relation cannot be viewed as an instance of a binary relation with additional attributes attached to it. Rather, it is a relation instance relating the individual Steve and the complex object representing different facts about his temperature. Such cases often come about in the course of the evolution of an ontology when we realize that two relations need to be collapsed. For example, initially, we might have had two properties, has_temperature_level and has_temperature_trend, both relating to people. We might then have realized that these properties are inextricably intertwined because we need to talk about "temperatures that are elevated but falling."

Temperature example for pattern 1

The RDFS and OWL patterns that implement this intuition are, however, the same as in the previous example. A class Person (of which the individual Steve is an instance) has a property has_temperature which has as a range the relation class Temperature_Relation. Instances of the class Temperature_Relation (such as _:Temperature_Relation_1 in the figure) in turn have properties for temperature_value and temperature_trend.
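By analogy with the diagnosis example, the instance data for Steve might be sketched as follows. The namespace is hypothetical, and the value individuals Elevated and Falling are assumed names for the "high" magnitude and "falling" trend.

```
@prefix : <http://example.org/nary#> .   # hypothetical namespace

:Steve
      a       :Person ;
      :has_temperature _:Temperature_Relation_1 .

# One blank node groups both aspects of the temperature.
_:Temperature_Relation_1
      a       :Temperature_Relation ;
      :temperature_value :Elevated ;     # assumed individual for "high"
      :temperature_trend :Falling .      # assumed individual for "falling"
```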

RDFS code for this example

[RDFS]

OWL code for this example

[N3] [RDF/XML]

Pattern 2:

In some cases, the n-ary relationship represents a network of individuals that play different roles in a structure without any single individual standing out as the subject or the "owner" of the relation, such as Purchase in the example 3 above (John buys a "Lenny the Lion" book from books.example.com for $15 as a birthday gift). Here, the relation explicitly has more than one participant, and, in many contexts, none of them can be considered a primary one. In this case, we create an individual to represent the relation instance with links to all participants:

Pattern 2

In our specific example, the representation will look as follows:

Purchase example

Purchase_1⁵ is an individual instance of the Purchase class representing an instance of a relation:⁶

:Purchase_1
      a       :Purchase ;
      :buyer :John ;
      :object :Lenny_The_Lion ;
      :purpose :Birthday_Gift ;
      :seller :books.example.com .

The following diagram shows the corresponding classes and properties. For the sake of the example, we specify that each purchase has exactly one buyer (a Person), exactly one seller (a Company), exactly one amount and at least one object (an Object).

Classes for the Purchase example

The diagram refers to OWL restrictions. In RDFS the arrows can be treated as rdfs:range links.

The class Purchase is defined as follows in OWL (see the RDFS file below for the definition in RDFS):

:Purchase
      a       owl:Class ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:allValuesFrom :Purpose ;
                owl:onProperty :purpose
              ] ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:cardinality 1 ;
                owl:onProperty :buyer
              ] ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:onProperty :buyer ;
                owl:someValuesFrom :Person
              ] ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:cardinality 1 ;
                owl:onProperty :seller
              ] ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:onProperty :seller ;
                owl:someValuesFrom :Company
              ] ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:onProperty :object ;
                owl:someValuesFrom :Object
              ] .
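The listing above does not cover the amount of the purchase. A possible extension is sketched here, assuming a hypothetical datatype property named amount whose value is a plain number, as described in the note on units. The cardinality literal is typed explicitly as xsd:nonNegativeInteger in this sketch.

```
@prefix :     <http://example.org/nary#> .   # hypothetical namespace
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:  <http://www.w3.org/2001/XMLSchema#> .

# Hypothetical property name; the draft does not name this property.
:amount
      a       owl:DatatypeProperty .

# Each purchase has exactly one amount.
:Purchase
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:cardinality "1"^^xsd:nonNegativeInteger ;
                owl:onProperty :amount
              ] .

# The amount for the example purchase, ignoring the currency.
:Purchase_1
      :amount 15 .
```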

RDFS code for this example

[RDFS]

OWL code for this example

[N3] [RDF/XML]

Considerations when introducing a new class for a relation

The two patterns for introducing a new class for a relation are logically equivalent. However, one or the other may seem more natural in specific situations. In many cases, the choice between the two patterns above is subjective.
In the first pattern, we have essentially a binary relation with an additional property on it. We re-represent this relation as a class. In the second pattern, we have a relation that indeed has several participants. In this case, we represent it as a network of binary relations.
A more principled approach to the distinction is that the difference is "ontological" rather than logical. In the first pattern, there is a relation between two independent entities or between one independent and two or more dependent entities. Christine and her breast tumour are independent things that make sense on their own. The probability is dependent; probabilities make no sense without something to be a probability of. In example 2 (Steve's temperature), there is a relation between a single independent entity and a "quality" with two different aspects. In pattern 2, by contrast, there are at least three independent entities - John, books.example.com, and Lenny_the_Lion. (We won't discuss the status of birthday_gift here.)
- One practical consequence of this difference is that we are unlikely to be interested in the inverses to the links to the dependent entities in pattern 1, whereas we are to the independent entities in pattern 2.
- A second practical consequence is that we will likely represent the dependent entities (Probability, Elevated and Falling) as one of a set of specified values (see the note on Representing Specified Values in OWL).
In our examples, we did not give meaningful names to the individuals that represent instances of n-ary relations, but merely labeled them _:Temperature_Relation_1, Purchase_1, etc. In most cases, these individuals do not stand on their own but merely function as auxiliaries to group together other objects. Hence a distinguishing name serves no purpose. Note that a similar approach is taken when reifying statements in RDF.
OWL allows definition of inverse properties. Defining inverse properties with n-ary relations, using any of the patterns above, requires more work than with binary relations. Each of the properties participating in the n-ary relation should have its own inverse property (with the proper constraints). Consider the example of John buying the Lenny_The_Lion book. We may want to have an instance of an inverse relation pointing from the Lenny_The_Lion book to the person who bought it. If we had a simple binary relation John buys Lenny_The_Lion, defining an inverse is simple: we simply define a property is_bought_by as an inverse of buys:
```
:is_bought_by
      a       owl:ObjectProperty ;
      owl:inverseOf :buys .
```
With the purchase relation represented as an instance, however, we need to add inverse relations between participants in the relation and the instance relation itself:

For example, the definitions of the inverse relations for the buyer and object of a purchase look as follows:
```
:is_buyer_for
      a       owl:ObjectProperty ;
      owl:inverseOf :buyer .
:is_object_for
      a       owl:ObjectProperty ;
      owl:inverseOf :object .
```
And the definition of the Person class (taking into account the inverse of the buyer property) is:
```
:Person
      a       owl:Class ;
      rdfs:subClassOf
              [ a       owl:Restriction ;
                owl:onProperty :is_buyer_for ;
                owl:allValuesFrom :Purchase
              ] .
```
Note that the value of the inverse property is_buyer_for for the individual John, for example, is the individual Purchase_1 rather than the object or recipient of the purchase.
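To make this concrete, the sketch below shows the asserted triples from the Purchase example next to the triple that an OWL reasoner would be expected to derive from the inverse declaration for buyer. The namespace is hypothetical.

```
@prefix : <http://example.org/nary#> .   # hypothetical namespace

# Asserted in the Purchase example:
:Purchase_1
      :buyer  :John ;
      :object :Lenny_The_Lion .

# Entailed once is_buyer_for is declared as the inverse of buyer.
# The value is the relation instance, not the book.
:John
      :is_buyer_for :Purchase_1 .
```

Getting from John to the book he bought therefore takes two steps: is_buyer_for leads to Purchase_1, and object leads from there to Lenny_The_Lion.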

Using lists for arguments in a relation

Some n-ary relations do not naturally fall into either of the two patterns above, but are more similar to a list or sequence of arguments. The example 4 above (United Airlines flight 3177 visits the following airports: LAX, DFW, and JFK) falls into this category. In this example, the relation holds between the flight and the airports it visits, in the order of the arrival of the aircraft at each airport in turn. This relation might hold between many different numbers of arguments, and there is no natural way to break it up into a set of distinct properties relating the flight to each airport. At the same time, the order of the arguments is highly meaningful.

Pattern 3:

In cases where all but one of the participants in a relation do not have a specific role and essentially form a list, it is natural to connect the airport arguments into a sequence and to relate the flight to this sequence. We represent the sequence as a list, where each list item points to its content and to the rest of the list:

Flight example for pattern 3

RDF in fact supplies a vocabulary for just this purpose—the collection vocabulary. Thus, implementation of this pattern in RDF is straightforward: We simply use rdf:List for this purpose. Individuals List_1, List_2, List_3 and Empty_List are instances of rdf:List; the property has_contents is, in fact, the property rdf:first and the property rest_of_list is simply rdf:rest. We provide the full RDFS code for this example.
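A sketch of the flight example using the RDF collection vocabulary follows. The namespace, the class Flight, and the property name flight_sequence are assumptions made for illustration. The blank node labels follow the list individuals named above, with rdf:nil playing the role of Empty_List.

```
@prefix :    <http://example.org/nary#> .   # hypothetical namespace
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

:UA_3177
      a       :Flight ;                  # hypothetical class
      :flight_sequence _:List_1 .        # hypothetical property

# Each list node points to its content and to the rest of the list.
_:List_1 rdf:first :LAX ; rdf:rest _:List_2 .
_:List_2 rdf:first :DFW ; rdf:rest _:List_3 .
_:List_3 rdf:first :JFK ; rdf:rest rdf:nil .

# Turtle collection syntax abbreviates the same structure:
#   :UA_3177 :flight_sequence ( :LAX :DFW :JFK ) .
```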

We can use the same rdf:List construct in OWL. However, using rdf:List this way in OWL puts the ontology in OWL Full. If we want to keep the ontology in OWL DL, we can explicitly define the properties in the figure above in our OWL ontology:

Classes for Argument list pattern
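One possible way to define such a list vocabulary is sketched below. The property names has_contents and rest_of_list and the individual Empty_List come from the text; the namespace, the class name List, the named list individuals, and the domain, range, and functional declarations are assumptions.

```
@prefix :     <http://example.org/nary#> .   # hypothetical namespace
@prefix owl:  <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

:List
      a       owl:Class .                # hypothetical class name

:Empty_List
      a       :List .                    # terminates every list

# Counterpart of rdf:first.
:has_contents
      a       owl:ObjectProperty , owl:FunctionalProperty ;
      rdfs:domain :List .

# Counterpart of rdf:rest.
:rest_of_list
      a       owl:ObjectProperty , owl:FunctionalProperty ;
      rdfs:domain :List ;
      rdfs:range  :List .

# The airports visited by the flight, in order.
:List_1 a :List ; :has_contents :LAX ; :rest_of_list :List_2 .
:List_2 a :List ; :has_contents :DFW ; :rest_of_list :List_3 .
:List_3 a :List ; :has_contents :JFK ; :rest_of_list :Empty_List .
```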

RDFS code for this example

[RDFS]

OWL code for this example

[N3] [RDF/XML]

Considerations when using a list of arguments in a relation

This technique requires only two properties—rdf:first and rdf:rest or has_contents and rest_of_list—(and the empty-list individual) for any number of arguments and any number of instances, and it permits some elegant techniques for manipulating the sequences. These patterns are commonly used in programming, and have been called S-expressions, or linked list structures.
The downside of this approach, obvious from the figure, is that this technique uses many more triples and extra 'blank' nodes to encode a single instance of an n-ary relation.
If you do not need to stay in OWL DL, you do not need to create the new class and properties for handling lists and can simply use rdf:List directly.

N-ary relations and reification in RDF

It may be natural to think of RDF reification when representing n-ary relations. However, using the RDF reification vocabulary to represent n-ary relations is, in general, a bad idea. The RDF reification vocabulary is designed to talk about statements, that is, individuals that are instances of rdf:Statement. A statement is a subject, predicate, object triple, and reification in RDF is used to attach additional information to this triple. This information may include the source of the information in the triple, for example. In n-ary relations, however, additional arguments in the relation do not usually characterize the statement but rather provide additional information about the relation instance itself. Thus, it is more natural to talk about instances of a diagnosis relation or a purchase rather than about a statement.
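The contrast can be sketched as follows. The first fragment uses the RDF reification vocabulary to say something about a statement; the second uses pattern 1 to say something about the relation instance. The namespace, the binary property diagnosed_with, the provenance property stated_by, and the individual Dr_Jones are hypothetical names introduced only for this illustration.

```
@prefix :    <http://example.org/nary#> .   # hypothetical namespace
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

# RDF reification: the extra information describes the statement,
# here who made it.
_:Statement_1
      a       rdf:Statement ;
      rdf:subject   :Christine ;
      rdf:predicate :diagnosed_with ;    # hypothetical binary property
      rdf:object    :Breast_Tumor ;
      :stated_by    :Dr_Jones .          # hypothetical provenance property

# N-ary pattern: the extra information is part of the relation
# instance itself, here the probability of the diagnosis.
:Christine
      :has_diagnosis _:Diagnosis_Relation_1 .

_:Diagnosis_Relation_1
      a       :Diagnosis_Relation ;
      :diagnosis_value :Breast_Tumor ;
      :diagnosis_probability :HIGH .
```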

Additional Background

Note on vocabulary: Relations and instances of relations, Properties and Property instances

We usually think of Semantic Web languages as consisting of triples of the form "Individual1-Property-Individual2". (Traditionally, these have been termed "object-attribute-value" triples, but we do not use this language here because it conflicts with RDF usage.)

However, formally, we interpret properties as representing relations, i.e. sets of ordered pairs of individuals. Each instance of a relation is just one of those ordered pairs. The "Property" in each triple is fundamentally different from the individuals in the triple. It merely indicates to which relation the ordered pair consisting of the two individuals belongs. We normally name individuals; we do not normally name the ordered pairs.

Anonymous vs named instances in these patterns

Often in cases such as pattern 1, we wish to regard two instances of the relation that have the same arguments as equivalent. We can capture this intuition by using RDF blank nodes (e.g., _:Diagnosis_Relation_1) to represent relation instances. In pattern 2, we wish to consider the possibility that there might be two distinct purchases with identical arguments. In that case, the node should be named, e.g. Purchase_1.

Notes

http://lists.w3.org/Archives/Public/public-swbp-wg/2004Jul/0009.html
"Reified relations" play an important role or have a special status in a number of ontologies, e.g. see Sowa, J. Knowledge Representation. Morgan Kaufmann, 1999; Welty, C. and Guarino, N. Supporting ontological analysis of taxonomic relationships. Data and Knowledge Engineering, 39 (1). 51-74.
For simplicity, we represent each disease as an individual. This decision may not always be appropriate, and we refer the reader to a different note (ref to be added). Similarly, for simplicity, in OWL we represent probability values as a class that is an enumeration of three individuals (HIGH, MEDIUM, and LOW):
```
:Probability_values
      a       owl:Class ;
      owl:equivalentClass
              [ a       owl:Class ;
                owl:oneOf (:HIGH :MEDIUM :LOW)
              ] . 
```
There are other ways to represent partitions of values. Please refer to a note on Representing Specified Values in OWL [Specified Values]. In the RDF Schema version, we represent them simply as strings, also for simplicity.
RDF has a property rdf:value that is appropriate in examples such as the Diagnosis example here. While rdf:value has no meaning on its own, the RDF specification encourages its use as a vocabulary element to identify the "main" component of a structured value of a property. Therefore, in our example, we made diagnosis_value a subproperty of rdf:value to indicate that diagnosis_value is indeed the "main" component of a diagnosis.
Note that we used a named individual for an instance of the class Purchase (Purchase_1) rather than an anonymous blank node here. In this example, there might be two distinct purchases with exactly the same arguments.
For simplicity, we will ignore the fact that the amount is expressed in $ and will simply use a number as the value for the property. For a discussion on how to represent units and quantities in OWL, please refer to a different note (ref to be added).

References

[Specified Values]: Representing Specified Values in OWL: "value partitions" and "value sets", Alan Rector, Editor, W3C Working Draft, 3 August 2004, http://www.w3.org/TR/swbp-specified-values/ .
[OWL Overview]: OWL Web Ontology Language Overview, Deborah L. McGuinness and Frank van Harmelen, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-owl-features-20040210/ . Latest version available at http://www.w3.org/TR/owl-features/ .
[OWL Guide]: OWL Web Ontology Language Guide, Michael K. Smith, Chris Welty, and Deborah L. McGuinness, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-owl-guide-20040210/ . Latest version available at http://www.w3.org/TR/owl-guide/ .
[OWL Semantics and Abstract Syntax]: OWL Web Ontology Language Semantics and Abstract Syntax, Peter F. Patel-Schneider, Patrick Hayes, and Ian Horrocks, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-owl-semantics-20040210/ . Latest version available at http://www.w3.org/TR/owl-semantics/ .
[RDF Primer]: RDF Primer, Frank Manola and Eric Miller, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-primer-20040210/ . Latest version available at http://www.w3.org/TR/rdf-primer/ .
[RDF Semantics]: RDF Semantics, Pat Hayes, Editor, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-mt-20040210/ . Latest version available at http://www.w3.org/TR/rdf-mt/ .
[RDF Vocabulary]: RDF Vocabulary Description Language 1.0: RDF Schema, Dan Brickley and R. V. Guha, Editors, W3C Recommendation, 10 February 2004, http://www.w3.org/TR/2004/REC-rdf-schema-20040210/ . Latest version available at http://www.w3.org/TR/rdf-schema/ .

Changes

Added more discussion to General issues and Use cases
Added pattern 3
Added the flight example
Changed the wording under "Representation Pattern"
Use blank nodes for relation instances in pattern 1 and pattern 2
Added a section on N-ary relations and reification in RDF
Added a section on Additional background
Added references
Changed some of references to "relation" to "relation instance" or "instance of relation"
Removed examples in abstract syntax
Added Acknowledgements

Acknowledgements

The editors would like to thank the following Working Group members for their contributions to this document: Pat Hayes, Jeremy Carroll, Chris Welty, Michael Uschold, Bernard Vatant. Frank Manola, Ivan Herman, Jamie Lawrence have also contributed to the document.

This document is a product of the Ontology Engineering and Patterns Task Force of the Semantic Web Best Practices and Deployment Working Group.