Formal Specification of the OWL Web Ontology Language

Authors: Peter F. Patel-Schneider
Ian Horrocks
Frank van Harmelen

First version: 19 March 2002
Previous version: 12 June 2002
This version: 14 June 2002

Abstract

The OWL Web Ontology Language is being designed by the W3C Web Ontology Working Group as a revision of the DAML+OIL web ontology language. This description of OWL contains an abstract syntax for OWL, which serves as a high-level specification for the formalism.

1. Introduction
1.1. Differences from DAML+OIL
2. Abstract Syntax
3. Abstract Syntax for OWL Lite

1. Introduction

The W3C Web Ontology Working Group (WebOnt) is tasked with producing a web ontology language extending the reach of XML, RDF, and RDF Schema. This language, called OWL, will be based on the DAML+OIL web ontology language.

A number of problems have been discovered in the design of DAML+OIL, mostly having to do with its relationship with the changes to RDF being undertaken by the W3C RDF Core Working Group.

This document contains a high-level description of the features that will be in OWL, in the form of abstract constructs and informal descriptions of the meaning of these constructs.

This document contains neither a presentation syntax nor an exchange syntax for OWL. An XML presentation syntax may be defined, as an XML schema. The official exchange syntax for OWL is RDF triples; how RDF triples are used to encode OWL is the subject of a different document. This document also does not contain a formal semantics for OWL, but one will be provided.

1.1. Differences from DAML+OIL

The language described here is very close to DAML+OIL. The abstract syntax can be viewed as an abstract syntax for DAML+OIL. It can easily be transformed into DAML+OIL.

The only substantive changes from DAML+OIL are

the removal of qualified number restrictions, per a decision of the WG;
the ability to directly state that properties can be symmetric, per a decision of the WG; and
the removal of some unusual DAML+OIL constructs, particularly restrictions with extra components.

Readers should assume that anything that can be stated in this abstract syntax will end up in OWL, and in a manner maximally compatible with DAML+OIL. There are also a number of minor differences, including a number of changes to the names of the various constructs. These naming changes may indicate potential changes to the preferred names in the concrete syntax for OWL, but the intent of WebOnt is to maintain the DAML+OIL names to the maximum extent reasonable.

2. Abstract Syntax

The description of the language here abstracts from concrete syntax and thus facilitates access to and evaluation of the language. A high-level abstract syntax is used to make the language features easier to see. This particular abstract syntax has a frame-like style, where a collection of information about a class or property is given in one large syntactic construct, instead of being divided into a number of atomic chunks (as in most Description Logics) or being divided into even more triples (as in DAML+OIL), again for ease of readability. The syntax used here is rather informal, even for an abstract syntax. In general, the arguments of a construct should be considered to be unordered wherever the order would not affect the meaning of the construct.

This abstract syntax does not have to worry about any of the problems induced by the RDF triple model, including non-closed and ill-formed lists and restrictions. No parsetype extensions are needed for readability, and issues of coordination with the RDF Core WG are not active at this level of syntax. Layering issues can also be safely ignored.

The abstract syntax is specified here by means of a version of Extended BNF. In this version of BNF, terminals are not quoted; non-terminals are enclosed in pointy brackets (<...>); and alternatives are either separated by vertical bars (|) or given in different productions. Elements that can occur zero or one times are enclosed in square brackets ([...]). Elements that can occur zero or more times are enclosed in braces ({...}).

The formal meaning of OWL constructs will be defined elsewhere, but some indications of this meaning will be given in this document. The meaning of OWL is given via a model-theoretic semantics, which is an extension of the model theory for RDF. In the OWL model theory there is a domain of discourse, consisting of resources, which is disjoint from the set of XML Schema data values. The OWL model theory also provides meaning for classes and descriptions, which denote sets of individuals; datatypes, which denote their XML Schema Datatypes value space; and properties, which denote sets of pairs of individuals or sets of pairs of individuals and data values. (The term "individual" is used here, as individuals may not exactly correspond to RDF resources.)

2.1. Ontologies

An OWL ontology is a sequence of axioms and facts, plus inclusion references to other ontologies, which are considered to be included in the ontology. OWL ontologies are web documents, and can be referenced by means of a URI. Ontologies also have a non-logical component (not yet specified) that can be used to record authorship and other non-logical information associated with an ontology.

<ontology> ::= Ontology ( [<authorship-etc>] {<directive>} )

<authorship-etc> ::= ...

<directive> ::= <include>
<directive> ::= <axiom>
<directive> ::= <fact>

<include> ::= Include ( <URI> )
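
The productions above can be pictured as a small data structure. The following Python sketch is illustrative only and is not part of the abstract syntax: the class names Include, Axiom, Fact, and Ontology, the helper included_uris, and the example URI are all assumptions of the sketch. Axioms and facts are kept as uninterpreted text, and the unspecified authorship component is modelled as an opaque optional value.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Union


@dataclass(frozen=True)
class Include:
    """<include> ::= Include ( <URI> )"""
    uri: str


@dataclass(frozen=True)
class Axiom:
    """Placeholder for any <axiom>; the text is kept uninterpreted here."""
    text: str


@dataclass(frozen=True)
class Fact:
    """Placeholder for any <fact>; the text is kept uninterpreted here."""
    text: str


Directive = Union[Include, Axiom, Fact]


@dataclass
class Ontology:
    """<ontology> ::= Ontology ( [<authorship-etc>] {<directive>} )"""
    authorship_etc: Optional[str] = None  # not yet specified in the text
    directives: List[Directive] = field(default_factory=list)

    def included_uris(self) -> List[str]:
        """URIs of the ontologies included in this one, in document order."""
        return [d.uri for d in self.directives if isinstance(d, Include)]


example = Ontology(
    directives=[
        Include("http://example.org/base-ontology"),  # hypothetical URI
        Axiom("Class( Person partial )"),
        Fact("Individual( Alice type=Person )"),
    ]
)
```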

Ontologies incorporate information about classes, properties, and individuals, each of which can have an ID which is a qualified name. (Actually, IDs may end up being URI references.) Ontologies can also reference XML Schema datatypes, by means of a name for the datatype.

<datatypeID>                 ::= <name>
<classID>                    ::= <name>
<individualID>               ::= <name>
<datavaluedPropertyID>       ::= <name>
<individualvaluedPropertyID> ::= <name>

If a name is a datatype, i.e., if there is a datatype definition retrievable using the name, then that name cannot be used as the ID for a class. However, a name can be the ID of a class or datatype as well as the ID of a property as well as the ID of an individual. Individual IDs are used to refer to resources, and typed or untyped data literals are used to refer to the XML Schema data values.
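
The naming rule above amounts to a simple check on the sets of names in use. The sketch below is a hypothetical helper, not something defined by OWL: the function name class_datatype_clashes and the example names are assumptions. It reports names that are used both as a datatype and as a class ID, and deliberately ignores property and individual IDs, which may share a name with a class or datatype.

```python
from typing import Iterable, List


def class_datatype_clashes(datatype_names: Iterable[str],
                           class_ids: Iterable[str]) -> List[str]:
    """Return, in sorted order, the names used both as a datatype and as a class ID."""
    return sorted(set(datatype_names) & set(class_ids))


# Hypothetical names, chosen only for illustration.
datatypes = {"xsd:integer", "xsd:string"}
classes = {"Person", "xsd:string"}

clashes = class_datatype_clashes(datatypes, classes)
```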

In OWL a datatype denotes the set of XML Schema data values that is the value space for the datatype. Classes denote sets of individuals. Properties relate individuals to other information, and are divided into two disjoint groups, data-valued properties and individual-valued properties. Elements of the first group relate individuals to data values; elements of the second group relate individuals to other individuals.

2.2. Axioms

Axioms are used to associate class and property IDs with either partial or complete specifications of their characteristics, and to give other logical information about classes and properties. These used to be called definitions, but they are not all definitions in the common sense of the term, as has been made evident in several discussions in the WG, and thus a more-neutral name has been chosen.

The abstract syntax used here for classes is meant to look somewhat like the syntax used in some frame systems. Each class axiom contains a collection of more-general classes; a collection of local property restrictions, in the form of restriction constructs; and a collection of descriptions. The restriction construct gives the local range of a property, how many values are permitted, and a collection of required values. Descriptions are used to specify boolean combinations of restrictions and other descriptions, as well as to construct sets of individuals. Classes can also be specified by enumeration or be made the same or disjoint.

Properties can be equivalent to or subproperties of others; can be made functional, inverse functional, or transitive; and can be given global domains and ranges. However, most information about properties is more naturally expressed in restrictions, which allow local cardinality and range information to be specified.

There is no requirement that there be an axiom for each class used in an ontology. Properties used in an ontology have to be categorized as either data-valued or individual-valued, so they need an axiom for this purpose at least. There is no requirement that there be at most one axiom for a class or property used in an ontology. Each axiom for a particular class (or property) name contributes to the meaning of the class (or property).

2.2.1 Class Axioms

The following axiom states that a class is either exactly equivalent to (for the modality complete) or a subclass of (for the modality partial) the conjunction of a collection of descriptions, which can include superclasses and property restrictions.

<axiom> ::= Class( <classID> <modality> {<description>} )
<modality> ::= complete | partial

It is also possible to make a class exactly consist of a certain set of individuals, as follows.

<axiom> ::= EnumeratedClass( <classID> {<individualID>} )

Finally, it is possible to require that a collection of descriptions be pairwise disjoint or have the same members, or that one description be a subclass of another. Note that the last two of these axioms, EquivalentClasses and SubClassOf, generalize the Class and EnumeratedClass axioms given above.

<axiom> ::= DisjointClasses( <description> {<description>} )
<axiom> ::= EquivalentClasses( <description> {<description>} )
<axiom> ::= SubClassOf( sub=<description>  super=<description> )
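
As an informal illustration, the three axioms above correspond to familiar set relations when the extensions of the descriptions are known. The function names below are assumptions of this sketch, and the extensions are supplied directly as finite sets rather than computed from descriptions.

```python
from itertools import combinations
from typing import FrozenSet, Sequence

Ext = FrozenSet[str]


def disjoint_classes(exts: Sequence[Ext]) -> bool:
    """DisjointClasses: no two of the extensions share a member."""
    return all(a.isdisjoint(b) for a, b in combinations(exts, 2))


def equivalent_classes(exts: Sequence[Ext]) -> bool:
    """EquivalentClasses: every extension equals the first one."""
    return all(ext == exts[0] for ext in exts[1:])


def sub_class_of(sub: Ext, super_: Ext) -> bool:
    """SubClassOf: every member of sub is a member of super_."""
    return sub <= super_


# Hypothetical extensions, for illustration only.
cats = frozenset({"tom"})
dogs = frozenset({"rex"})
pets = frozenset({"tom", "rex"})
```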

2.2.2 Property Axioms

Properties are also specified using a frame-like syntax. Properties are divided into data-valued properties, which relate individuals to data values, like integers, and individual-valued properties, which relate individuals to other individuals. Properties can be given superproperties, allowing the construction of a property hierarchy. Individual-valued properties cannot be superproperties of data-valued properties.

Properties can also be given domains and ranges. A domain for a property specifies which individuals are potential subjects of statements that have the property as verb, just as in RDF. The domains of properties are descriptions. Properties can have multiple domains, in which case only individuals that belong to all of the domains are potential subjects. A range for a property specifies which individuals or data values can be objects of the property. Again, properties can have multiple ranges, in which case only individuals or data values that belong to all of the ranges are potential objects. Ranges for individual-valued properties are descriptions; ranges for data-valued properties are datatypes or sets of data values.

Data-valued properties can be specified as (partial) functional, i.e., there is at most one relationship for that property between a given individual and a data value. Individual-valued properties can be specified as functional, inverse functional, or one-to-one. Individual-valued properties can be specified to be the inverse of another property. Finally, individual-valued properties can be specified as transitive. Individual-valued properties that are transitive, or that have transitive sub-properties, may not have cardinality conditions expressed on them, either in restrictions or by being functional, inverse functional, or one-to-one. This is necessary in order to maintain the decidability of the language.

<axiom> ::= DataProperty ( <datavaluedPropertyID> {super=<datavaluedPropertyID>}
                           {domain=<description>} {range=<dataRange>}
                           [Functional] )

<axiom> ::= IndividualProperty 
        ( <individualvaluedPropertyID> {super=<individualvaluedPropertyID>}
          {domain=<description>} {range=<description>} 
          [inverseOf=<individualvaluedPropertyID>] [Symmetric] 
          [Functional | InverseFunctional | OneToOne | Transitive] )
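
The condition on transitive properties stated above spans several axioms, so it cannot be read off a single production. The sketch below shows one way such a check might look, under stated assumptions: property axioms have already been collected into two hypothetical dictionaries (flags per property, and direct superproperties per property), and only the Functional, InverseFunctional, and OneToOne flags are examined. Cardinalities in restrictions are not covered.

```python
from typing import Dict, List, Set

CARDINALITY_FLAGS = {"Functional", "InverseFunctional", "OneToOne"}


def ancestors(prop: str, supers: Dict[str, Set[str]]) -> Set[str]:
    """All direct and indirect superproperties of prop."""
    seen: Set[str] = set()
    stack = list(supers.get(prop, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(supers.get(current, ()))
    return seen


def cardinality_violations(flags: Dict[str, Set[str]],
                           supers: Dict[str, Set[str]]) -> List[str]:
    """Properties that carry a cardinality flag although they are
    transitive or have a transitive sub-property."""
    # A property is affected if it is transitive or lies above a transitive one.
    affected: Set[str] = set()
    for prop, prop_flags in flags.items():
        if "Transitive" in prop_flags:
            affected.add(prop)
            affected |= ancestors(prop, supers)
    return sorted(p for p in affected
                  if flags.get(p, set()) & CARDINALITY_FLAGS)


# Hypothetical properties: ancestorOf is transitive and a sub-property of relatedTo.
flags = {
    "ancestorOf": {"Transitive"},
    "relatedTo": {"Functional"},
    "hasMother": {"Functional"},
}
supers = {"ancestorOf": {"relatedTo"}}
```

Here relatedTo is reported because it is functional and has the transitive sub-property ancestorOf, while hasMother is unaffected.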

A dataRange, i.e., the range of a data-valued property, is either a datatype or a set of data values.

<dataRange> ::= <datatypeID>
<dataRange> ::= OneOf({<dataLiteral>} )

In this syntax data literals consist either of a datatype and the lexical representation of a data value in that datatype (a typed data literal), or just the lexical representation of a data value (an untyped data literal). Allowing untyped data literals introduces some problems to the formalism, and care has to be taken here.

<dataLiteral> ::= <datatypeID>  <lexical-form>
                | <lexical-form>
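
A data literal can be modelled as a lexical form with an optional datatype. The sketch below is illustrative only: the class name DataLiteral and its members are assumptions. Note that in this sketch a typed and an untyped literal with the same lexical form are distinct syntactic objects; whether they denote the same data value is a matter for the semantics, which is not given in this document.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataLiteral:
    """<dataLiteral> ::= <datatypeID> <lexical-form> | <lexical-form>"""
    lexical_form: str
    datatype_id: Optional[str] = None  # None marks an untyped literal

    @property
    def is_typed(self) -> bool:
        return self.datatype_id is not None

    def render(self) -> str:
        """Write the literal back out in the abstract syntax."""
        if self.is_typed:
            return f"{self.datatype_id} {self.lexical_form}"
        return self.lexical_form


typed = DataLiteral("1", "xsd:integer")  # hypothetical datatype name
untyped = DataLiteral("1")
```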

The following axioms make several properties be the same, or make one property be a sub-property of another.

<axiom> ::= EquivalentProperties( <datavaluedPropertyID>  {<datavaluedPropertyID>} )
<axiom> ::= SubPropertyOf( <datavaluedPropertyID>  <datavaluedPropertyID> )
<axiom> ::= EquivalentProperties( <individualvaluedPropertyID>  {<individualvaluedPropertyID>} )
<axiom> ::= SubPropertyOf( <individualvaluedPropertyID>  <individualvaluedPropertyID> )

2.2.3 Descriptions

Descriptions include class IDs and the restriction constructor. Descriptions can also be boolean combinations of other descriptions, and sets of individuals.

<description> ::= <classID>
                | <restriction>
                | UnionOf( <description> {<description>} )
                | IntersectionOf( <description> {<description>} )
                | ComplementOf( <description> )
                | OneOf({<individualID>} )
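
The boolean constructors can be read as set operations over the domain of discourse. The following sketch evaluates a description in one finite interpretation, under several assumptions that are not part of the text: descriptions are encoded as nested Python tuples, each individual ID is taken to denote itself, and restrictions are left out. The function name extension is hypothetical.

```python
from typing import Dict, FrozenSet, Tuple, Union

Ext = FrozenSet[str]
Description = Union[str, Tuple]  # a class ID, or (constructor, arg, ...)


def extension(desc: Description, domain: Ext, classes: Dict[str, Ext]) -> Ext:
    """Set of individuals denoted by desc in a finite interpretation."""
    if isinstance(desc, str):                    # <classID>
        return classes[desc]
    op, *args = desc
    if op == "UnionOf":
        result: Ext = frozenset()
        for arg in args:
            result |= extension(arg, domain, classes)
        return result
    if op == "IntersectionOf":
        result = domain
        for arg in args:
            result &= extension(arg, domain, classes)
        return result
    if op == "ComplementOf":
        (arg,) = args                            # exactly one argument
        return domain - extension(arg, domain, classes)
    if op == "OneOf":
        return frozenset(args)                   # arguments are individual IDs
    raise ValueError(f"unsupported constructor: {op!r}")


# Hypothetical interpretation, for illustration only.
domain = frozenset({"alice", "bob", "rex"})
classes = {"Person": frozenset({"alice", "bob"}), "Pet": frozenset({"rex"})}
desc = ("IntersectionOf",
        ("UnionOf", "Person", "Pet"),
        ("ComplementOf", ("OneOf", "bob")))
```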

2.2.4 Restrictions

Restrictions are used in class axioms to provide local constraints on properties in the class. The allValuesFrom part of a restriction makes the constraint that all values of the property for individuals in the class must belong to the specified class or data range. Each someValueFrom part makes the constraint that there must be at least one value for the property that belongs to the specified class or data range. Each value part makes the constraint that the individual or data value must be a value for the property. The cardinality part says how many distinct values there are for the property for each individual in the class. Properties that are transitive, or that have transitive sub-properties, may not have cardinality conditions expressed on them in restrictions.

<restriction> ::= restriction( <datavaluedPropertyID> [allValuesFrom=<dataRange>]
                               {someValueFrom=<dataRange>} {value=<dataLiteral>}
                               [<cardinality>] )
<restriction> ::= restriction( <individualvaluedPropertyID> [allValuesFrom=<description>]
                               {someValueFrom=<description>} {value=<individual>}
                               [<cardinality>] )
<cardinality> ::= atleast( <positive-integer> )
                | atmost( <non-negative-integer> )
                | atleast( <positive-integer> ) atmost( <non-negative-integer> )
                | exactly( <non-negative-integer> )
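
The parts of a restriction can be read as independent tests on the set of values that an individual has for the property. The sketch below covers individual-valued properties in a finite interpretation and is an informal reading, not the formal semantics. Its assumptions: the property is given as a set of pairs, distinct names denote distinct individuals, exactly(n) is passed as at_least=n together with at_most=n, and the function name satisfies_restriction is hypothetical.

```python
from typing import FrozenSet, Iterable, Optional, Set, Tuple

Pairs = Set[Tuple[str, str]]


def satisfies_restriction(individual: str,
                          prop: Pairs,
                          all_values_from: Optional[FrozenSet[str]] = None,
                          some_value_from: Iterable[FrozenSet[str]] = (),
                          value: Iterable[str] = (),
                          at_least: Optional[int] = None,
                          at_most: Optional[int] = None) -> bool:
    """Check every part of a restriction for one individual."""
    fillers = {obj for subj, obj in prop if subj == individual}
    # allValuesFrom: every value lies in the given extension.
    if all_values_from is not None and not fillers <= all_values_from:
        return False
    # someValueFrom: each listed extension contains at least one value.
    if any(fillers.isdisjoint(ext) for ext in some_value_from):
        return False
    # value: each required individual is among the values.
    if any(v not in fillers for v in value):
        return False
    # cardinality: bounds on the number of distinct values.
    if at_least is not None and len(fillers) < at_least:
        return False
    if at_most is not None and len(fillers) > at_most:
        return False
    return True


# Hypothetical data, for illustration only.
person = frozenset({"alice", "bob", "carol", "dave"})
has_child = {("alice", "bob"), ("alice", "carol"), ("dave", "rex")}
```

Note that an individual with no values for the property satisfies any allValuesFrom part vacuously but fails every someValueFrom part.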

2.3. Facts

The first kind of fact states information about a particular individual, in the form of classes that the individual belongs to plus properties and values of that individual. An individual can be given an individualID that will denote that individual, and can be used to refer to that individual. However, an individual need not be given an individualID. Such individuals are anonymous (blank in RDF terms) and cannot be directly referred to elsewhere. The syntax here is set up to mirror the normal RDF/XML syntax.

<fact> ::= <individual> 
<individual> ::= Individual( [<individualID>] {type=<classID>}
                       {<propertyValue>} )
<propertyValue> ::= ( <individualvaluedPropertyID>  <individual> )
                  | ( <datavaluedPropertyID>  <dataLiteral> )
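
Because individuals can be nested and anonymous, a fact of this kind describes a small graph. The sketch below flattens a nested fact into subject, property, object tuples, generating a local label for each anonymous individual. It is purely illustrative and is not the encoding of OWL in RDF triples, which is the subject of a different document. The names Individual, flatten, the "type" property, and the "_:b" label scheme are assumptions, and data literals are reduced to their lexical forms.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Iterator, List, Optional, Tuple, Union

# A property value is a nested individual or the lexical form of a literal.
PropertyValue = Tuple[str, Union["Individual", str]]
Triple = Tuple[str, str, str]


@dataclass
class Individual:
    individual_id: Optional[str] = None  # None marks an anonymous individual
    types: List[str] = field(default_factory=list)
    property_values: List[PropertyValue] = field(default_factory=list)


def flatten(ind: Individual, labels: Iterator[int]) -> Tuple[str, List[Triple]]:
    """Return the name used for ind and the tuples describing it."""
    name = ind.individual_id or f"_:b{next(labels)}"
    triples: List[Triple] = [(name, "type", class_id) for class_id in ind.types]
    for prop, value in ind.property_values:
        if isinstance(value, Individual):
            value_name, nested = flatten(value, labels)
            triples.append((name, prop, value_name))
            triples.extend(nested)
        else:
            triples.append((name, prop, value))
    return name, triples


# Hypothetical fact: Alice has an anonymous pet dog named Rex.
alice = Individual("Alice", ["Person"], [
    ("hasPet", Individual(types=["Dog"], property_values=[("name", "Rex")])),
    ("age", "30"),
])
root, triples = flatten(alice, count())
```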

Facts can be used to make individual IDs denote the same individual, or pairwise-distinct individuals.

<fact> ::= SameIndividual( <individualID> {<individualID>} )
<fact> ::= DifferentIndividuals( <individualID> {<individualID>} )
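
These two facts constrain how individual IDs are mapped to elements of the domain. In the sketch below, the mapping is a hypothetical dictionary named denotation, and the function names are likewise assumptions; it is an informal reading of the description above.

```python
from typing import Dict, Sequence


def same_individual(ids: Sequence[str], denotation: Dict[str, str]) -> bool:
    """SameIndividual: all IDs denote one and the same individual."""
    return len({denotation[i] for i in ids}) == 1


def different_individuals(ids: Sequence[str], denotation: Dict[str, str]) -> bool:
    """DifferentIndividuals: the IDs denote pairwise-distinct individuals."""
    return len({denotation[i] for i in ids}) == len(ids)


# Hypothetical interpretation: two IDs for one person, a third for another.
denotation = {"Clark": "person1", "Superman": "person1", "Lois": "person2"}
```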

3. Abstract Syntax for OWL Lite

As well as the full language, a portion of OWL that is easier to learn is being specified. This portion, OWL Lite, allows for most of the constructs of OWL, so only the differences will be given here. Any EBNF production for OWL that is not modified here or explicitly disallowed is allowed in OWL Lite.

OWL Lite ontologies include the non-logical portion of ontologies as well as the include directive and axioms and facts. However, as seen below, some kinds of axioms are different in OWL Lite.

Class axioms are different in OWL Lite. Classes can no longer be defined in terms of arbitrary descriptions; instead only named superclasses and certain kinds of restrictions can be used. Neither enumerated classes nor disjointness of classes can be specified. The EquivalentClasses and SubClassOf axioms are not allowed, as they would just duplicate the effect of the complete and partial forms of the Class axiom.

<axiom> ::= Class( <classID> <modality> {<classID>} {<restriction>} )

OWL Lite allows most of the characteristics of OWL properties, except that domains and ranges can only be classes or datatypes, not arbitrary descriptions or enumerations.

<axiom> ::= DataProperty ( <datavaluedPropertyID> {super=<datavaluedPropertyID>}
                           {domain=<classID>} {range=<datatypeID>}
                           [Functional] )

<axiom> ::= IndividualProperty ( <individualvaluedPropertyID> {super=<individualvaluedPropertyID>}
          {domain=<classID>} {range=<classID>} 
          [inverseOf=<individualvaluedPropertyID>] [Symmetric] 
          [Functional | InverseFunctional | OneToOne | Transitive] )

OWL Lite allows equivalence for properties and subproperties.

Restrictions in OWL Lite cannot have embedded descriptions; only class or datatype names are permitted where descriptions or data ranges would otherwise be allowed. Values also cannot be specified. The only cardinalities allowed are 0 and 1.

<restriction> ::= restriction( <datavaluedPropertyID> [allValuesFrom=<datatypeID>]
                               {someValueFrom=<datatypeID>} [<cardinality>] )
<restriction> ::= restriction( <individualvaluedPropertyID> [allValuesFrom=<classID>]
                               {someValueFrom=<classID>} [<cardinality>] )
<cardinality> ::= atleast( 1 )
                | atmost( 1 )
                | exactly( 0 )
                | exactly( 1 )
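
The OWL Lite cardinality production admits only four forms, so a syntactic check is a simple table lookup. In the sketch below, a cardinality is encoded as a pair of optional bounds, with exactly(n) written as (n, n); this encoding and the function name owl_lite_cardinality are assumptions. The check is purely syntactic: atmost( 0 ), for instance, is rejected because it does not appear in the production.

```python
from typing import Optional

# (at_least, at_most) pairs for atleast(1), atmost(1), exactly(0), exactly(1).
OWL_LITE_CARDINALITIES = {(1, None), (None, 1), (0, 0), (1, 1)}


def owl_lite_cardinality(at_least: Optional[int], at_most: Optional[int]) -> bool:
    """True if the pair of bounds is one of the four OWL Lite forms."""
    return (at_least, at_most) in OWL_LITE_CARDINALITIES
```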

OWL Lite includes all the facts of OWL.