A Review of
"Coordination points between RDF(S) and DAML+OIL"
DRAFT
Frank Manola
24 July 2001
Coordination
points between RDF(S) and DAML+OIL is input from the DAML+OIL
Joint Committee to the RDF Core working group. "Coordination points"
describes relationships/dependencies between features of RDF(S) and DAML+OIL,
and specifically identifies areas of RDF and RDF Schema needing attention
to more closely align RDF(S) and DAML+OIL. This paper is a preliminary
review of "Coordination points". It has not yet been reviewed by
the RDF Core working group.
The content of this paper consists of the content of "Coordination
points", with additional comments added (set off by Comment: ).
Overall structure:
Comment: For the most part, the issues raised in "Coordination
points" are associated with one or more issues in the RDF
Core Issue Tracking document. However, a number of these
issues have not yet received any attention by the WG. In some cases,
I have explicitly cited some of the related issues (with no guarantee
that my citations are exhaustive). In addition, there are a number
of more general issues in our issues list (such as rdf-formal-semantics)
whose resolution would probably also address issues raised in
"Coordination points". My own view is that "Coordination points"
identifies key issues that RDF Core needs to address, and they should receive
very high priority.
What does DAML+OIL depend on from RDF(S)
-
DAML+OIL builds on the basic triple-structure provided
by RDF. In its essence all that DAML+OIL does is to assign a specific interpretation
to certain designated RDF triples (see the corresponding
remarks in the DAML+OIL reference document and in the model
theoretic semantics of DAML+OIL).
-
Some of these designated triples are inherited from
RDF Schema. DAML+OIL assigns a semantics for these triples which we believe
was also the intended semantics of these triples in the RDF Schema definition.
We use RDF Schema classes, the class hierarchy organisation, and the structuring
of properties: rdf:type, rdfs:Class, rdf:value, rdfs:subClassOf, rdfs:subPropertyOf,
rdfs:domain and rdfs:range.
Comment: There seems to be no need for RDF Core to "officially"
comment on this section. The matter of what RDF(S) capabilities DAML+OIL
is dependent on is a matter for the DAML+OIL designers. The approach
of building on RDF by assigning a specific interpretation to certain RDF
constructs seems to be the intended way that other languages are to be
built on RDF.
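As a hedged illustration of "assigning a specific interpretation to certain designated RDF triples", the following Python sketch models a graph as a set of (subject, predicate, object) tuples and separates the triples whose predicate belongs to the RDF(S) vocabulary listed above from all others. The `http://example.org/zoo#` namespace, the function name `split_designated`, and the rule "a triple is designated when its predicate is in the set" are assumptions made for this example only; they are not taken from "Coordination points" or from the DAML+OIL documents.

```python
# Namespace URIs of RDF and RDF Schema.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
EX = "http://example.org/zoo#"  # hypothetical application namespace

# Properties named in the list above. rdfs:Class is a class rather than a
# property, so it does not appear in predicate position and is omitted here.
DESIGNATED = {
    RDF + "type",
    RDF + "value",
    RDFS + "subClassOf",
    RDFS + "subPropertyOf",
    RDFS + "domain",
    RDFS + "range",
}


def split_designated(graph):
    """Separate triples with a designated predicate from all other triples."""
    designated = {t for t in graph if t[1] in DESIGNATED}
    return designated, graph - designated


graph = {
    (EX + "Dog", RDFS + "subClassOf", EX + "Animal"),
    (EX + "Fido", RDF + "type", EX + "Dog"),
    (EX + "Fido", EX + "likes", EX + "Bones"),
}
interpreted, uninterpreted = split_designated(graph)
```

In this sketch the first two triples would receive a language-level interpretation, while the `likes` triple remains ordinary RDF data.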
What does DAML+OIL not use at all
We have chosen to not assign semantics to the following elements of RDF
and RDF Schema:
-
Reification of statements: The reason
for this is twofold. Firstly, no such language construct was felt necessary
for a basic ontology infrastructure language, and secondly the semantics
of the RDF Schema construction is poorly understood.
Comment: There seems to be no need for RDF Core to "officially"
comment on the need for reification in a basic ontology infrastructure
language. Again, this is a matter for the language designers.
I think it's safe to say that RDF Core agrees that the semantics of reification
in RDF are poorly understood. There are several issues on the
issues list concerning reification (rdfms-reification-required, rdfms-quoting,
rdfms-nestedbagIDs), and the general subject of reification has been discussed,
but as yet without resolution. Reification is also related to issues
involving the semantics of URIs and how they are assigned (since the main
reason cited for reification in the M&S is to provide a way to assign
a URI to a statement, enabling additional statements to be made about that
statement). RDF Core is also investigating a "layering" of the specifications
which, assuming no further clarification, will at least move reification
to something like a "standard library" rather than it being a fundamental
part of RDF. RDF Core is also beginning to consider a model-theoretic
semantics, which, together with related work, may help clarify the
status of a number of RDF concepts, reification among them.
-
Containers: Although a notion of containers
is required in DAML+OIL (in particular sets), the containers as provided
by RDF have the wrong properties to be useful for us. An important property
of containers in RDF (whether by accident or by design) is that they cannot
be "closed". External sources can always make additional statements about the
contents of a container, effectively "adding" elements. In DAML+OIL, it
is crucial to be able to state that a container contains exactly the indicated
arguments, and *no more*. This is important in, for instance, the "disjointUnionOf"
statement. Such a "closure condition" on containers is impossible (or
at least very hard) to express using RDF containers.
Comment: RDF Core has resolved several specific
issues regarding containers (rdf-containers-syntax-ambiguity, rdf-containers-syntax-vs-schema,
rdf-containers-formalmodel). However RDF containers are still not
"closed" in the sense described above. The issues list continues
to include other issues regarding containers (rdf-containers-otherapproaches,
rdfms-seq-representation, rdfs-constraining-containers). The "layering"
of RDF specifications mentioned above is also expected to treat containers
as a "less fundamental" part of RDF. [My own view is that the current
"open" RDF container model may possibly be based on a misunderstanding
of how the actual requirements for "openness" need to be handled, and that
further clarification of how containers are to be used (including explicit
discussion on how "closed" containers should be handled) should be added to the specifications.
This is because requirements exist for "closed" containers in specific
scopes/contexts in RDF as well as in DAML+OIL (hence there is also
a relationship to more general scoping/context issues). That is,
it is necessary to be able to define "closed" containers as being part
of specific sets of assertions. While others may assert that additional
members of those nominal containers exist, they can do so in separate "closed"
containers, within separate, distinguishable sets of assertions.
This also seems like a necessary capability in DAML+OIL. ]
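The difference between "open" and "closed" containers can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of any specification's mechanism: triples are plain tuples, the resource names (`bag1`, `c1`, `Cat`, and so on) are invented, and the closed form is modelled as a generic first/rest list terminated by `nil`, in which every cell must have exactly one `first` and one `rest`.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def bag_members(graph, bag):
    """Collect the objects of rdf:_1, rdf:_2, ... membership triples of `bag`."""
    prefix = RDF + "_"
    return sorted(
        o for s, p, o in graph
        if s == bag and p.startswith(prefix) and p[len(prefix):].isdigit()
    )


def list_members(graph, head):
    """Walk a first/rest list, rejecting cells with missing or duplicate links."""
    members, cell, seen = [], head, set()
    while cell != "nil":
        if cell in seen:
            raise ValueError(f"list contains a cycle at {cell!r}")
        seen.add(cell)
        firsts = [o for s, p, o in graph if s == cell and p == "first"]
        rests = [o for s, p, o in graph if s == cell and p == "rest"]
        if len(firsts) != 1 or len(rests) != 1:
            raise ValueError(f"list cell {cell!r} is not well formed")
        members.append(firsts[0])
        cell = rests[0]
    return members


# One source describes the same two members as a bag and as a list.
source_a = {
    ("bag1", RDF + "_1", "Cat"), ("bag1", RDF + "_2", "Dog"),
    ("c1", "first", "Cat"), ("c1", "rest", "c2"),
    ("c2", "first", "Dog"), ("c2", "rest", "nil"),
}
# A second source tries to add a member to each structure.
source_b = {
    ("bag1", RDF + "_3", "Fish"),
    ("c2", "rest", "c3"), ("c3", "first", "Fish"), ("c3", "rest", "nil"),
}
merged = source_a | source_b
```

With these definitions, `bag_members(merged, "bag1")` silently includes the extra member, whereas `list_members(merged, "c1")` raises `ValueError` because cell `c2` now has two `rest` links. The attempted extension of the list is at least detectable, which is the kind of property a closure condition requires.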
-
RDFS - meta-class organization: With
meta-classes in this context we mean classes that have other classes as
their members. RDF Schema allows such meta-classes (see e.g. the
explanation on rdf:type in the RDF Schema recommendation). DAML+OIL
allows only instances to be members of classes. Classes can be subclasses
of other classes, but never members. I.e. DAML+OIL does not contain meta-classes.
To be sure, as part of a DAML+OIL document, one can write a statement making
one class an instance of another class, but such a statement is not assigned
any DAML+OIL semantics.
Although meta-classes are therefore not available to users of
DAML+OIL (in the sense described above, namely that DAML+OIL provides no
semantics for it), meta-classes in RDF Schema have been used by the designers
of DAML+OIL to define DAML+OIL in terms of RDF Schema. For example, the
RDF Schema definition of DAML+OIL starts with the definition of two meta-classes,
namely the class of object-classes and the class of all datatype classes.
Summarising:
-
meta-classes are classes with other classes as members
-
RDF Schema allows such meta-classes
-
such meta-classes are not given any semantics under DAML+OIL
-
the RDF Schema definition of DAML+OIL does exploit such meta-classes.
Comment: RDF Core has not yet discussed RDFS issues to any
significant extent. Work on the RDF model theoretic semantics
may help clarify some aspects of this issue. The issue also seems
related to clarifying differences/similarities between RDF's abstract concepts
and those of object-oriented type systems.
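The notion of a meta-class used above (a class that has other classes as members) can be made concrete with a small Python sketch. It assumes, purely for illustration, that a resource counts as a class when it appears as the object of rdf:type or at either end of rdfs:subClassOf; the names `Fido`, `Dog`, `Animal` and `Species` are hypothetical, and prefixed names stand in for full URIs.

```python
def classes(graph):
    """Resources used as classes: objects of rdf:type, or either end of
    an rdfs:subClassOf triple."""
    found = set()
    for s, p, o in graph:
        if p == "rdf:type":
            found.add(o)
        elif p == "rdfs:subClassOf":
            found.update((s, o))
    return found


def meta_classes(graph):
    """Classes that have at least one class among their members."""
    cls = classes(graph)
    return {o for s, p, o in graph if p == "rdf:type" and s in cls}


graph = {
    ("Fido", "rdf:type", "Dog"),
    ("Dog", "rdfs:subClassOf", "Animal"),
    ("Dog", "rdf:type", "Species"),  # a class that is a member of a class
}
found = meta_classes(graph)
```

Here `Species` is reported as a meta-class because its member `Dog` is itself used as a class. RDF Schema permits the third triple; per the text above, DAML+OIL simply assigns it no semantics.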
What changes does DAML+OIL require
As indicated in the DAML+OIL
reference document and as summarised in a
message to the www-rdf-logic mailing list, DAML+OIL takes exception to
the intended/inferred semantics of RDF Schema in three places:
-
multiple domains should be read conjunctively
(= intersection)
-
multiple ranges should be allowed, and again read
conjunctively
Comment: There is an issue on the issues list (rdfs-domain-and-range)
specifically identifying the above two questions concerning multiple domains
and ranges. There has not yet been any discussion of this issue,
and it should get accelerated treatment.
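The conjunctive reading requested here can be sketched as follows: if a property has several declared domains (or ranges), every subject (or object) of that property is taken to belong to all of them, that is, to their intersection. The Python below is a minimal sketch under that assumption; the property and class names (`hasOwner`, `Pet`, `Animal`, `Person`, `Adult`) are hypothetical, and prefixed names stand in for full URIs.

```python
def conjunctive_types(graph):
    """Infer class memberships from rdfs:domain and rdfs:range declarations,
    reading multiple declarations on one property conjunctively."""
    constraints = {"rdfs:domain": {}, "rdfs:range": {}}
    for s, p, o in graph:
        if p in constraints:
            constraints[p].setdefault(s, set()).add(o)

    types = {}
    for s, p, o in graph:
        # The subject must belong to every declared domain of p.
        for cls in constraints["rdfs:domain"].get(p, ()):
            types.setdefault(s, set()).add(cls)
        # The object must belong to every declared range of p.
        for cls in constraints["rdfs:range"].get(p, ()):
            types.setdefault(o, set()).add(cls)
    return types


graph = {
    ("hasOwner", "rdfs:domain", "Pet"),
    ("hasOwner", "rdfs:domain", "Animal"),
    ("hasOwner", "rdfs:range", "Person"),
    ("hasOwner", "rdfs:range", "Adult"),
    ("Fido", "hasOwner", "Alice"),
}
types = conjunctive_types(graph)
```

Under this reading `Fido` is inferred to be both a `Pet` and an `Animal`, and `Alice` both a `Person` and an `Adult`. A disjunctive reading of multiple domains would instead license only the weaker conclusion that `Fido` belongs to at least one of the declared classes.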
-
both the subclass- and subproperty-hierarchy should
be allowed to contain cycles, in particular in order to express equality
of classes.
Comment: There is an issue on the issues list (rdfs-no-cycles-in-subClassOf)
specifically identifying this question. There has not yet been any
discussion of this issue, and it should get accelerated treatment.
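To show why cycles are wanted, the sketch below computes the transitive closure of rdfs:subClassOf and reports pairs of classes that are subclasses of each other; under the usual set-inclusion reading of subclassing, such classes have the same members and are therefore equal. This is an illustrative sketch only: the class names are hypothetical, and the closure algorithm is a plain depth-first traversal chosen for brevity.

```python
def superclasses(graph):
    """Map each class to the set of all its direct and indirect superclasses."""
    direct = {}
    for s, p, o in graph:
        if p == "rdfs:subClassOf":
            direct.setdefault(s, set()).add(o)

    closure = {}
    for start in direct:
        seen, stack = set(), [start]
        while stack:
            for parent in direct.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        closure[start] = seen
    return closure


def equivalent_pairs(graph):
    """Pairs (a, b), a < b, where each class is a superclass of the other."""
    up = superclasses(graph)
    return {
        (a, b)
        for a in up
        for b in up[a]
        if a < b and a in up.get(b, ())
    }


graph = {
    ("Human", "rdfs:subClassOf", "Person"),
    ("Person", "rdfs:subClassOf", "Human"),  # closes the cycle
    ("Person", "rdfs:subClassOf", "Agent"),
}
pairs = equivalent_pairs(graph)
```

The cycle between `Human` and `Person` yields the single equivalent pair, while `Agent` remains a proper superclass of both. A schema language that forbids cycles cannot state this equality with subClassOf alone.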
What areas are problematic
-
Human unreadable syntax: RDF(S) XML syntax
is well suited for machine processing, but very unwieldy for human use.
There is a real need for a human-readable syntax: ontologies are being
scribbled on white-boards, sent around in email-messages, used in presentations,
etc. This holds for RDF(S) information as much as it does for DAML+OIL
information.
Comment: This is clearly a good point. While the
graphical notation is intended to support some of this "scribbling", it
is incomplete as a notation (e.g., for expressing RDFS content).
One issue to be addressed in this connection is whether this syntax should
be thought of in terms of one human-readable syntax or multiple
syntaxes (based on the same underlying abstractions). However,
defining human-readable syntax seems clearly secondary in importance to
resolving those issues dealing with the proper semantics of RDF(S)
constructs. Another issue is whether this is within the charter of
RDF Core.
-
Normalised constructions: There is currently
no way to preserve different syntactic forms for the same RDF graph in
many RDF implementations. Although strictly speaking two syntactic forms
of the same RDF graph are equivalent, the syntactic forms nevertheless
still often carry different connotations for human readers and writers.
These subtle differences between different syntactic forms are lost because
it is not possible in RDF to distinguish between different syntactic forms
(since they correspond to the same graph-model, and are therefore equivalent).
Comment: There is currently a related issue (rdf-equivalent-representations)
on the issues list. An apparently-related issue (rdfms-syntax-incomplete)
is listed as "postponed". However, I personally need some additional
guidance as to the intent of this area (e.g., examples of the differences
and connotations that should be preserved).
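One possible instance of "different syntactic forms for the same RDF graph" is the typed-node abbreviation of RDF/XML, in which an element named after a class replaces an rdf:Description element carrying an explicit rdf:type property. The Python sketch below is offered as an assumption about the kind of case meant, not as the example the authors of "Coordination points" had in mind. It uses a deliberately tiny reader that handles only top-level nodes with rdf:about and simple property elements, treats literal and resource objects alike as strings, and uses a hypothetical `http://example.org/zoo#` namespace.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def expand(tag):
    """Turn ElementTree's '{namespace}local' form into a concatenated URI."""
    return tag[1:].replace("}", "", 1)


def triples(rdf_xml):
    """Extract triples from a very small subset of RDF/XML."""
    out = set()
    for node in ET.fromstring(rdf_xml):
        subject = node.get("{" + RDF + "}about")
        if node.tag != "{" + RDF + "}Description":
            # Typed-node abbreviation: the element name supplies rdf:type.
            out.add((subject, RDF + "type", expand(node.tag)))
        for prop in node:
            obj = prop.get("{" + RDF + "}resource") or (prop.text or "").strip()
            out.add((subject, expand(prop.tag), obj))
    return out


long_form = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/zoo#">
  <rdf:Description rdf:about="http://example.org/zoo#Fido">
    <rdf:type rdf:resource="http://example.org/zoo#Dog"/>
    <ex:name>Fido</ex:name>
  </rdf:Description>
</rdf:RDF>
"""

short_form = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/zoo#">
  <ex:Dog rdf:about="http://example.org/zoo#Fido">
    <ex:name>Fido</ex:name>
  </ex:Dog>
</rdf:RDF>
"""

same_graph = triples(long_form) == triples(short_form)
```

Both documents reduce to the same two triples, so a tool that stores only the graph cannot tell which form the author wrote; any connotation carried by the choice of form is lost at that point.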
-
Datatypes: The March
2001 release of DAML+OIL included datatypes in the language, using
the basic datatypes from XML Schema as its foundation. It would be more
appropriate if RDF Schema already provided such datatypes (possibly in
a similar way as is now done in DAML+OIL).
Comment: Inclusion of primitive datatypes is anticipated
in the current RDFS specification, and there are related issues on
the issues list (rdfs-xml-schema-datatypes, rdfms-literals-as-resources,
rdfms-literalsubjects). The issue rdfms-literals-as-resources is
significant because DAML+OIL explicitly separates basic data types from
DAML+OIL objects. Hence, this raises the issue of whether the
DAML+OIL motivation for this separation should also apply to RDFS (and
if it doesn't, does merging resources and literals create difficulties
for DAML+OIL?).
-
Scoping: Some of the DAML+OIL syntax has
become unwieldy because RDF does not provide a "scoping" or "bracketing"
mechanism. The syntax for property restrictions in DAML+OIL is an example
of this.
Comment: A number of issues relating to "scoping" and
"contexts" need to be examined by RDF Core. The particular reference
to DAML+OIL property restrictions seems part of a general problem of trying
to strike an appropriate balance between the property-oriented approach
used by RDF (in which properties have universal scope) and a more object-oriented
approach (in which the scope of a property is restricted to a particular
class). The issue is also related to the multiple domain and range
problem mentioned above.
-
Layering: In general, we feel that the
layering between RDF and RDF Schema is not very natural: both languages
provide very natural constructions (data-triples, class-hierarchy), but
also contain rather sophisticated constructions (e.g. reification) that
we would not expect in such basic languages.
Comment: RDF Core agrees that layering is an issue, and,
as noted already, has plans to develop an alternative layering of language
facilities. Some of this requires more explicit consideration of
RDFS than has currently taken place, however.
-
Syntax of URIs: Although there is a syntax
for general URIs, there does not appear to be a good specification for
the URIs that appear in RDF documents, in particular, how URIs are composed
in the presence of namespaces and RDF "fragments", including whether
namespaces are permissible in URIs that are not XML URIs. There has been
considerable RDF debate on these points.
Comment: RDF Core agrees that URI syntax (and semantics;
see below) needs further examination. A number of issues relating
to URI syntax are already on the issues list (rdfms-fragments, rdfms-uri-substructure,
rdfms-qname-uri-mapping).
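One concrete difficulty in this area can be sketched briefly. If the URI for a qualified name is formed by simply concatenating the namespace URI and the local name (the reading assumed here), then the mapping is not reversible: different namespace/local-name pairs can produce the same URI. The function name `qname_to_uri` and the `example.org` URIs below are hypothetical and serve only to illustrate the point.

```python
def qname_to_uri(namespace, local_name):
    """Form a URI from a qualified name by plain concatenation (assumed rule)."""
    return namespace + local_name


# Two different (namespace, local name) pairs ...
uri_1 = qname_to_uri("http://example.org/vocab#", "name")
uri_2 = qname_to_uri("http://example.org/vocab#na", "me")

# ... and two namespaces that differ only in a trailing delimiter.
uri_3 = qname_to_uri("http://example.org/vocab", "name")

collision = uri_1 == uri_2
distinct = uri_1 != uri_3
```

The first two pairs collide on a single URI, so the original split cannot be recovered from the URI alone, while omitting the `#` from the namespace produces a different URI altogether. Both effects bear on how fragments and namespaces interact in the URIs that appear in RDF documents.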
-
Semantics of URIs: It seems reasonable
to interpret URIs as logical constants. This raises the question of what
the total universe of discourse is for interpreting DAML+OIL. (This is
important as the semantics of a DAML+OIL knowledge base is relative to
a given universe of discourse.)
Does the set of URIs have a non-trivial structure? In particular:
-
Can two syntactically-different URIs point to the same domain element?
-
Can the same URI string name a different resource at different times?
-
Can a URI (sometimes) fail to denote?
-
What is the relation (if any) between the semantic denotation of a URI
as a logical constant and that URI accessing a web-page (if any)?
Comment: RDF Core agrees that the semantics of URIs need a
great deal of attention. There are a number of issues on the issues
list relating to this topic (rdfms-resource-semantics, rdfms-identity-anon-resources,
rdf-equivalent-uris). Work on the RDF model theoretic semantics and
related analysis should help deal with this issue (although the semantics
of URIs are also matters for groups outside the "RDF community").
RDF Core discussion to date has generally agreed with the interpretation
of URIs as logical constants. However, work specifically on URIs
probably needs more focused attention than it has received so far.
$Revision: 1.0$ of $Date: 2001/07/24 13:30:14 $