Re: Why RDF Schema? from Ralph R. Swick on 1998-11-25 (www-rdf-comments@w3.org from October to December 1998)

From: Ralph R. Swick <swick@w3.org>
Date: Tue, 24 Nov 1998 23:10:00 -0500
To: "Mark D. Anderson" <mda@discerning.com>
Cc: <rdf-dev@mailbase.ac.uk>, <www-rdf-comments@w3.org>
Message-Id: <3.0.5.32.19981124231000.02d16ba0@127.0.0.1>

Mark; your skepticism about the multiplicity of projects calling
themselves "xxx Schema" is quite understandable.  We have each
done a poor job of explaining the relationships and explaining
how each one does (or perhaps doesn't) fill in a piece of the
overall picture.  We are overdue for some updating of the material
on the W3C Web site to show how each of these pieces fits together
to complete parts of the whole Web puzzle.

So, I promise to work harder to find time to help produce such
an explanation.  Some of my colleagues have promised to help
with this too.

In the meantime let me give you a quick response so as to let
you know that your question is not being ignored.

XML is all about syntactic structure.  An XML DTD tells you what
XML elements are permitted to appear nested within what other
XML elements and with what XML attributes and what values for
those attributes.  But an XML DTD does not provide any way to
explain -- to the machine -- what the meanings of those elements
and attributes might be, nor does it tell you how you might be
able to combine elements defined in independent document types
within a single document.

DCD, XML-data, and SOX each address the syntax of an XML document,
extending the capabilities provided by DTDs in useful and
important ways.  But the purpose of these remains that of producing
a language to define what it means to be a syntactically valid
XML document.  The W3C XML Schema work is tasked with delivering
a recommendation on the best combination of these systems for
validating document syntax.

The purpose of RDF on the other hand is to make some progress on
interchange of meaning on the Web.  RDF is all about semantic
structure.  The purpose of an RDF Schema is not to give rules
about the syntax of a document but rather to give rules about
whether, for example, it is meaningful to talk about "parents"
of things that are of type "furniture" or "weight" of things
that are of type "Web page".
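
As a rough illustration of the kind of rule described above, here is a
minimal Python sketch of a semantic check. It is not RDF Schema syntax
and does not use the RDF Schema vocabulary; the class names, the
property names, and the helper functions is_a and is_meaningful are all
hypothetical, chosen only to mirror the "parents" and "weight" examples.

```python
# Sketch of a semantic (not syntactic) check.
# All class names, property names, and helpers here are hypothetical.

# Class hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "Furniture": "PhysicalObject",
    "Person": "PhysicalObject",
    "WebPage": "Document",
}

# For each property, the class of resources it may meaningfully describe.
DOMAIN = {
    "parent": "Person",
    "weight": "PhysicalObject",
}


def is_a(cls, ancestor):
    """Return True if cls is ancestor or a (transitive) subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def is_meaningful(resource_type, prop):
    """Return True if prop may be applied to resources of resource_type."""
    return prop in DOMAIN and is_a(resource_type, DOMAIN[prop])


print(is_meaningful("Furniture", "weight"))  # furniture can have a weight
print(is_meaningful("Furniture", "parent"))  # furniture has no parents
print(is_meaningful("WebPage", "weight"))    # nor does a Web page have a weight
```

Note that a description failing this check could still be perfectly
well-formed XML; the rule says nothing about element nesting.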

I will be the first to admit that the sorts of meaning that can
be reasonably conveyed in a machine understandable way by this
first version of RDF Schema are quite limited.  Much of the focus
has in fact gone into exchanging machine understandable
specifications of information intended for presentation to humans
when authoring, or reading, an RDF description; e.g. the localized
name for a property transmitted on the Web as the token "weight".

An explicit goal of RDF is to allow independent communities to
define descriptive vocabularies (that is, what RDF calls properties)
and to allow a metadata author to combine these vocabularies into
a single instance of a resource description.  Even if you believe
that this can be addressed as a problem of syntax, this kind of
combinatorics is impossible to do with today's DTD technology.
And it may be a daunting task to find a general solution within
the framework of a replacement DTD technology; time will tell.
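
To make the idea of combining vocabularies concrete, here is a small
Python sketch, under the assumption that each property is identified by
a full URI. The namespace URIs, the property names, and the helpers
describe and values are hypothetical; this shows only the shape of the
model, not the RDF/XML syntax.

```python
# Sketch: one resource description mixing two independent vocabularies.
# The namespace URIs and property names below are hypothetical.
LIBRARY = "http://example.org/library#"
SHIPPING = "http://example.org/shipping#"


def describe(resource, *statements):
    """Build a description as a set of (resource, property, value) triples."""
    return {(resource, prop, value) for prop, value in statements}


def values(description, prop):
    """Return the sorted values recorded for one fully qualified property."""
    return sorted(v for (_, p, v) in description if p == prop)


description = describe(
    "http://example.org/book/42",
    (LIBRARY + "title", "An Example Book"),
    (SHIPPING + "weight", "350"),
    # Same local name as above, but defined by a different community.
    (LIBRARY + "weight", "high"),
)

print(values(description, SHIPPING + "weight"))
print(values(description, LIBRARY + "weight"))
```

Because each property carries its defining vocabulary with it, neither
community needed to know about the other in advance, and the two
"weight" properties do not collide.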

Semantic models have little value to the Web if there is no
specification of at least one interoperable way to exchange
these models.  This -- specifying a syntax for encoding
properties and their values for interchange purposes -- is
the function of the second half of the RDF Model and Syntax
specification.  In the future there are likely to be other
syntaxes, including other XML syntaxes, that express
RDF models in semantically equivalent ways.  In point of
fact, if it were possible to define the RDF/XML syntax given
in the RDF Model and Syntax specification with a DTD we might
well have done so.  But a DTD works only for a specific
(combination of) metadata vocabularies (properties); a DTD
does not work in the general case that RDF is addressing.

So there is a very clear separation of function between XML Schemas
(syntactic validation) and RDF Schemas (semantic validation).

One area of potential overlap between RDF Schema and XML Schema does
exist: that of data typing.  In fact, when we started the RDF Schema
work we recognized that a necessary part of semantic validation was
going to be type checking.  At the time there was no proposal on the
table to add data typing to XML itself, so we began to define an
extensible data typing system within RDF Schema.  Everyone soon
realized that RDF was not unique in needing such data typing and
furthermore that modularity of various applications of XML would
be enhanced by adding this facility to XML itself rather than
defining it independently in each layer above XML.  So data typing
is now part of the task of the new XML Schema deliverable.  We have
tabled further work on data typing within RDF until after these
facilities are defined at the XML level in order to minimize
divergence of models.  Thus RDF Schema contains only the barest
minimum that the Working Group feels is essential to fill an
immediate need for semantic data typing.  We have applied
some architectural judgement to the choice of features in order
to maximize the probability that more extensive semantic data
typing can be added in the future in a way that leverages the
new facilities that we expect from the XML layer.

As you say, the task of picking the highest priority features
to support a broad range of metadata applications without
making the scope of the work so large that it becomes a
several-year endeavor is a daunting one.  The purpose of public
review of W3C documents is to solicit comments on the specific
choice of features proposed.  Broad statements such as "there
is no group for which this is useful" are easy to disprove.
Specific statements such as "the addition of the following
feature, or the following change to a proposed feature, would
make the work applicable to the following problem that you
may not have considered" are more helpful.  Identification
of potentially harmful interactions between proposed features
in one specification and proposed or envisioned features in
another area is also helpful.

I thank you for taking the time to engage in this discussion
and for (again) prodding us to further clarify the architectural
partitioning.  May we all be better informed as a result.

-Ralph Swick
 W3C/MIT
Received on Tuesday, 24 November 1998 23:10:41 UTC