Re: what RDF is not (was ...)

Here's my little rant on what RDF is.   Not directed at anyone in
particular. 

RDF is a language for transmitting pieces of collaborative databases.
It started as a way to categorize web pages, but since the subject
matter of the web is arbitrary, RDF ended up as a way to express
arbitrary information, just as one might store in a relational DBMS.
The pieces of RDF are pieces of a web-wide database of information,
not just about web pages but about anything.

While SQL is a database manipulation and query language, RDF is just a
data format, equivalent to the tables that result from a SQL query or
to an on-disk database file format.  (RDF still needs a SQL-equivalent
language.)  RDF's database model is different from SQL's in being
"webized" to support distributed collaboration: tables/columns and
datatypes are named in a global namespace (URIs) so they can be
automatically linked.
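
To make the "webized" naming concrete, here is a minimal sketch in
Python (not any real RDF library).  It takes one SQL-style row and
rewrites it as subject/predicate/object triples whose names are URIs.
The example.org namespace, the cd table, and the row_to_triples helper
are all made up for illustration.

```python
# A SQL-style row: the column names only mean something inside this one table.
row = {"id": 17, "title": "Kind of Blue", "artist": "Miles Davis"}

# Hypothetical namespace; any URI prefix you control would do.
EX = "http://example.org/cd-catalog#"

def row_to_triples(table, row, key="id"):
    """Turn one row into (subject, predicate, object) triples with URI names."""
    # The primary key becomes part of the subject URI.
    subject = f"{EX}{table}/{row[key]}"
    # Every other column becomes a globally named predicate.
    return [(subject, f"{EX}{column}", value)
            for column, value in row.items() if column != key]

triples = row_to_triples("cd", row)
for triple in triples:
    print(triple)
```

Because the subject and the "column" names are now global, two
fragments produced by different people can be merged by simply
concatenating their triple lists, provided they agree on the URIs.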

There is a temptation to think a mass of RDF fragments can store all
of human knowledge.  The truth is that RDF is only marginally better
than a typical SQL database for storing "knowledge".  It works well
for a catalog of the CDs you own, or the products you sell, or the
configurations of software installed on your computers, but the only
thing it does for "knowledge representation" and "machine reasoning"
is provide a standard underlying format.

(If RDF sounds a lot like XML, well, it is.  The difference is that an
XML database fragment is less self-describing than an RDF one.
Whether this difference is critical is a subject of debate.  Whether
either of them is better than a comma-separated-values file is also
subject to debate.  The basic question is whether self-description is
important.)

So how do you encode some knowledge like "All men are mortal" or "Only
3 Sale-Items Per Customer" in RDF?  The same way you do in SQL: you
don't.  You need another mechanism - some logic somewhere else in the
system.  It may, however, be a standard logic, driven by information
also in the database.  That is, the database can hold software written
in some programming or constraint language (Perl, Python, i386 machine
code, first-order predicate logic, DAML, RDFS, etc), and there can be
conventions about how to apply that knowledge to other knowledge in the
database (e.g. for database validation or inference).
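
As a rough sketch of that split, assuming a plain Python list of
triples stands in for the database: the facts and the limit of 3 are
both just data, and the logic that applies the limit lives in ordinary
code outside the data.  The example.org namespace and the maxPerSubject
property are an invented convention here, not part of RDF or RDFS.

```python
from collections import Counter

EX = "http://example.org/shop#"  # hypothetical namespace

facts = [
    # Plain facts: who bought which sale item.
    (EX + "alice", EX + "boughtSaleItem", EX + "item1"),
    (EX + "alice", EX + "boughtSaleItem", EX + "item2"),
    (EX + "bob", EX + "boughtSaleItem", EX + "item1"),
    (EX + "bob", EX + "boughtSaleItem", EX + "item2"),
    (EX + "bob", EX + "boughtSaleItem", EX + "item3"),
    (EX + "bob", EX + "boughtSaleItem", EX + "item4"),
    # The rule's parameter is also just data in the same store.
    (EX + "boughtSaleItem", EX + "maxPerSubject", 3),
]

def violations(triples):
    """Return subjects that exceed any maxPerSubject limit found in the data."""
    # Collect the limits: predicate URI -> maximum count per subject.
    limits = {s: o for s, p, o in triples if p == EX + "maxPerSubject"}
    # Count how often each subject uses each limited predicate.
    counts = Counter((s, p) for s, p, o in triples if p in limits)
    return sorted(s for (s, p), n in counts.items() if n > limits[p])

print(violations(facts))
```

The triple store itself never rejects bob's fourth purchase; only the
violations function, which agrees by convention on what maxPerSubject
means, gives the number 3 any force.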

Putting other-language elements into a database like this is a common
design style for complex database applications.  Additionally,
database systems which do validation or inference (as many of them now
do) often make available data views of the logic-language expressions.
It's a fairly obvious technique.  RDF may cloud the issue by
encouraging a different encoding style, where you encode each
logic-language token (instead of each whole ASCII expression) in
a separate RDF object.  This somehow makes the language look more like
it's "in" RDF or extending RDF; the truth is, for RDF, it's only data
in the database.
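
The two encoding styles can be sketched side by side.  This is a
minimal Python illustration using bare tuples rather than a real RDF
toolkit; the example.org namespace, the expression/operator/argN
property names, and the textual rule syntax are all assumptions made
up for the example.

```python
import itertools

EX = "http://example.org/logic#"  # hypothetical namespace
_ids = itertools.count(1)         # source of fresh node labels

def as_whole_expression(rule_id, text):
    """Style 1: the whole rule is one opaque string literal."""
    return [(EX + rule_id, EX + "expression", text)]

def as_tokens(tree):
    """Style 2: one node per token; returns (node, triples)."""
    if isinstance(tree, str):
        return tree, []            # a leaf is stored as a plain literal
    op, *args = tree
    node = f"_:n{next(_ids)}"      # blank-node-style label
    triples = [(node, EX + "operator", op)]
    for i, arg in enumerate(args, start=1):
        child, sub = as_tokens(arg)
        triples.append((node, f"{EX}arg{i}", child))
        triples.extend(sub)
    return node, triples

# "All men are mortal", once as a string and once as a token tree.
whole = as_whole_expression("rule1", "forall x: man(x) -> mortal(x)")
root, tokens = as_tokens(
    ("forall", "x", ("implies", ("man", "x"), ("mortal", "x"))))

print(len(whole), "triple in style 1;", len(tokens), "triples in style 2")
```

Either way the store just holds triples.  Something outside it still
has to know that "forall" and "implies" mean anything at all.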

     -- sandro                 http://www.w3.org/People/Sandro/

Received on Wednesday, 2 January 2002 12:21:38 UTC