Re: Owl Rules and RDF Semantics from Drew McDermott on 2003-11-06 (www-rdf-interest@w3.org from November 2003)

From: Drew McDermott <drew.mcdermott@yale.edu>
Date: Thu, 6 Nov 2003 17:18:02 -0500 (EST)
To: www-rdf-interest@w3.org, www-rdf-logic@w3.org
Message-Id: <200311062218.hA6MI2H13220@pantheon-po03.its.yale.edu>
   [Adrian Walker]
   I have been reading with great interest Ian Horrocks' excellent
   proposal for OWL Rules,
   http://www.cs.man.ac.uk/~horrocks/DAML/Rules/ , in which he gives
   a model theory for the meanings of collections of rules.
   Ian writes:
      "Rules have variables, so treating them as a semantic extension of
      RDF is very difficult. It is, however, still possible to provide an
      RDF syntax for rules---it is just that the semantics of the
      resultant RDF graphs will not be an extension of the RDF Semantics."
   To a mere outside observer of a major project like OWL/RDF, this
   looks kind of strange.
   It brings to mind some questions:
   Why have RDF do any inferencing at all?
   Should RDF in future be restricted to just a passive data
   representation?
   Is it just a distraction to try to add the limited inferencing
   currently posited for RDF itself, since it looks to be incompatible
   with useful inference-based Semantic Web applications?
   Or, have I missed something here ?   Thanks in advance for your
   comments.

Where to begin answering?

RDF is so inexpressive as to be useless.

Therefore, people have sought the magic phraseology to pretend to
remain in the RDF world while sneaking out of it.  OWL Rules is the
latest effort, and the most likely to succeed, partly because Ian and
Peter are very smart, and partly because the pent-up demand just keeps
increasing.

The sneaky part arises in the contrast between the bit you cite and
the (true) claim made elsewhere in the same report that OWL Rules has
a formal semantics (colloquially known as its "model theory") that is
a natural and consistent extension of the formal semantics of RDF and
OWL.  The claim is true if you interpret an atomic formula of an OWL
Rule as though it were an "atomic formula" of OWL -- i.e., a subClass
assertion, a class-membership assertion, or a triple (and maybe a few
other things).  But the "leaves" of an OWL Rule are not in fact any of
these things, just encodings, roughly of the form:

   <I-am-an-OWL-subclass-assertion>
       <This-is-the-subclass rdf:resource="R1"/>
       <This-is-the-superclass rdf:resource="R2"/>
   </I-am-an-OWL-subclass-assertion>

You then decree that this has the same meaning as

   <owl:Class rdf:about="R1">
      <rdfs:subClassOf rdf:resource="R2"/>
   </owl:Class>

You do the same thing with the other atomic-formula-like entities in
your language.  Now the semantics of the original language and the
semantics of the rule language merge seamlessly together.

But of course without the decree the "I-am-an-OWL-subclass" version
doesn't have that meaning at all.  It either has no particular meaning
(it can vary freely between models), or it can be interpreted as
_describing_ the real "owl:Class" version.  Hence the rules have RDF
syntax, and can perfectly well be interpreted using the RDF formal
semantics, but the results are useless -- harmlessly useless, but
useless. 
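
The point can be sketched in a few lines of Python.  This is only an
illustration, not anything taken from the OWL Rules proposal: a graph
is a set of (subject, predicate, object) tuples, the ex: names and the
decree() function are hypothetical, and subclass_closure() implements
just the transitivity of rdfs:subClassOf, a tiny fragment of RDFS
entailment.

```python
# Triples are plain (subject, predicate, object) tuples.
# All ex: names below are made up for this illustration.
SUBCLASS = "rdfs:subClassOf"

def subclass_closure(graph):
    """Transitive closure of rdfs:subClassOf only (a fragment of RDFS)."""
    closed = set(graph)
    changed = True
    while changed:
        changed = False
        pairs = [(s, o) for (s, p, o) in closed if p == SUBCLASS]
        for (a, b) in pairs:
            for (c, d) in pairs:
                if b == c and (a, SUBCLASS, d) not in closed:
                    closed.add((a, SUBCLASS, d))
                    changed = True
    return closed

def decree(graph):
    """The 'decree': read each encoded assertion as the assertion it describes."""
    decoded = set(graph)
    for (node, p, o) in graph:
        if p == "rdf:type" and o == "ex:SubclassAssertion":
            sub = next(o2 for (s2, p2, o2) in graph
                       if s2 == node and p2 == "ex:subclass")
            sup = next(o2 for (s2, p2, o2) in graph
                       if s2 == node and p2 == "ex:superclass")
            decoded.add((sub, SUBCLASS, sup))
    return decoded

# The "I-am-an-OWL-subclass-assertion" encoding, plus one real assertion.
encoded = {
    ("_:a1", "rdf:type", "ex:SubclassAssertion"),
    ("_:a1", "ex:subclass", "R1"),
    ("_:a1", "ex:superclass", "R2"),
    ("R2", SUBCLASS, "R3"),
}

without_decree = subclass_closure(encoded)
with_decree = subclass_closure(decree(encoded))
print(("R1", SUBCLASS, "R3") in without_decree)  # False
print(("R1", SUBCLASS, "R3") in with_decree)     # True
```

Without the decree, the encoded triples are just facts about the node
_:a1, and the subclass reasoning never sees them.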

It seems to me that the right way to describe this situation is to say
that RDF can be used to describe the syntax of an arbitrary language.
That language then has a reasonable semantics if and only if it had a
reasonable semantics already; the fact that it is encoded in RDF is
irrelevant.  The reason I thought this was the right way is that I
thought it was a good idea to come up with conventions for encoding
arbitrary languages in RDF.  I thought that was a good idea because I
thought the RDF engineers had no business telling people what
inferences their software agents ought to be able to carry out.

It turned out that we (Dejing Dou and I, in our 2002 ISWC paper, and
several other people whose ideas went into that paper) failed to
convince very many in the SW community, because we were using the
wrong collection of magic words.  We said we were reifying atomic
formulas, which, as anyone can see, is exactly what OWL Rules does.
We thought that it was good sportsmanship on our part to acknowledge
that reification is a good idea for exactly the purpose of hiding
atomic formulas from the voracious triple-eating semantics of RDF.
But "reification" was a word with negative connotations; the right
magic word was "layering," which is still going strong.  Other people
(notably M. Sintek, S. Decker: TRIPLE---An RDF Query, Inference, and
Transformation Language, DDLP'2001) had proposed essentially the same
idea, but called it "layering."  Because of comments from previous
referees who had shot our paper down, we had to explicitly explain why
our idea was good even though it didn't involve layering.  We didn't
realize that layering has no meaning -- unless it means you can produce
a variant of Tim Berners-Lee's classic slide with your notation perched
above something people already believe to be "layered," such as RDF.

I don't mean to sound bitter -- I really don't.  I don't think that our
notation is superior to OWL Rules, or that everyone should cite Sintek
and Decker, or our paper (Drew McDermott and Dejing Dou 2002
Representing Disjunction and Quantifiers in RDF.  Proc. Int'l Semantic
Web Conference.  Available at http://www.cs.yale.edu/~dvm).  I'm not
even expecting to be named a W3C Fellow.  What I would like to see is
a more consistent and open-ended approach to formal languages on the
semantic web.  In particular:

> The mechanism for encoding languages in RDF should be independent of
  any particular language.  There are two basic requirements for
  encoding a language in RDF: Make sure the triples in the encoded
  version don't actually say anything in the domain being encoded; and
  figure out a way to handle bound (or free) variables.  There are
  several proposals on the table (I should mention the RDF encoding of
  RuleML).  It would clarify things if W3C would endorse a standard.
  It would be easier to compare languages without being distracted by
  the details of encodings.

> It should be acknowledged that encoding a language is an exercise in
  _syntax_.  Somehow the idea of the "semantic" web has got people to
  shun syntax as something obsolete or unclean.  They are willing to
  produce a formal syntax for their language, but only as a dirty
  chore to be cleared away before getting to the semantics.  That
  means we see paper after paper that encodes language after language
  in RDF in slightly different ways, where the titles of the papers
  emphasize the difference in semantics or inferential power between
  their language and someone else's.  The poor reader gets to see lots
  of angle brackets and some mumbling about the RDF model theory.
  These papers would be a lot clearer if they used a concise surface
  syntax, and then mentioned the W3C standard for embedding languages
  in RDF, with a few words about how their syntax can be encoded using
  the standard.  If this were done, the existence of a model
  theory for RDF would be in most cases automatically irrelevant,
  because the standard would have already established that the
  universe of formulas has a formal description in RDF.

> The syntactic standard should be "open-minded," in the sense that it
  avoids putting constraints on the embedded language.  It is really
  irritating when the designers of rule systems for RDF/OWL try to
  keep their systems decidable, or even warn people that they are
  undecidable.  It's like the phone company trying to make sure that
  it's impossible to have a conversation about an unsolvable problem.
  The job of a syntactic standard is to make it easy for people to
  encode what they want to encode.  People working on SAT solvers know
  that the problem is NP-complete; why should notation designers for
  the SW be peering over their shoulders and clucking at them?
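
The two requirements in the first point above (the encoding asserts
nothing in the domain, and variables are handled somehow) can be
sketched as follows.  This is a hedged illustration in Python, not any
of the proposals on the table: the ex:Atom, ex:Variable, ex:predicate,
ex:name and ex:argN vocabulary is hypothetical, and so are the
function names.

```python
import itertools

_ids = itertools.count(1)

def fresh():
    """Return a new blank-node label."""
    return f"_:b{next(_ids)}"

def encode_atom(pred, args, var_nodes):
    """Describe the atom pred(args...) in triples without asserting it."""
    atom = fresh()
    triples = [(atom, "rdf:type", "ex:Atom"), (atom, "ex:predicate", pred)]
    for i, arg in enumerate(args):
        if arg.startswith("?"):
            # A variable becomes a node that is described, never bound.
            if arg not in var_nodes:
                var_nodes[arg] = fresh()
                triples.append((var_nodes[arg], "rdf:type", "ex:Variable"))
                triples.append((var_nodes[arg], "ex:name", arg[1:]))
            term = var_nodes[arg]
        else:
            term = arg
        triples.append((atom, f"ex:arg{i}", term))
    return atom, triples

def decode_atom(atom, triples):
    """Recover (pred, args) from the description produced by encode_atom."""
    lookup = {(s, p): o for (s, p, o) in triples}
    names = {s: o for (s, p, o) in triples if p == "ex:name"}
    args, i = [], 0
    while (atom, f"ex:arg{i}") in lookup:
        term = lookup[(atom, f"ex:arg{i}")]
        args.append("?" + names[term] if term in names else term)
        i += 1
    return lookup[(atom, "ex:predicate")], args

atom, triples = encode_atom("ex:parent", ["?x", "ex:Mary"], {})
# Requirement 1: ex:parent appears only as an object, never as a predicate.
domain_free = all(p != "ex:parent" for (_s, p, _o) in triples)
print(domain_free)                  # True
print(decode_atom(atom, triples))   # ('ex:parent', ['?x', 'ex:Mary'])
```

Nothing here depends on what ex:parent means or on what the embedded
language can express; the triples only record the shape of the
formula.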

-- 
                                             -- Drew McDermott
                                                Yale University CS Dept.
Received on Thursday, 6 November 2003 17:18:04 UTC