Re: RDF semantics: applications, formalism and education from Drew McDermott on 2001-04-09 (www-rdf-logic@w3.org from April 2001)

From: Drew McDermott <drew.mcdermott@yale.edu>
Date: Mon, 9 Apr 2001 11:14:27 -0400 (EDT)
To: www-rdf-logic@w3.org
Message-Id: <200104091514.LAA03545@pantheon-po01.its.yale.edu>
   [Danny Ayers]
   > The alternatives
   > then are 1. to kill RDF and/or 2. create an altogether new 
   > framework and/or 3. extend RDF to make it do what we want.

   [Peter Crowther]
   I'd add (4) Amend RDFS to make it do what we want.

   > 3. would seem to be the least bad/most likely option - add 
   > functionality in
   > the form of external schemas, if need be gaffer (duct) tape in
   > characteristics from different domains (calculi/algebras 
   > whatever - any volunteers for the Chinese Room?)

   These can only be added if the core is sufficiently expressive and flexible
   to allow them to be added.  Pat, Peter and others seem to be pointing out
   that:  ....

Let me put it this way.  RDF in its original form is a very simple
language, which does not contain negation, disjunction,
quantification, modality, etc.  It is good that there is a simple core
language, because all those features come at a fairly high price.  The
more complex the language, the more difficult it is to manipulate it.
(The difficulty increases very rapidly.)  The question is, How do we
get the more complex features when we need them?  There are two basic
answers:

1. View the original RDF, call it RDF_0, as the minimal subset of a
more complex language.  Set up the subsets in some rational way, so
that it's fairly easy for users to know which subset they need.  Adopt
a system of labels so that a given RDF description can be clearly
flagged as requiring subset I.

2. Introduce quotation into the language, and use RDF_0 purely as a
vehicle to describe more complex languages.

Perhaps I'm biased, but option 1 seems so obviously to be preferable
that no argument is required.  In fact, I know of no argument in favor
of option 2, only an argument against option 1, which goes as follows:

i. Every RDF expression is equivalent to a conjunction of statements,
that is, "triples," of the form p(a,b), where p is a relation and a
and b are atomic names (URIs, e.g.); furthermore, if e2 occurs as a
subexpression of e1, the triples of e2 are a subset of the triples of
e1. 

ii. Therefore, if you assert an RDF expression you must assert every
element of this conjunction, and hence assert all of its
subexpressions. 

iii. Therefore, it is impossible to add negation or disjunction to the
language, because p is a subexpression of (not p), so asserting (not
p) requires asserting p.

iv. Therefore, there are no nontrivial supersets of RDF_0.

In my opinion, this should be viewed as a reductio ad absurdum of
premise (i).  However, instead it is mostly viewed by the RDF
community as a knockdown argument against extending RDF, and
therefore, by default I suppose, in favor of using quotation to
describe some super-RDF in terms of RDF_0.
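
Steps i through iii of the argument can be made concrete with a small
sketch. It is written in Python, and everything in it is an assumption
made for illustration: the tuple representation, the CONNECTIVES set,
and the triples function are hypothetical and are not part of RDF. The
sketch only shows what premise (i) implies if it is taken literally.

```python
# Hypothetical model: a triple is (p, a, b); a compound expression is
# ("not", e) or ("and", e1, e2, ...).  None of this is real RDF syntax.
CONNECTIVES = {"not", "and"}

def triples(expr):
    """Collect the triples of expr under premise (i): the triples of
    every subexpression are included in the triples of the whole."""
    if expr[0] in CONNECTIVES:
        collected = set()
        for sub in expr[1:]:
            collected |= triples(sub)
        return collected
    return {expr}

p = ("wrote", "Bacon", "Hamlet")
not_p = ("not", p)

# If asserting an expression means asserting all of its triples,
# then asserting (not p) asserts p as well.
asserted = triples(not_p)
print(p in asserted)
```

Under this reading the connective contributes nothing: (not p) and p
flatten to the same set of triples, which is the absurdity the
argument turns on.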

But Ayers and Crowther, and others, are beginning to grant that
"extending" and "amending" RDF might be a good idea.  In that case, we
need to amend the way triples figure into the language.

I think the original motivation for the triples model was to allow
one to follow pointers all over the web without worrying what they
were pointing into.  That is, if a document points to a triple in
another document, I could go there and figure out what that one triple
asserts without having to figure out what its context is.  (Its
context might not even be well defined if different people made use of
it in different ways from different places.)

Perhaps others could shed some light on whether this is indeed the
underlying motivation for the triples model.  However, I think one can
have pointers all over the place without the triples model.  What we
should have instead is a well defined notion of "expression."  

I'm visualizing something like this:

I set up a web page where I stoutly deny that Francis Bacon wrote
"Hamlet."

(not <a id="foo">(wrote Bacon Hamlet)</a>)

[usual disclaimer about how ridiculously oversimplified this is]

I've thoughtfully provided an anchor point "foo" that others can use
to point to this assertion.

[another disclaimer about how improbable this is as actual XML/RDF]

Now someone else disagrees with me, perhaps by listing all the plays
Francis Bacon wrote:

(and (wrote Bacon Coriolanus)
     <a ref="foo"/>
     (wrote Bacon HeddaGabler))

where he has pointed to the claim about Hamlet rather than copying it.

I don't see any problem with this, provided we have a clear definition
of what counts as a subexpression of formula foo.  In the present
simple-minded example, it has no nontrivial subexpressions, but in
more realistic examples it would be some large entity with pointers
off to other places.  That's okay, provided we can in principle
follow all those pointers, retrieve the subexpressions they refer to,
and build a coherent expression that is what the person must be
claiming if he includes a pointer to foo in one of his formulas.
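
Following the pointers and building the coherent expression might be
sketched as below. This is a minimal Python illustration under stated
assumptions: expressions are nested tuples, the Ref class stands in
for <a ref="..."/>, and the anchors dictionary stands in for whatever
mechanism would actually fetch an anchored subexpression from another
document. All of these names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a pointer such as <a ref="foo"/>."""
    name: str

def resolve(expr, anchors):
    """Replace every Ref in expr by the expression it points to."""
    if isinstance(expr, Ref):
        # The target may itself contain pointers, so resolve it too.
        return resolve(anchors[expr.name], anchors)
    if isinstance(expr, tuple):
        return tuple(resolve(part, anchors) for part in expr)
    return expr  # an atomic name

# The anchor "foo" labels the subexpression inside the negation.
anchors = {"foo": ("wrote", "Bacon", "Hamlet")}

claim = ("and",
         ("wrote", "Bacon", "Coriolanus"),
         Ref("foo"),
         ("wrote", "Bacon", "HeddaGabler"))

resolved = resolve(claim, anchors)
print(resolved)
```

The resolved claim contains (wrote Bacon Hamlet) but not the
surrounding (not ...), because the pointer targets the anchored
subexpression only. This version assumes the pointers never loop.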

Of course we can run into problems, where someone writes something like

  <a id="ying">(not <a ref="yang"/>)</a>
  <a id="yang">(not <a ref="ying"/>)</a>
   
If we try to resolve this into a coherent expression, we get 

   (not (not (not (not .....))))

But the fact that some apparent pointers can't be resolved
doesn't kill the idea of pointing to subexpressions.  We just
stipulate that there are no cycles in the pointer graph.  
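
One way to enforce the no-cycles stipulation is to reject a pointer
that leads back to an anchor still being resolved. The Python sketch
below is only an illustration: the Ref class, the CyclicReference
exception, and the anchors dictionary are hypothetical names, and
checking at resolution time is one design choice among several (the
pointer graph could also be checked when a document is published).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Ref:
    """Hypothetical stand-in for a pointer such as <a ref="ying"/>."""
    name: str

class CyclicReference(Exception):
    """Raised when a pointer leads back to an anchor being resolved."""

def resolve(expr, anchors, in_progress=()):
    """Resolve pointers in expr, refusing cycles in the pointer graph."""
    if isinstance(expr, Ref):
        if expr.name in in_progress:
            path = " -> ".join(in_progress + (expr.name,))
            raise CyclicReference(path)
        return resolve(anchors[expr.name], anchors,
                       in_progress + (expr.name,))
    if isinstance(expr, tuple):
        return tuple(resolve(part, anchors, in_progress) for part in expr)
    return expr  # an atomic name

anchors = {
    "ying": ("not", Ref("yang")),
    "yang": ("not", Ref("ying")),
}

try:
    resolve(Ref("ying"), anchors)
    cyclic = False
except CyclicReference as err:
    cyclic = True
    print("cycle:", err)
```

Acyclic pointers resolve as before; only the ying/yang kind of loop
is rejected, instead of unrolling into an endless chain of negations.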

Note that we can still have triples if we want; we just do away with
the idea that an expression is *equivalent* to the triples from all
its subexpressions.

                                             -- Drew McDermott
Received on Monday, 9 April 2001 11:14:47 UTC