Reification from Drew McDermott on 2001-04-04 (www-rdf-logic@w3.org from April 2001)

From: Drew McDermott <drew.mcdermott@yale.edu>
Date: Wed, 4 Apr 2001 12:05:43 -0400 (EDT)
To: www-rdf-logic@w3.org
Message-Id: <200104041605.MAA04471@pantheon-po01.its.yale.edu>
At the risk of repeating myself and others, let me expand on why
reification is a bad idea, at least for most of the purposes it
has been suggested for.  

One such purpose is expressing implication.  Instead of thinking of
implication as relating two propositions, RDF thinks of it as relating
two expressions, which *denote* two propositions.  Example:

(implies '(sent LLBean refund) '(lost USPS refund))

[I tried to write this in N3, but I am not sure I understand how the
latest version of N3 works, so I settled for a vanilla prefix form.]
The quotation mark before the two subexpressions converts them into
names for them.  

This seems like a minor change from just saying (implies (sent LLBean
refund) (lost USPS refund)), but it isn't.  Let me explain why.

First, it seems unnecessary.  Quotation has historically been used for
formalizing actual references to expressions, as in 

(said Pierre "Sacre bleu!")

or for handling "opaque" contexts such as 

(believes Lyndon '(> (cardinality U.S.-supreme-court) 10))

By "opaque" we mean that many normal operations are blocked when such
a context is encountered.  For instance, since the number of judges on
the U.S. Supreme Court is 9, if we were allowed to substitute equals
for equals in the formula above, we would conclude

(believes Lyndon '(> 9 10))

which should not follow.  The quote saves us because the term
(cardinality U.S.-supreme-court) *does not actually occur* in 
'(> (cardinality U.S.-supreme-court) 10).  At best the *name* of the
term occurs in it.  

Implication is much simpler than all that.  (implies P Q) means simply
that either P is false or Q is true.  It is perfectly "extensional."
Using reification here is a classic example of fixing a bug with a
bigger bug.  For instance (as I've said before), we run into all sorts
of snags with formulas such as (implies (implies A B) C).  Do we
represent this as

(implies '(implies 'A 'B) 'C)

or as (implies '(implies A B) 'C)

The former commits us to thinking of names of names of names ... of
terms; the latter commits us to figuring out what '(implies A B) is
the name of (since it certainly isn't the name of (implies 'A 'B)).  
[Or it commits us to a separate language in which implication no
longer involves quoting, in which we can follow Peter
Patel-Schneider's suggestion and switch to that language, writing
(assert '(implies (implies A B) C) true)
in which case RDF's only remaining role is to tell us that assert
exists.] 

Furthermore, when it comes to substituting equals for equals, we don't
*want* opacity.  That is, the following are true: (implies x T) = T, and 
(implies T x) = x.  So, in (implies (implies A B) C), if we know B is
true, two substitutions should allow us to simplify the formula to
just C.  Are we allowed to do them or not?  Is this something that has
to be specified at the meta level, and if so how?

Note that by using reification to represent implication, we have, so
to speak, "used it up"; it's no longer available to represent genuine
opaque contexts.  So, if someone write

(implies '(believes Leo '(type Sally robot))
         '(dangerous Leo Sally))

are we or are we not allowed to substitute (wife Leo) for Sally in the
case where these two terms are known to be equal?  (And do we have
enough quotes, or too many, in that formula???)

Recently it has been suggested that we can also use reification for
quantification, writing things like

(forall 'x '(implies '(nationality x Brazilian) '(loves Leo x)))

Here I suppose the idea is that 'x denotes a variable.  For every value
of that variable, the formula after is made true by substituting that
value for x.  Oops, x doesn't occur in that formula, only its name.  I
guess I meant

(forall 'x '(implies (nationality x Brazilian) (loves Leo x)))

and the non-RDF sublanguage starts to poke its inexorable head up
again, so that the temptation to switch to

(assert '(forall x (implies (nationality x Brazilian) (loves Leo x))))

gains strength.  I'll continue to resist, however.

There are a lot of problems with representing quantifiers this way.
The main one is that we have to deal with cases where the scopes are
ambiguous.  Consider the claim that "Leo is planning to marry an axe
murderer."  This has two readings:  

Leo hasn't chosen a bride yet, but is attracted by the mystery and
danger of being married to someone who has killed at least one
previous husband with an axe.

Leo is planning to marry Lizzy.  Unbeknownst to him, she killed her
previous husband with an axe.  Oh my God, Leo is planning to marry an
axe murderer!

Let's assume that in (plans a p), the first argument is extensional,
and the second is opaque.  That is, what you plan to do is so
intertwined with what you believe to be the case that you can't
substitute equals for equals in someone's plan.  (As in: "G.W. Bush
plans to improve the quality of life by increasing the CO2 level.")

Now consider

(plans Leo '(exists 'm (and (hobby m axe-murderer) (married Leo m))))

It is plausible that this captures the first reading of the sentence.
For the second, we'd like to say something like

(exists 'm '(and (hobby m axe-murderer) (plans Leo '(married Leo m))))

"There is someone whose hobby is axe murder such that Leo plans to
marry that person."

The problem is that m does not occur in the second argument to
exists, only the name of m.  Furthermore, we can't get rid of the
inner quote, because 'plans' needs an opaque argument there.  Somehow
we have to resort to quasi-quoting, that is, building up the arguments
to exists in such a way that the m stays visible:

(exists 'm '(and (hobby m axe-murderer) (plans Leo `(married Leo ,m))))

The backquote and comma are Lispish.  `e means "quote everything in e
except the stuff prefixed by a comma."  So m *does* occur in this
version of the second argument to exists.  The term `(married Leo ,m)
produces the name of a formula when we substitute the value of m into
it.  Hmm.  What does it mean to substitute a value into a partial
name?  I can see what it means to substitute a *name* into a partial
name.  That is, if I substitute m="elizabeth-q-borden" into 
`(married Leo ,m), I get the name of the proposition (married Leo
elizabeth-q-borden).  Oh, rats, we've had this problem as soon as we
introduced quantifiers.  Do we have to assume that every object has a
name?  A unique name?  Is there any way to say, "Leo is planning to
marry *this woman*" as opposed to "Leo is planning to marry whoever is
named elizabeth-q-borden"?  

In my opinion, this is an extremely messy way to approach what is
basically a simple problem.  At any point in this downward spiral we
can jump ship and switch to a non-RDF language.  Indeed, the only
reason to stick with an RDF language inside the quotes is to fool
ourselves into thinking we haven't left RDF; it's RDF outside the
quotes and RDF inside.  But the stuff inside the quotes requires all
sorts of machinery that we didn't need outside them, so we really are
fooling ourselves.

                                             -- Drew McDermott
Received on Wednesday, 4 April 2001 12:05:49 UTC