Re: Grist for layering discussion

From: "Tim Berners-Lee" <timbl@w3.org>
Subject: Re: Grist for layering discussion 
Date: Fri, 11 Jan 2002 16:58:12 -0500

> 
> ----- Original Message -----
> From: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>

 [...]

> > Now you want to extend RDF.  OK, lets add mechanisms for creating a class
> > taxonomy and for typing the domains and ranges of roles, and call it RDFS.
> > We'll use the same syntax to create information bases, but modify the
> > retrieval mechanism to make it correspond to the wording in the RDF Schema
> > Candidate Recommendation.  Now you definitely get more out of the
> > information base than you put in.  In particular, you get
> >
> > rdfs:Class rdfs:subClassOf rdfs:Resource .
> >
> > out of an information base that has not had any information added to it.
> 
> I would not expect that.  There is  a set of axioms from the RDFS spec which
> you can put into any database but I see no reason why you should
> assume them in any database.   

If you don't expect this conclusion to be present in every RDFS information
base then you better speak up soon.  It follows from the RDFS model theory.

> For me, a database and a document are the
> same.  Each is just a way of accessing some abstract information.

Sure.

> This way, you get a set of statements which every document is deemed to
> include.

Huh?  A document is a sequence of bits.  It only gets the extra stuff if it
is interpreted as an RDFS information base.  Of course, the document itself
may state that it is supposed to be considered as an RDFS information base,
but that doesn't change the fact that it is just an sequence of bits.  If
by document you mean the information implicit in a document using some
formalism, then sure, but I worry about identifying documents and
information bases.

> That is weird. I would prefer to say that any statement made in a document
> which
> uses OWL terms imples the semantics of the OWL spec, and therefore the
> author is deemed to support the decutive closure of  the union of
> the statements from his document and the axioms.

OK, even when substituting RDFS for OWL.  But then RDFS retrieval is
supposed to retrieve these extra conclusions for any information base, even
the one corresponding to an empty document.  Otherwise it is not RDFS
retrieval, as I defined it above, but some other function.


> > So far so good.
> >
> > We have:
> >
> > 1/ All RDF syntax is also RDFS syntax.
> > 2/ Given an RDF information base, i.e., an information base created using
> >    only the syntax in RDF, the RDFS retrieval mechanism produces everything
> >    that the RDF retrieval mechanism would produce, and maybe more.
> 
> I would say, exactly the same.

How so?  The syntax of RDF is precisely the same as the syntax of RDFS.  If
you want the retrievals to be the same, then RDFS will be exactly the same
as RDF.  (If you instead want the next piece of software down the line to
add this extra stuff, then I'm talking about this next piece of software,
not what you are talking about.)

> > This is precisely what it means to be an extension.
> 
> Not for me. What you have in my view  is a new function, "answer this query
> using the extra knowledge you have about DAML".  This will tell you a
> superset of what you got out of the
> plain query.

Precisely.  

RDF and RDFS and DAML+OIL and OWL retrieval may produce different answers
from the same information base.  If RDFS, or DAML+OIL or OWL, is an
extension of RDF then the results from RDFS retrieval must be a superset
(not necessarily strict in all cases) of the results from RDF retrieval.

> > However, RDF and RDFS have a very uncommon relationship because they also
> > share a syntax.  Let's try to create a same-syntax extension of RDFS to
> > encompass some of propositional logic, namely the part that allows us to
> > create related disjunctions such as John is either married to Susan or
> > friends with Jake.
> >
> > [Why do this extension in particular?  Well it is an extension that shows
> > some of the problems, but it also has a construction that can be fairly
> > naturally expressed as triples.]
> >
> > What doesn't work is to directly encode the missing logical
> > construction.  A direct encoding of this would be something like
> >
> > IB1 John rdfo:or _:x .
> > _:x married Susan .
> > _:x friend Jake .
> 
> That is just nonsense of course.  You can't encode "or" as a conjunction
> of triples.   

Well I can ``encode'' "or" as whatever I want.  I could use triples; I
could use s-epressions; I could use lists; I could use some XML syntax; I
could even use smoke signals.  However, if I use triples, and I want to
create an extension of RDF, then I may run into problems.

> DAML defines some predicates which allow you to do
> similar things.  [Y]ou can define that the class of people married to the
> Susan and the class of people friends of John are disjoint.

(I can't think of a way of saying this in DAML+OIL.)

> > We construct a formal specification of RDFO to incorporate this
> > construction.
> 
> You can't.  You can define the meaning of a property  rdf:or but you can't
> define it in a way which reaches up inside the machine and affects other
> things.

I can do anything I want, as long as I am creating the formalism.  I can
abuse RDF in lots of ways, including making triples mean something entirely
different.  However, again, if I want to be an extension of RDF, then I
have to respect the meaning of RDF triples.  This is precisely what RDFS
does, and what Pat and I are claiming can't be done in general.  However,
there are still lots of things I can do and still be an extension, even
things that result in a broken formalism.

> You can only (1) give an algorithm for deducuing other information
> from a statement and (2) give new axioms - constraints on the world - which
> are related to the property. You can't say  "and if used in a sentence
> it negates the meaning of the previous sentence" for example.

Well, this wouldn't work very well, but only because ``previous'' is not
well defined.  There are lots of things that can be done in an RDF extension,
including providing extra inferences from every RDF triple.  (Consider an
RDF reification extension that mandates the existence of the RDF
reification of every triple in an RDF information base.)

> >  In particular, from
> >
> > IB2 John rdfn:or _:x .
> > _:x married Susan .
> >
> > RDFO retrieval will produce
> >
> > John married Susan .
> >
> > Now is everything OK?  NO!  There are two problems:

> No, it is not OK, you can't do that. 

I just did!  I may have created a broken formalism, but it is a formalism.

> DAML doesn't do that.

Actually DAML+OIL cheats, in some sense.  (Well actually it didn't cheat
when it was created, but it cheats now, because it doesn't work right with
RDF entailment.)

> (N3's log:forSome and log:forAll do and they are not real RDF triples and
> I must fix it so they don't appear to be in the syntax or the code. they
> would have this problem if treated as such.)

Precisely.  log:forSome and log:forAll have severe problems if they are
treated as RDF triples.

> [...]
> 
> > It is possible to overcome these problems, at least partly, by exploiting
> > the reflective properties of RDF.  We can encode the disjunctions using a
> > special construction, something like
> >
> > IB3 John rdfor:or _:l1 .
> > _:l1 rdfor:fact _:f1 .
> > _:l1 rdfor:rest _:l2 .
> > _:l2 rdfor:fact _:f2 .
> > _:l2 rdfor:rest rdfor:nil .
> > _:f1 rdfor:predicate married .
> > _:f1 rdfor:object Susan .
> > _:f2 rdfor:predicate friend .
> > _:f2 rdfor:object John .           [Jake?]
> 
> This is (with s/fact/first)    N3's   John rdfor:or  ( [ predicate married;
> object Susan]
>     [ rdf:predicate :friend; :object :Jake])
> 
> which is close to
> 
>    { :John :married :Susan } rdf99:or { :John :friend :Jake }.
> 
> which is very like
> 
>    { :John :married :Susan }  log:implies  { :John :notFriend :Jake }.
> 
> which works.

Well, I haven't seen any indication that it actually does work in the form
of even a proof sketch that there are no hidden paradoxes, so I remain
unconvinced.

> Now you can regard the {} as either an extension to the grammar of the
> language, which I tend to do, or if you want to be strict and stick with RDF
> you can use the construction above.  here it becomes I think
> something like
> 
> :F1  rdf99:or :F2.
> :F1  x:setOf  :L1.
> :L1   daml:first :s1;    daml:rest daml:nil.
> :s1 :rdf:Subject :John;  rdf:predicate :married; :object :Susan.
> :F2  x:setOf  :L2.
> :L2   daml:first :s2;    daml:rest daml:nil.
> :s2 :rdf:Subject :John;  rdf:predicate :friend; :object :Jake.
> 
> # same as:
> 
> [ is x:setOf ( [:rdf:Subject :John;  rdf:predicate :married; :object
> :Susan.])]
>   rdf99:or
> [ is x:setOf ([:rdf:Subject :John;  rdf:predicate :friend; :object
> :Jake.])].


> > Now retrieval for RDFOR can (probably) be designed so that
> > 1/ All the extra consequences involve special the RDFOR constructs, and so
> >    can be regarded as benign.
> 
> Before any deduction, yes.
> 
> > 2/ RDFOR is monotonic.
> 
> Yes.
> 
> > Have we succeeded?  Partly, but at three prices, two that show up right
> > away and one that shows up in other extensions.
> >
> > The first price is that the construction is much more complicated than
> > a syntax extension.
> 
> What does that mean?   You have made a construct which works in the abstract
> space, which does not require a syntax extension.  More complicated?
> Well, I suppose complexity is subjective. The simplicity is that the
> abstract
> space isn't changed.  You can then make some syntactic sugar to hide the
> compexity of the construct in real files.

I categorically state that the above is much more complicated.  It uses
reflection and other very powerful mechanisms to do something that can be
handled with a very small syntax extension.

> > The second price is that the construction adds a lot of extra
> consequences.
> > These consequences can be considered to be benign, but they are still
> > there.  To make the formalism work correctly in the presence of these
> > consequences requires a lot of work (and may not be possible, even here).
> 
> I suppose it consequences but they all involve the anonumous nodes F1, etc
> used to set the thing up.  I don't see why making it work is so tricky.

One problem is that to make this really work you have to mandate the presence
of the ``syntax'' in all interpretations.  If you use lists, then you have
to have all possible lists present in all interpretations.  This gets quite
tricky, particularly if you need lists that contain themselves, perhaps
indirectly. 

> > The third price is that we have introduced a form of reification and a
> > construct that can assert the truth of reification constructs.  This
> > (probably) doesn't cause any problems here because the extension is so
> > expressively limited.  However, for more powerful extensions reification
> > produces paradoxes, and thus cannot be used.
> 
> How do you mean "produces"?  Allows  one to write, or allows one to
> deduce from the empty formula?

``Produces'' in the sense of ``results in''.

The kind of paradoxes that you get here are of the flavour of:

Model theoretically:  All interpretations have to contain x (for some sort
		      of x), but x cannot be given a meaning.

Proof theoretically:  Not x can be derived from x (for some sort of x).

So, from the empty formula, you can deduce anything.

> In N3,
> { this a :Falsehood }.

If every N3 interpretation was required to contain all contexts, then you
would be in this paradox boat.

> It is will always possible to write a paradox on the semantic web. It is
> so important to be able to say what someone else is say is untrue that
> one cannot help but be able to say that what one is saying is untrue.
> That's life.  The question is whether the logic becomes unsound in that
> any DAML system which includes :or can then be used to proove anything.

Constructs are not paradoxes until they mess up the model theory or proof
theory.  If  
	{ this a :Falsehood }
causes breakdown, then it is a paradox, otherwise not.

> Being an amateur at this, I had understood that Goedel's theorem anyway
> demonstrated that any system of any basic power must necessarily have a
> sentence
> which was paradoxical.  So avoiding expressing a paradox is a waste of time.

This is *not* Goedel's incompleteness theorem.  What Goedel said is that
any sufficiently expressive logic can represent proof theories for itself.
So, given a sufficiently powerful logic and a proof theory for it, you can
write a statement whose meaning is

    I am not provable in this proof theory.

Which is true in the logic but not provable in this proof theory, unless
the proof theory is unsound.  (Why?  Well if the statement could be proved,
then it would be false, and thus the proof theory would be unsound.)
This is not, however, a paradox.

Thus any sound proof theory for a sufficiently expressive logic is
incomplete, not necessarily paradoxical.

Why is this whole story not a paradox?  Why can't you just add the
statement as an axiom, as you are supposed to be able to add any true fact
as an axiom, and come up with a paradox about proof theories.  Well if you
do add the statement then you have changed the proof theory and the
statement's meaning is now

   I am not provable in that other proof theory.

which does not cause a problem.

However, there is a new, true statement in the logic whose meaning is

   I am not provable in this new proof theory.

This excercise can be repeated ad nauseum.


> So you feel that allowing reification -- or, presumably, allowing
> sets of statements as first class objects in the syntax -- makes the system
> such that one can necessarily deduce a contradiction from nothing?
> Then I need help understanding how.

Look at the spec for KIF at http://logic.stanford.edu/kif/dpans.html.  The
section on Metaknowledge has a nice description of one such problem.

For another example, consider some old versions of set theory.  
They have a version of reification or reflection, namely having sets
contain other sets and letting the set-defining condition mention the
set inclusion relationship.  Then you get sets like

	the set of integers greater 10
	the set of sets with at most 10 members
	the set of sets with at least 1000 members

which are all OK.  However, you also get

	the set of sets that contain themselves

which is not OK.   The problem is that if this set contains itself, then it
does not contain itself.  However, if does not contain itself, then it
does.  You can use this to deduce anything, using the Law of the Excluded
Middle.

The problem, by the way, lies not just in having reification or sets of
statements, but instead in having a truth predicate that works on these
constructs.  Even then, if you don't need to have such constructs present
in interpretations you don't have problems, but, then you generally do need
this to make entailment (or deduction) work correctly.

> The normal way you do it relies on the Princple of the Excluded middle,
> which of course cannot be part of the semantic web.

Why not?  Well, of course, I don't want to deduce anything just because
some web page says X and some other web page says not X, but there are lots of
formalisms that don't have this property and that have the Law of the
Excluded Middle.

Don't throw away the Law of the Excluded Middle without seeing what you are
in for.  I spent some time working with relevance logics (that really throw
away the Law of the Excluded Middle) and they are incredibly difficult to
understand.  Other logics that reduce the power of the Law of the Excluded
Middle, like intuitionistic logics, are also difficult.

> > Why is there a problem?  The problem arises, as Pat has stated, from the
> > attempt to extend a representation formalism while not extending the
> > syntax.  This cannot be done in many circumstances and can only be easily
> > done in very limited circumstances.
> 
> I need to understand better where the problem lies.

Unfortunately, understanding the problem does require groking some
interesting logical notions.

peter

Received on Friday, 11 January 2002 20:16:04 UTC