Re: Grist for layering discussion from Sandro Hawke on 2002-01-11 (www-archive@w3.org from January 2002)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 11 Jan 2002 10:42:06 -0500
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
cc: phayes@ai.uwf.edu, hendler@cs.umd.edu, timbl@w3.org, las@olin.edu, connolly@w3.org, w3c-semweb-ad@w3.org, www-archive@w3.org
Message-Id: <200201111542.g0BFg6f16276@wadimousa.hawke.org>
> As it appears that some participants in this debate prefer an operational
> view of the situation, let me try to couch my comments in an operational
> setting. 

Thank you.   I think the operational view is probably much more
comfortable to most of us with an IETF/W3C background.

> Suppose you want to build a representation formalism, like, say, RDF.  You
> design a syntax (and, maybe, an API) that lets you create information bases
> and you specify a retrieval mechanism that lets you see what you have
> created.  SQL is a not-very-good example of such as retrieval 
> mechanism for databases.  The retrieval mechanism wasn't really done for
> RDF, so lets just say that we can ask whether an information base contains
> a particular triple.  
> 
> Now you want to extend RDF.  OK, lets add mechanisms for creating a class
> taxonomy and for typing the domains and ranges of roles, and call it RDFS.
> We'll use the same syntax to create information bases, but modify the
> retrieval mechanism to make it correspond to the wording in the RDF Schema
> Candidate Recommendation.  Now you definitely get more out of the
> information base than you put in.  In particular, you get 
> 
> 	rdfs:Class rdfs:subClassOf rdfs:Resource .
> 
> out of an information base that has not had any information added to it.
> 
> So far so good.  

Yep.

> We have:
> 
> 1/ All RDF syntax is also RDFS syntax.
> 2/ Given an RDF information base, i.e., an information base created using
>    only the syntax in RDF, the RDFS retrieval mechanism produces everything
>    that the RDF retrieval mechanism would produce, and maybe more.
> 
> This is precisely what it means to be an extension.

That's an extension of functionality and sort-of of the output
synxtax, like when your bank statement now includes extra information
to help you compute your taxes.  The input syntax and/or the API have
not been changed.  Okay.

> However, RDF and RDFS have a very uncommon relationship because they also
> share a syntax.  Let's try to create a same-syntax extension of RDFS to
> encompass some of propositional logic, namely the part that allows us to
> create related disjunctions such as John is either married to Susan or
> friends with Jake. 
> 
> [Why do this extension in particular?  Well it is an extension that shows
> some of the problems, but it also has a construction that can be fairly
> naturally expressed as triples.]
> 
> What doesn't work is to directly encode the missing logical
> construction.  A direct encoding of this would be something like 
> 
> IB1	John rdfo:or _:x .
> 	_:x married Susan .
> 	_:x friend Jake .
> 
> We construct a formal specification of RDFO to incorporate this
> construction.  In particular, from
> 
> IB2	John rdfn:or _:x .
> 	_:x married Susan .
> 
> RDFO retrieval will produce
> 
> 	John married Susan .
> 
> Now is everything OK?  NO!  There are two problems:
> 
> 1/ Because RDFO is an extension of RDFS RDFO retrieval will also produce
> 
> 	_:x married Susan . 
>  
>    from IB2.  In fact, every RDFO disjunction also creates several
>    extraneous consequences.  
> 
>    Well you might argue that the 
> 
> 	John rdfn:or _:x .
> 
>    consequence is benign because it mentions the special RDFO property.
>    However, the other consequences do not mention any special RDFO
>    properties and they are definitely not benign.  RDFO has failed in its
>    goal of capturing related disjunctions.

Right.  Clearly that was a bad design.  The exact design principle
violated here I have not seen clearly stated, but it's important.  I
think TimBL didn't see this problem when he started n3 logic, which
makes it kind of broken in this same way.  But I believe he does
understand it now.   We usually talk about it in the realm of
open/closed world and "that's all their is", but perhaps it's a
different peice of the same problem.   (Which, as Pat pointed out, is
one of the heavy prices of moving from a serial syntax to a graph
syntax.   No debate there.) 

> 2/ RDFO is non-monotonic.  The retrievals from IB2 include information that
>    cannot be retrieved from IB1.
> 
> It is possible to overcome these problems, at least partly, by exploiting
> the reflective properties of RDF.  We can encode the disjunctions using a
> special construction, something like
> 
> IB3	John rdfor:or _:l1 .
> 	_:l1 rdfor:fact _:f1 .
> 	_:l1 rdfor:rest _:l2 .
> 	_:l2 rdfor:fact _:f2 .
> 	_:l2 rdfor:rest rdfor:nil .
> 	_:f1 rdfor:predicate married .
> 	_:f1 rdfor:object Susan .
> 	_:f2 rdfor:predicate friend .
> 	_:f2 rdfor:object John .

Yeah, this works.   It could be seen as going back to a serial syntax,
of course.

> Now retrieval for RDFOR can (probably) be designed so that 
> 1/ All the extra consequences involve special the RDFOR constructs, and so
>    can be regarded as benign.  
> 2/ RDFOR is monotonic.
> 
> Have we succeeded?  Partly, but at three prices, two that show up right
> away and one that shows up in other extensions.
> 
> The first price is that the construction is much more complicated than
> a syntax extension.  

Alas, yes, but that's just because you're looking at it in N-Triples.
LISP syntax is rather elegant unless you put all the dotted pairs back
in, then it's about this ugly.   Any object representation system is
pretty ugly at the bit level.

> The second price is that the construction adds a lot of extra consequences.
> These consequences can be considered to be benign, but they are still
> there.  To make the formalism work correctly in the presence of these
> consequences requires a lot of work (and may not be possible, even here).

Yes.   It will take some work make sure they stay benign.  

> The third price is that we have introduced a form of reification and a
> construct that can assert the truth of reification constructs.  This
> (probably) doesn't cause any problems here because the extension is so
> expressively limited.  However, for more powerful extensions reification
> produces paradoxes, and thus cannot be used.  

Two answers here.

1.  I've heard some people say, "Who Cares?"  Operationally, what's
the problem with a paradox?  My guess is it will show up as infinite
loops and/or bottomless recursion, which is unpleasant but can be
managed as a resource-management problem.  That is, in theory there's
a huge difference between a paradox and a problem that will simply
take 4 hours to terminate, but operationally they're both just systems
that go off into the weeds.  The user presses "stop" and everything's
fine again.

2.  I don't really like systems going off into the weeds.  And I don't
see why they have to, if we're careful about the feedback loop.  That
is: reasoners should not look for their inputs in their own outputs.
Can that loop be avoided if there are two reasoners...?  Hm.  I think
so, but it might be expensive.

If you still say that wont work, is there some system I can construct
(in code or just detailed specification) to demonstrate it will?  Like
finishing up my FOL-encoded-in-RDF system?  If I can have RDFS and FOL
reasoners properly attached to the same database, would that be
convincing?  (That's kind of a silly case since RDFS can be done with
FOL axioms, but I wouldn't implement it that way here.  Maybe some
floating-point math thing.)

    -- sandro
Received on Friday, 11 January 2002 10:45:22 UTC