Re: Grist for layering discussion from Sandro Hawke on 2002-01-12 (www-archive@w3.org from January 2002)

From: Sandro Hawke <sandro@w3.org>
Date: Fri, 11 Jan 2002 20:50:01 -0500
To: Pat Hayes <phayes@ai.uwf.edu>
cc: hendler@cs.umd.edu, timbl@w3.org, las@olin.edu, connolly@w3.org, w3c-semweb-ad@w3.org, www-archive@w3.org, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Message-Id: <200201120150.g0C1o1423495@wadimousa.hawke.org>
> I wonder if I could interject here. I think it is bad rhetoric to use 
> this sort of 'everyday' example, because they all are taken from 
> transactions between human beings, and are therefore fundamentally 
> misleading at exactly the place where the semantic problems arise. 
> The relationship between RDF and RDFS that Peter was describing is 
> NOT like ANYTHING printed on a bank statement, or in a book, or on a 
> webpage, or said by someone to someone else. It is a relationship 
> between formal systems.

I understand.  In general, I agree with this practice.   Here it
seemed okay to do otherwise.

> >  The input syntax and/or the API have
> >not been changed.  Okay.
> >
> >>  However, RDF and RDFS have a very uncommon relationship because they also
> >>  share a syntax.  Let's try to create a same-syntax extension of RDFS to
> >>  encompass some of propositional logic, namely the part that allows us to
> >>  create related disjunctions such as John is either married to Susan or
> >>  friends with Jake.
> >>
> >>  [Why do this extension in particular?  Well it is an extension that shows
> >>  some of the problems, but it also has a construction that can be fairly
> >>  naturally expressed as triples.]
> >>
> >>  What doesn't work is to directly encode the missing logical
> >>  construction.  A direct encoding of this would be something like
> >>
> >>  IB1	John rdfo:or _:x .
> >>	_:x married Susan .
> >>	_:x friend Jake .
> >>
> >>  We construct a formal specification of RDFO to incorporate this
> >>  construction.  In particular, from
> >>
> >>  IB2	John rdfo:or _:x .
> >>	_:x married Susan .
> >>
> >>  RDFO retrieval will produce
> >>
> >>	John married Susan .
> >>
> >>  Now is everything OK?  NO!  There are two problems:
> >>
> >>  1/ Because RDFO is an extension of RDFS, RDFO retrieval will also produce
> >>
> >>	_:x married Susan .
> >> 
> >>     from IB2.  In fact, every RDFO disjunction also creates several
> >>     extraneous consequences. 
> >>
> >>     Well you might argue that the
> >>
> >>	John rdfo:or _:x .
> >>
> >>     consequence is benign because it mentions the special RDFO property.
> >>     However, the other consequences do not mention any special RDFO
> >>     properties and they are definitely not benign.  RDFO has failed in its
> >>     goal of capturing related disjunctions.
> >
> >Right.  Clearly that was a bad design.  The exact design principle
> >violated here I have not seen clearly stated, but it's important.  I
> >think TimBL didn't see this problem when he started n3 logic, which
> >makes it kind of broken in this same way.  But I believe he does
> >understand it now.   We usually talk about it in the realm of
> >open/closed world and "that's all there is",
> 
> That is a different issue. Don't get them confused. The 'that's all' 
> problem is to do with how to encode finite structures in purely 
> descriptive format. [The honest answer is that it can't be done 
> (because of the second recursion theorem), but there are ways to hack 
> it if your tastes run to hacking.]  But that has nothing (directly) 
> to do with the central issue being talked about here, which is that 
> when you DO describe the syntax, you are making assertions that are 
> different from what that syntax was making.

But the assertions are merely assertions about syntactic structures.
They don't say anything about anything except about these objects in
the domain of discourse which happen to be syntactic structures of
other languages.  And one of them probably says one of those structures
is true.

> >but perhaps it's a
> >different piece of the same problem.   (Which, as Pat pointed out, is
> >one of the heavy prices of moving from a serial syntax to a graph
> >syntax.   No debate there.)
> >
> >>  2/ RDFO is non-monotonic.  The retrievals from IB2 include information
> >>     that cannot be retrieved from IB1.
> >>
> >>  It is possible to overcome these problems, at least partly, by exploiting
> >>  the reflective properties of RDF.  We can encode the disjunctions using a
> >>  special construction, something like
> >>
> >>  IB3	John rdfor:or _:l1 .
> >>	_:l1 rdfor:fact _:f1 .
> >>	_:l1 rdfor:rest _:l2 .
> >>	_:l2 rdfor:fact _:f2 .
> >>	_:l2 rdfor:rest rdfor:nil .
> >>	_:f1 rdfor:predicate married .
> >>	_:f1 rdfor:object Susan .
> >>	_:f2 rdfor:predicate friend .
> >>	_:f2 rdfor:object Jake .
> >
> >Yeah, this works.   It could be seen as going back to a serial syntax,
> >of course.
> 
> No, it does NOT work. If you publish that as RDF, you are not saying 
> what the RDFOR was saying. There is no way to get around this fact, 
> since the RDF spec itself includes the RDF MT, so the meaning of that 
> RDF is *required by the W3C spec* to be what it is. You might have 
> something else in mind; but that is irrelevant, since once those RDF 
> triples are set loose on the web, your state of mind or intentions 
> are lost; something reading this only has the triples and the RDF 
> spec to go on. And with RDF in its current state, there is no way to 
> indicate *in RDF* that you mean it to say anything other than what it 
> says in RDF.
> 
> >>  Now retrieval for RDFOR can (probably) be designed so that
> >>  1/ All the extra consequences involve the special RDFOR constructs, and
> >>     so can be regarded as benign.
> >>  2/ RDFOR is monotonic.
> >>
> >>  Have we succeeded?  Partly, but at three prices, two that show up right
> >>  away and one that shows up in other extensions.
> >>
> >>  The first price is that the construction is much more complicated than
> >>  a syntax extension. 
> >
> >Alas, yes, but that's just because you're looking at it in N-Triples.
> >LISP syntax is rather elegant unless you put all the dotted pairs back
> >in, then it's about this ugly.
> 
> Not quite, in fact. But look: it *is* Ntriples. There isn't any 
> 'other way' to look at it (except RDF/XML, ie). This is the level at 
> which the meaning is attached. You may have in mind that it is being 
> used for some other purpose, but that's not what the published spec. 
> says.
> 
> >  Any object representation system is
> >pretty ugly at the bit level.
> >
> >>  The second price is that the construction adds a lot of extra consequences.
> >>  These consequences can be considered to be benign, but they are still
> >>  there.  To make the formalism work correctly in the presence of these
> >>  consequences requires a lot of work (and may not be possible, even here).
> >
> >Yes.   It will take some work to make sure they stay benign.
> 
> I don't see how this can be possible. Who knows what consequences 
> they might have in some other context, eg when added to some other 
> set of triples from some other source? There certainly could be no 
> way to guarantee that they might not, for example, accidentally 
> combine to be an encoding of some other higher-level syntax (RDFX or 
> RDFY) which could mean something else completely. When lists are 
> decomposed into sets of triples, and any set entails all its subsets, 
> and subsets can be combined freely, the whole idea of using triple 
> stores as datastructure encodings strikes me as highly dubious.
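
A minimal sketch of the subset concern, assuming a hypothetical decoder
for the IB3 encoding (the `decode` and `value` helpers and the
strict/lenient switch are illustrative and not defined anywhere in this
thread).  It follows the married-to-Susan-or-friends-with-Jake example.
Dropping one triple from IB3 leaves a graph that IB3 still entails, yet
a lenient decoder reads it as a different, stronger statement:

```python
NIL = "rdfor:nil"

# The IB3 encoding as a set of (subject, predicate, object) triples.
IB3 = {
    ("John", "rdfor:or", "_:l1"),
    ("_:l1", "rdfor:fact", "_:f1"),
    ("_:l1", "rdfor:rest", "_:l2"),
    ("_:l2", "rdfor:fact", "_:f2"),
    ("_:l2", "rdfor:rest", NIL),
    ("_:f1", "rdfor:predicate", "married"),
    ("_:f1", "rdfor:object", "Susan"),
    ("_:f2", "rdfor:predicate", "friend"),
    ("_:f2", "rdfor:object", "Jake"),
}


def value(triples, subject, predicate):
    """Return the single object for (subject, predicate), or None."""
    found = [o for (s, p, o) in triples if s == subject and p == predicate]
    return found[0] if len(found) == 1 else None


def decode(triples, subject, strict):
    """Read the disjunction attached to `subject` as a list of (pred, obj).

    strict=True returns None if any piece of the list is missing.
    strict=False (hypothetical lenient reading) treats a missing
    rdfor:rest as the end of the list.
    """
    cell = value(triples, subject, "rdfor:or")
    if cell is None:
        return None
    disjuncts = []
    while cell is not None and cell != NIL:
        fact = value(triples, cell, "rdfor:fact")
        pred = value(triples, fact, "rdfor:predicate")
        obj = value(triples, fact, "rdfor:object")
        if None in (fact, pred, obj):
            return None
        disjuncts.append((pred, obj))
        cell = value(triples, cell, "rdfor:rest")
        if cell is None and strict:
            return None
    return disjuncts


# A subset of IB3: the link from the first list cell to the second is gone.
subset = IB3 - {("_:l1", "rdfor:rest", "_:l2")}

full_reading = decode(IB3, "John", strict=True)
lenient_reading = decode(subset, "John", strict=False)
strict_reading = decode(subset, "John", strict=True)
```

Here the lenient reading of the subset is the single disjunct "John
married Susan", which the full graph was never meant to assert.  The
strict reading avoids that by refusing incomplete lists, but that is a
property of this particular decoder, not something the triples
themselves can enforce.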
> 
> >
> >>  The third price is that we have introduced a form of reification and a
> >>  construct that can assert the truth of reification constructs.  This
> >>  (probably) doesn't cause any problems here because the extension is so
> >>  expressively limited.  However, for more powerful extensions reification
> >>  produces paradoxes, and thus cannot be used. 
> >
> >Two answers here.
> >
> >1.  I've heard some people say, "Who Cares?"  Operationally, what's
> >the problem with a paradox?
> 
> Oh dear God, I am inclined to give up at this point. Why don't y'all 
> try making a semantic web which is freely paradoxical, and we go away 
> and make one that preserves meaning, and we just see which of them is 
> more use?

How about you make one by publishing model theories and we make one
with layman's language and running code?   Seriously, I appreciate
your efforts, and I'm sure you know I (and others) are trying to find
ways we can reach a consensus design that's better than any of us
could do alone.

> >  My guess is it will show up as infinite
> >loops and/or bottomless recursion, which is unpleasant but can be
> >managed as a resource-management problem.
> 
> No. It will show up as two pieces of software accessing the same DB 
> but one deciding that you owe the bank $2000 and the other deciding 
> that the bank owes you $3000, and they are *both right*. That is, 
> they are both conforming to published specs which are supposed to 
> guarantee that meaning is preserved, and they both use logically 
> secure methods, they both have checked their proofs, and both proofs 
> are guaranteed to be correct, and they use the same premises; but 
> they disagree. That is what happens when DB reasoning hits paradoxes. 
> That could happen right now if one of them uses RDF and the other 
> uses DAML+OIL and they follow the letter of the published specs.
> 
> The point is that we are not here worrying about whether or not the 
> software terminates. This isn't to do with computability; it's to do 
> with what the actual data *means*.
>
> >  That is, in theory there's
> >a huge difference between a paradox and a problem that will simply
> >take 4 hours to terminate, but operationally they're both just systems
> >that go off into the weeds.  The user presses "stop" and everything's
> >fine again.
> 
> There is so much wrong with this that I'm at a loss to cover it all. 
> First, please stop talking in kiddie metaphors ("weeds" can mean 
> nontermination, bad data, bad reasoning, goodness knows what else). 

Sorry, around these parts, I think it's a well understood term meaning
(as Microsoft would say) the system (or application) stops responding.
It's a user-experience term.  Software can fail in two ways for a
user: not responding or wrong results (if we lump in reporting an
error when it should not, and crashing, as sorts of wrong results).
You're right that wrong results is worse than not responding. 

> Second, paradox doesn't mean nontermination, cf above. Third, there 
> is no user to press "stop" on the SW, right? (Isn't that the whole 
> point of the SW, as opposed to the WWW? That's why the 'A' in 'DAML' 
> is from 'agent'.)

Agent behavior can be connected back to users in many situations.

> Fourth, the issue is not stopping, but what to do 
> with contradictory information that isn't in fact a real 
> contradiction but arises from a contradictory specification.

Yes.  I understand that problem.  I fail to see why it's unavoidable
with my design.

> >2.  I don't really like systems going off into the weeds.  And I don't
> >see why they have to, if we're careful about the feedback loop.
> 
> What feedback loop? (What the hell are you talking about?)
> 
> >  That
> >is: reasoners should not look for their inputs in their own outputs.
> 
> 1. Who said anything about this??

I'm trying to have a discussion about operational, observable aspects
of a system we are trying to design.  I find that kind of discussion
much more productive in reaching consensus, and the history of the W3C
and IETF supports this approach.

> 2. Why should they not, in any 
> case? If a reasoner uses valid reasoning processes to draw a 
> conclusion, then why should it not go on to use that conclusion to 
> draw other conclusions? (Ever hear of forward reasoning?)

You know I've heard of forward reasoners, and I think we've even
talked about some forward and backwards reasoners I've written.

Here's the loop I'm talking about, which is different from normal
chaining:

   1. RDF triples describe FOL syntactic structures
   2. Those syntactic structures are extracted and conjoined
      with the simple structures in the RDF itself.
   3. FOL reasoning is performed   [ with FOL it's a little unclear
      what the goal might be, but I think that's not relevant here ]
   4. That certain new RDF triples are true may be inferred; these
      triples are made available to querying clients.   BUT IF YOU PUT
      THEM BACK IN (1), where they are scanned again for new FOL
      syntactic structures, THEN you raise the spectre of the truth
      predicate and paradoxes. 
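
The four steps above can be sketched roughly as follows.  This is only
an illustration under stated assumptions: simple "if A then B" rules
stand in for full FOL, and the `ex:` vocabulary and all function names
are made up for the example.  It shows where the feedback would enter;
it does not construct a paradox.

```python
def decode_statement(triples, node):
    """Read the triple described by a statement node, or None if incomplete."""
    parts = {p: o for (s, p, o) in triples if s == node}
    try:
        return (parts["ex:subject"], parts["ex:predicate"], parts["ex:object"])
    except KeyError:
        return None


def extract_rules(triples):
    """Steps 1-2: find complete 'if A then B' descriptions among the triples."""
    rules = []
    for (s, p, o) in triples:
        if p != "ex:if":
            continue
        for (s2, p2, o2) in triples:
            if s2 == s and p2 == "ex:then":
                a = decode_statement(triples, o)
                b = decode_statement(triples, o2)
                if a and b:
                    rules.append((a, b))
    return rules


def forward_chain(facts, rules):
    """Step 3: ordinary forward chaining over plain triples."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for a, b in rules:
            if a in known and b not in known:
                known.add(b)
                changed = True
    return known - set(facts)


def run_once(asserted):
    """Steps 1-4 with no feedback: rules are read only from `asserted`."""
    rules = extract_rules(asserted)
    return forward_chain(asserted, rules)  # step 4: handed to clients as-is


asserted = {
    ("John", "married", "Susan"),
    # r1: if (John married Susan) then (r2 ex:then s4)
    ("r1", "ex:if", "s1"), ("r1", "ex:then", "s2"),
    ("s1", "ex:subject", "John"), ("s1", "ex:predicate", "married"),
    ("s1", "ex:object", "Susan"),
    ("s2", "ex:subject", "r2"), ("s2", "ex:predicate", "ex:then"),
    ("s2", "ex:object", "s4"),
    # r2 is only half described in the asserted data: it has no ex:then.
    ("r2", "ex:if", "s1"),
    ("s4", "ex:subject", "Susan"), ("s4", "ex:predicate", "spouse"),
    ("s4", "ex:object", "John"),
}

derived = run_once(asserted)            # one pass, no feedback
fed_back = run_once(asserted | derived)  # what re-scanning would add
```

In the one-way pass the only derived triple is `r2 ex:then s4`, which
happens to complete the description of a second rule.  Only when the
output is put back into step 1 does that rule get extracted and fire.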

> >Can that loop be avoided if there are two reasoners...?  Hm.  I think
> >so, but it might be expensive.
> >
> >If you still say that won't work, is there some system I can construct
> >(in code or just detailed specification) to demonstrate it will?  Like
> >finishing up my FOL-encoded-in-RDF system?  If I can have RDFS and FOL
> >reasoners properly attached to the same database, would that be
> >convincing?
> 
> That would convince me that I was right, since those two reasoners 
> couldn't possibly draw the same conclusions from the DB.

I'm saying such a system could be created that would not produce
incorrect results.   (Incorrect in layman's terms, like the
$2000/$3000 error.)

So if I did it, the burden would be on you to provide some input which
would produce obviously wrong outputs.  And you are sure you could do
it easily.  Right?

    -- sandro
Received on Friday, 11 January 2002 20:53:03 UTC