Re: Grist for layering discussion from Pat Hayes on 2002-01-10 (www-archive@w3.org from January 2002)

From: Pat Hayes <phayes@ai.uwf.edu>
Date: Thu, 10 Jan 2002 13:10:45 -0600
To: Sandro Hawke <sandro@w3.org>
Cc: Jim Hendler <hendler@cs.umd.edu>, timbl@w3.org, las@olin.edu, connolly@w3.org, w3c-semweb-ad@w3.org, <www-archive@w3.org>, "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Message-Id: <p05101012b863790183bc@[65.212.118.208]>

>[ I'm replying to Pat's message to Jim, interloper that I am.   Of
>course this conversation should be copied to a public list, soon. ]

I'm following Aaron's lead and Ccing it to  <www-archive@w3.org>, OK?

>[ Pat's wonderfully clear explanation trimmed... ]
>
>>  >we will introduce a blank node to 'fill out' the subject and object
>>  >positions, and thereby transcribe the new expressions as sets of
>>  >triples, as follows:
>>  >
>>  >a rdfn:notAll b1 c1 b2 c2 b3 c3....bn cn;
>>  >
>>  >maps into
>>  >
>>  >  a rdfn:notAll _:x
>>  >_:x b1 c1 .
>>  >_:x b2 c2 .
>>  >....
>>  >_:x bn cn .
>>  >
>>  >and we rely on the connection between _:x and the property name in
>>  >the first triple to encode the intended meaning. So now our extended
>>  >RDFN syntax has been encoded as RDF triples, and so RDFN can be
>>  >parsed by an RDF parsing engine, and the universal nature of the
>>  >triples graph has been once more vindicated, right?
>>  >
>>  >Unfortunately, this doesn't work. The problem is not that there is
>>  >anything wrong with the transcription into triples considered as a
>>  >datastructuring or implementation exercise. (There are a few
>>  >niggles, eg how do you know that you have got all the triples? ...
>>  >things like that, but let's agree to leave those out of the
>>  >discussion.) The problem is that RDF assigns a meaning to those
>>  >triples, and that meaning is incompatible with the meaning that RDFN
>>  >assigns to rdfn:notAll, which was, you recall, that at least one of
>>  >some set of triples is false.  That clearly is not what the above
>>  >RDF 'translation' says: it asserts that all of some other set of
>>  >triples are true.
>
>From those assertions, an RDFN reasoner can, of course, infer the
>intended negation in its more-expressive-than-RDF internal knowledge
>base.  This is just "perl-style" layering,

No, it's not; see below for why. (That it's not perl-style layering is 
my main point.)

>not all that different from
>    a rdfn:notAllString "c1 c2 ... cn" .

Right, an RDFN reasoner can be happy. The problem is that an RDF 
reasoner can also look at these triples, and would be entitled to 
draw conclusions from them, and those conclusions could mislead the 
RDFN reasoner. For example, it follows in RDF from the above that
_:x b1 c1 .
_:x b2 c2 .
....
_:x bn cn .
ie that something exists which has the bi relation to ci.  That is a 
valid RDF-consequence from those triples; but it is *not* a valid 
RDFN consequence from

a rdfn:notAll b1 c1 ...bn cn;

The mapping of the RDFN syntax into RDF syntax has said some stuff in 
RDF that isn't what the RDFN says. Since RDFN is supposed to be an 
extension of RDF, valid RDF consequences are also valid RDFN 
consequences, so the RDFN engine ought to accept those RDF 
conclusions. But now it will get its knickers in a twist, eg suppose 
you assert (in RDFN):

_:x rdfn:notAll b c;

then it follows in RDFN that _:x b c . is false. But it follows in 
RDF, from the RDF encoding, that it is true.
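
The clash can be made concrete with a minimal Python sketch. It is an illustration only, not part of any RDF or RDFN spec: the function names are hypothetical, triples are plain tuples, and it assumes (as the example above does) that the blank node label _:x is reused for the introduced node.

```python
# Hypothetical helpers; a triple is a (subject, predicate, object) tuple.

def transcribe_not_all(subject, pairs, blank="_:x"):
    """Encode 'subject rdfn:notAll b1 c1 ... bn cn;' as RDF triples via a blank node."""
    triples = [(subject, "rdfn:notAll", blank)]
    triples += [(blank, b, c) for b, c in pairs]
    return triples

def rdf_asserted(triples):
    """RDF reading: every triple in the graph is asserted, so each one is entailed."""
    return set(triples)

def rdfn_denied(subject, pairs):
    """RDFN reading: at least one (subject, b, c) is false.
    With a single pair, that one triple must be the false one."""
    if len(pairs) == 1:
        b, c = pairs[0]
        return {(subject, b, c)}
    return set()  # with several pairs, no particular triple is known to be false

# The example from the text:  _:x rdfn:notAll b c;
encoded = transcribe_not_all("_:x", [("b", "c")])
clash = rdf_asserted(encoded) & rdfn_denied("_:x", [("b", "c")])
print(clash)  # {('_:x', 'b', 'c')}
```

The non-empty intersection is the triple that RDF entails from the encoding while RDFN says it is false.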

>So when you say "this doesn't work", what do you mean?  I think it's a
>little ugly, but it will work just fine in real-world running
>software.  Not buggy fly-by-night software, but real long-term
>interoperable stake-your-money-on-it software

No, it won't, if the software has to respect the specs, and those 
specs include model-theoretic semantics. Once you publish some 
triples, then any RDF engine anywhere on the planet can stick them 
together with some other RDF triples that you have never even seen, 
and is entitled to draw any valid conclusions from them. It's no good 
your saying that you only meant them to be used some particular way; 
you can't say that in RDF, and once the triples are loose on the web, 
you have no control over them.  Maybe your RDFN engine will work 
fine, but some other engine might do all kinds of crazy stuff while 
still satisfying the RDF specs, and then whose fault will that be? 
The point is that assertional languages with defined semantics are 
*not* Perl-style programming languages. Once you publish a semantics, 
you can't over-ride that semantics to serve a higher-level 
implementation trick. Each RDF triple is an assertion which stands on 
its own, and if you assert it, then you take the rap for any valid 
conclusions that can be drawn from it.

>  It seems like you're
>just saying it's broken with respect to the model theory.  But how do
>those extra asserted triples contradict anything?

See above. But in any case, the point is if they entail *anything* 
that isn't entailed by the RDFN assertion they encode, then something 
is wrong. That allows an RDF engine to disagree with an RDFN engine 
about what some piece of RDFN means, and that means that RDFN is not 
an extension of RDF. This is exactly the state we are in with RDF(S) 
and DAML right now, by the way.

>  They are irrelevant
>to all but the inference which recognizes/reconstructs the "notAll"
>sentence.

We really have no idea what conclusions they might engender. Once you 
drop an assertion into a KB, you have very little control over what 
its consequences might be. As a simple example, suppose that some 
engine decides to count employees by counting triples that use an 
'employed' property, and it hits an RDF transcription of an 
rdfn:notAll, and counts all those ci's as employees. That could lead 
to an unexpected tax bill, or worse.
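
A rough Python sketch of that miscount, under stated assumptions: the 'employed' property, the names, and the count_employees function are all made up for illustration, and triples are plain tuples.

```python
def count_employees(triples):
    """Naive census: count distinct objects of any triple using 'employed'."""
    return len({o for (_s, p, o) in triples if p == "employed"})

# Two genuine assertions.
genuine = [("acme", "employed", "alice"),
           ("acme", "employed", "bob")]

# RDF transcription of:  acme rdfn:notAll employed carol employed dave;
# (RDFN only says that at least one of these two employment claims is false.)
encoded = [("acme", "rdfn:notAll", "_:x"),
           ("_:x", "employed", "carol"),
           ("_:x", "employed", "dave")]

print(count_employees(genuine))            # 2
print(count_employees(genuine + encoded))  # 4
```

The counter is doing nothing wrong by RDF's lights; it has no way to tell the encoding triples from the asserted ones.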

>
>>  (It also asserts that some thing exists which
>>  >bears no relation to anything mentioned in the RDFN assertion, by
>>  >the way, which may well cause its own inferential problems if taken
>>  >seriously by an RDF reasoner.)
>
>No, it is not allowed to cause problems.

Howrya gonna disallow it? From the RDF perspective, there is no way 
to distinguish these triples from any others, so even if there was 
some way to say what was or was not allowed, an RDF reasoner couldn't 
do anything about it.

>Extra triples (as you say in
>the third paragraph of the RDF model theory) do not affect the meaning
>of a graph.

? But they do (of course) affect what conclusions can be drawn from 
the graph, right? At a minimum, they allow the extra triples to be 
inferred themselves. Look, if I say (P and Q) then saying P doesn't 
affect the meaning of Q; but that doesn't mean that saying (P and Q) 
is the same as saying Q by itself.

>This is a day-one fundamental principle of RDF.
>
>>  Well, you might respond, let's find a
>>  >different way to encode RDFN into RDF triples, one that doesn't have
>>  >this problem.  But the point is that this task is literally
>>  >*impossible*; there is no such translation, and there cannot
>>  >possibly be, since there is *no* way to express negation (or
>>  >disjunction or implication or universal quantification) in RDF, (the
>>  >proof is trivial), so we really should have known that we cannot
>>  >translate this kind of assertion into *any* set of RDF triples, no
>>  >matter what its syntactic form is. We can transcribe the syntax into
>>  >RDF, but we cannot capture the content in RDF (while also conforming
>>  >to the RDF semantics).
>
>Well, sure.  Of course.  We can transcribe the syntax of FOL into RDF
>(an exercise I undertook a few weeks ago [1]).  The added triples
>will be meaningful in the model-theoretic sense I suppose, but only in
>a meaningless sort of way, talking about anonymous objects.

No no no. You have no such warrant for that claim. In RDF, all 
triples are meaningful in exactly the same way, and the model theory 
applies to them all. There are no 'added' triples; they are all just 
triples. If you add triples that mean something and you don't want or 
intend them to mean that, it's no good just kind of hoping that their 
meaning will go away.

>  Only a
>reasoner who knows the FOL vocabulary can do anything with the
>triples.

An RDF reasoner can draw any valid RDF conclusion from them. That is 
exactly the point. Now, are you sure that all such RDF-valid 
conclusions are still FOL-valid (using the same syntactic embedding)?

>To it, RDF containing the right kind of transcription of FOL
>means the same thing as FOL.

RDF means what the RDF spec (which includes the model theory) says it 
means. If something else wants it to mean something else, then that 
something is misusing the RDF spec. RDF is not just a de-serialised 
Perl; it is an assertional language with a precisely defined 
assertional meaning.

>
>This is pretty much the same situation as with XML (or non-strict
>HTML).

No it isn't, precisely because XML does not have an assertional semantics.

>  If you don't recognize the vocabulary of an element, you must
>ignore it.  But that's hard to really do in serial formats, where you
>might ask how many children an element has -- are you counting the
>ones you're supposed to ignore?  This is one advantage of RDF over XML.

Well, I disagree with you there, but let's take that discussion onto 
a different thread.

>
>So I'm arguing transcription/encoding is fine.   Which is all DAML
>does to my model-theory-blind eye.
>
>I see two real issues:
>
>1.  What if your transcribed NotAll expression says that one of the
>triples used to transcribe it is false?

As it does in the example above.

>  Actually that's not a
>problem; it's just a contradiction.

No, it is a problem. It shows that the RDFN meaning of a statement 
and the RDF meaning of the transcription of that statement can be in 
direct contradiction. That is not a contradiction, but an 
incompatibility between the languages.

>The NotAll reasoner will view the
>graph as not satisfiable (invalid), just as it would if some more
>normal triple had been contradicted.

But there is no contradiction here either in RDFN or in RDF. They 
just disagree.

>
>2.  The "and that's all" problem you mention above (which I also
>discussed earlier [2]) interacting with these layered logics.

Right, this seems to me to be the chief argument *against* the use of 
a simple graph relational model to transcribe syntax. Syntax really 
is intrinsically finite, recursive and ordered; it seems damn silly 
to pretend that it isn't, and then get involved trying to 'solve' all 
the 'problems' that result from this pretence, when they weren't 
problems in the first place.

Here's where McCarthy's old notion of abstract syntax really belongs, 
seems to me.  A syntactic construct has a syntactic class name and a 
finite structure of other syntactic constructs (which may be ordered 
or not, whatever; choose your algebra to suit) as immediate 
constituents.  The finite solutions of the resulting recursion are 
the expressions of the language. Simple, universal, elegant, 
efficient (assuming simple algebras). Easy to transcribe into XML. No 
theoretical problems. The only implementation issue is that the DPL 
has to learn about stacks.
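
As a minimal sketch of that idea in Python (the Node class, the to_xml function, and the element names are assumptions made for illustration, and syntactic class names are assumed to be valid XML element names):

```python
from dataclasses import dataclass
from typing import Tuple, Union
from xml.sax.saxutils import escape

@dataclass(frozen=True)
class Node:
    """A syntactic construct: a class name plus a finite tuple of constituents."""
    syntactic_class: str
    parts: Tuple[Union["Node", str], ...] = ()

def to_xml(root):
    """Transcribe an abstract-syntax tree into XML text using an explicit stack."""
    out, stack = [], [root]
    while stack:
        item = stack.pop()
        if isinstance(item, Node):
            out.append(f"<{item.syntactic_class}>")
            # Closing tag is pushed first so it pops after all the parts.
            stack.append(("close", item.syntactic_class))
            stack.extend(reversed(item.parts))
        elif isinstance(item, tuple):
            out.append(f"</{item[1]}>")
        else:
            out.append(escape(item))  # a leaf: plain text
    return "".join(out)

# a rdfn:notAll b1 c1;   as one finite, ordered expression
expr = Node("notAll", (
    Node("term", ("a",)),
    Node("pair", (Node("term", ("b1",)), Node("term", ("c1",)))),
))
print(to_xml(expr))
# <notAll><term>a</term><pair><term>b1</term><term>c1</term></pair></notAll>
```

Because the expression is a single finite structure, there is no question of whether all of its pieces have been found, and nothing in it is asserted on its own.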

>For
>instance, DAML is not encoded with "and that's all" encoding, so one
>can imagine a DAML reasoner inferring triples which affect its own
>reasoning, like making a daml:equivalentTo for the term daml:first.
>Can this result in anything worse than hard-to-find errors?  I don't
>know.  cwm does this for breakfast, but it's a forward chainer.
>
>When this concern makes me nervous, I want one or both of these
>restrictions, which are kind of the same thing:
>
>    a.  don't have a recognize-infer loop.   Turn your transcribed FOL
>        into internal FOL, and do inferences, but if you infer more
>        transcribed FOL, ignore it.   I don't see this as a big loss;
>        I'm still looking for where this would be useful.
>
>    b.  transcribe your logical expressions in a closed manner, so any
>        addition to your formula is a clear contradiction.   This may
>        not be possible.

I'm afraid you have lost me here. I am not following your terminology. 
(Suppose the logical expression is a disjunction?)

>>  >The only thing we give up is the idea that RDF, alone among
>>  >languages, is somehow anointed with the double crown of being
>>  >simultaneously a universal syntactic encoding language and, at the
>>  >same time, a universal semantic base for all assertional extensions.
>>  >But that was a damn silly idea anyway.
>
>Assuming we all know you can't build a NOT-gate out of AND-gates and
>OR-gates (but you can out of just NAND-gates, as you point out), yeah,
>it's damn silly to think RDF is "a universal semantic base for all
>assertional extensions" in that sense.  But it is a better base than
>XML because of its graph syntax (so you ignore/merge stuff) and its
>direct relational model (instead of XML's attributes and children,
>which might be ordered, and random other complications).

Again, I fail to follow why you think that graph syntax is better 
than tree syntax for describing syntactic structure (which is what 
XML is for, as I understand it). But let's take that onto a different 
thread.

But the main point is that comparing RDF to XML seems to me like 
comparing apples and oranges. If RDF is supposed to be playing the 
role of a kind of unordered version of XML, then RDF shouldn't have 
been presented to the world as a 'resource *description* format' in 
the first place. It has always been described as a simple assertional 
language, not as a syntax-description format. I am puzzled why 
pointing out that these are somewhat different roles that can't be 
done simultaneously - a fact which ought to be kind of obvious, I 
would think, once it is said clearly  -  should meet with such 
opposition.

>
>So what am I missing?  What makes this kind of
>transcription/recognition layering, as shown in DAML, not work?

I don't know how I can explain it more clearly.

>>  >Moral. RDF can be either an assertional language or a universal
>>  >graph-encoding language. But it can't be both at the same time.
>
>Why can't I say:
>       The sun is shining.
>       MSFT is down 3.75 points
>       All you FOL reasoners, "when the sun is shining, it's not raining."
>
>and from this, some processors will know it's not raining, and others
>won't.

Well, for a start, that doesn't follow from what you said, since you 
quoted it, so *no* valid reasoner should be able to infer it. But 
leaving that old use/mention thing aside for now, the other important 
point about this is that you specifically said, 'all you FOL 
reasoners'. But that 'labelling' of some part of what you are saying 
as being aimed specifically at some processors and not at others, is 
exactly what one cannot do in RDF.  RDF just has triples, and all 
triples are on the same level, and all triples are asserted. It 
doesn't have contexts, or scopes, or brackets, or any kind of 
recursive sub-sub-structure that could be used to encode this kind of 
distinction. It wouldn't take much to do it (see below), but the fact 
is that it doesn't have it right now.

>
>>  >Just so you don't think I'm just a naysayer: I don't think it will be
>>  >hard to jury-rig a quick fix that will enable work to go ahead. For
>>  >example, if we can introduce a simple distinction into RDF between
>>  >triples that assert and triples that encode then that will avoid the
>>  >immediate problem.  I've put a hook into the MT to do this. It's ugly
>>  >but it works. But if we just ignore this issue, then it is going to
>>  >rise up and bite us very soon, so we do need to do something.
>
>Is the hack just in the MT, or does it affect, say, N-Triples?

N-triples doesn't have any way to snag my hook right now. It wouldn't 
be hard to add it, eg suppose we allowed some triples to be 
terminated by a semicolon rather than a dot, and then the rule would 
be that only the dotted triples were understood to be asserted in 
RDF, and the others were being used for some other purpose (known 
only to some higher-level processor.) Like I said, it's ugly, kind of 
like a plumber using stop-leak, but it would stop the leaks.
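
A rough Python sketch of how a reader might honour such a rule. This is a hypothetical extension, not real N-Triples: the split_asserted function is made up, terms are assumed to be whitespace-free tokens (no quoted literals), and which triples get which terminator in the example is itself an assumption.

```python
def split_asserted(lines):
    """Separate dot-terminated (asserted) triples from semicolon-terminated
    (encoding-only) triples. Each line must be 's p o .' or 's p o ;'."""
    asserted, encoding = [], []
    for line in lines:
        tokens = line.split()
        if len(tokens) != 4 or tokens[3] not in (".", ";"):
            raise ValueError(f"not a terminated triple: {line!r}")
        s, p, o, end = tokens
        (asserted if end == "." else encoding).append((s, p, o))
    return asserted, encoding

doc = [
    "acme employed alice .",
    "a rdfn:notAll _:x ;",
    "_:x b1 c1 ;",
    "_:x b2 c2 ;",
]
asserted, encoding = split_asserted(doc)
print(asserted)  # [('acme', 'employed', 'alice')]
print(encoding)  # [('a', 'rdfn:notAll', '_:x'), ('_:x', 'b1', 'c1'), ('_:x', 'b2', 'c2')]
```

A plain RDF engine would draw conclusions only from the first list; the second would be handed untouched to whatever higher-level processor knows what to do with it.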

Pat


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Thursday, 10 January 2002 14:10:15 UTC