Re: Reification as nesting from pat hayes on 2001-06-06 (www-rdf-logic@w3.org from June 2001)

From: pat hayes <phayes@ai.uwf.edu>
Date: Wed, 6 Jun 2001 15:44:29 -0500
To: Drew McDermott <drew.mcdermott@yale.edu>
Cc: www-rdf-logic@w3.org
Message-Id: <v04210144b74432634690@[205.160.76.219]>
>   [Pat Hayes]
>   ....I rather liked "nesting", which is
>   fairly free of mathematical/logical/linguistic baggage.
>
>My vote would be for "nesting" also.
>
>   There are two substantive things that nesting needs to be able to do.
>   First, it must provide a way to distinguish triples from assertions.
>   Some triples may be asserted, but the triples in a nesting aren't (at
>   least, not directly; something else might be able to infer them, or
>   something.).
>
>I've thought about this, following Jonathan Borden's proposals, and
>decided that it's a nonissue.  An expression, or set of triples, is
>asserted if someone asserts it (e.g., includes it at the top level of
>their web page).  In ordinary logical notation, you don't need to have
>a special system of marks to distinguish the "or" expression from the
>"and" expression in (or p (and q r)); the outer expression may or may
>not be asserted, depending on the context, but its being asserted
>doesn't imply that the inner and-expression is asserted.

I agree, but ordinary logical notation has the one thing that RDF 
conspicuously lacks, which is a nontrivial (recursive) syntax. RDF 
does not provide any notion of 'top level'; all its triples are at 
the same level. That is precisely the technical problem, seems to me: 
to find a way of encoding syntactic structure in triples (easy in 
itself) that also allows other triples to be asserted (needs a way to 
distinguish them from the structure-coding ones). The 'standard' RDF 
way of hiding unasserted content is to reify it and then refer to it, 
but you and I probably agree that that route has its problems, and I 
was suggesting a different way of doing it that tries to do as little 
harm as possible to current RDF practices and assumptions.

>
>   [quote pulled back from a later part of Pat's message on the same
>    topic:]
>   ... this idea ... would require some
>   work to reconstitute which triples in an arbitrary set of triples
>   were being asserted and which were not. That is, if some of this
>   'nested' RDF is simply rendered down into a set of isolated triples,
>   then the question of whether one of these rendered-down pieces is a
>   top-level (asserted) triple or not will depend on what other triples,
>   if any, are pointing to it.
>
>Again, I don't see it as a problem.

I also don't, but I have the impression from offline discussions with 
some W3C folk that it would be seen as a problem by some people. I 
just wanted to air the issue and see if it needed to be got onto the 
table. In any case there are lots of ways of fixing it if it is seen 
as a problem.

> First, if you render a nesting as
>a flat set of triples, you do not discard the nesting boundary,
>however represented.

True if you are sure of having all the triples. But if that boundary 
is implicit in the pointer structure, then you need to be sure that 
you have all the relevant triples before knowing which of them are 
asserted and which are not. I am happy with this, but I think it 
worries some of the RDF founders, and I can indeed see that it is an 
issue when you are collecting triples together in an open-ended way. 
It means that 'conjoining' more RDF information (i.e., adding more 
triples) is liable to alter the *syntax* of what you already have 
(and for example it might change an assertion into a mere disjunct, 
or even something that is negated), and avoiding just this kind of 
context-sensitivity is something that was, I believe, one of the 
design goals of RDF.

If one is thinking of the next layer up, of course, then this doesn't 
seem like an issue. One wouldn't expect a KIF parser, say, to go 
making wild assumptions before it had all the parts of the expression 
in place. But a KIF parser can tell when all the KIF pieces are in 
place and when they are not; whereas the humble RDF engine underneath 
may have no way to know if the next triple is an assertion for it, or 
should be passed to something higher. If it has to wait until it has 
only the leftover triple-scraps from all the higher tables, it isn't 
going to be able to do much by itself at all.

>In fact, the way I read the RDF spec is that you
>*have* to render any expression as a flat set of triples.  The change
>we are discussing is to amend this to "sets of triples with nesting
>boundaries."

My proposal was to encode the nesting boundaries in the triples 
themselves, much as LISP S-expressions are encoded in large 
collections of dotted pairs. (LISP doesn't use dotted pairs WITH 
nesting boundaries.) If the nesting is described by some other 
mechanism completely outside RDF, then of course it isn't RDF's 
problem.
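
To make the dotted-pair analogy concrete, here is a minimal Python sketch of one way nested triples could be rendered down into flat ones. The `Ref` class, the integer ids and the `flatten` function are all hypothetical names invented for this illustration; nothing in RDF prescribes them. Triples are written in the [V s o] order.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Ref:
    """Pointer to another triple in the flat store (hypothetical helper)."""
    id: int


def flatten(nested, store=None, ids=None):
    """Render a nested (verb, subject, object) tuple as flat triples.

    Returns (Ref to the topmost triple, store), where store maps an id
    to a triple whose parts are either atoms or Refs to other triples.
    """
    if store is None:
        store, ids = {}, count(1)
    verb, subj, obj = nested
    my_id = next(ids)  # ids are handed out top-down

    def encode(part):
        # A tuple is a sub-nest: flatten it and point to it instead.
        if isinstance(part, tuple):
            return flatten(part, store, ids)[0]
        return part

    store[my_id] = (verb, encode(subj), encode(obj))
    return Ref(my_id), store


# (forall (?x) (implies (R ?x) (Q ?x a b))) in the [V s o] style
kif = ("Kif:forall", "?x",
       ("Kif:implies", ("R", "?x", "."), ("Q", "?x", ("etc", "a", "b"))))
top, store = flatten(kif)
```

Every entry in `store` is an ordinary flat triple; the nesting survives only in the `Ref` pointers, just as list structure survives only in the pointers between dotted pairs.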

>
>Second, the existence of a pointer from some unknown place to a given
>nesting doesn't change anything, at least, not in a way we have to
>worry about.  When manipulating a nesting, an agent will always know
>the path by which it reached the nesting,

Not in my scheme. An RDF agent can pick up any of the triples and 
look at it, since from RDF's perspective any bag of triples is 'flat'. 
That triple might be pointed to from somewhere else, but (unless we 
have two-way pointers or IPL5-style circular pointers or some other 
piece of datastructuring cleverness under the hood) there is no way 
in general of telling that it is being pointed to, other than by 
finding all the potential pointers and checking them. In other words, 
something is going to have to do what amounts to a garbage-collection-style 
marking pass over the set of triples in order to find the nesting 
structure encoded in it, or even to distinguish the nested from the 
un-nested, i.e., the RDF-not-asserted from the RDF-asserted.
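
A rough Python sketch of such a marking pass follows, assuming a hypothetical flat store that maps ids to [V s o] triples and a hypothetical `Ref` pointer type; the label `Kif:not` is likewise made up for the example. It also shows why the answer can change when more triples arrive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Pointer to another triple in the flat store (hypothetical helper)."""
    id: int


def asserted_ids(store):
    """Return ids of triples that no other triple in the store points to."""
    pointed_to = set()
    for _verb, subj, obj in store.values():
        for part in (subj, obj):
            if isinstance(part, Ref):
                pointed_to.add(part.id)  # mark: this triple is nested
    return set(store) - pointed_to


store = {
    1: ("likes", "john", "mary"),
    2: ("R", "a", "b"),
}
before = asserted_ids(store)  # both triples look asserted

# One more triple arrives, and it happens to point at triple 2.
store[3] = ("Kif:not", Ref(2), ".")
after = asserted_ids(store)   # triple 2 is now nested, not asserted
```

The pass has to visit every triple before it can say anything about any one of them, which is exactly the cost being described.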

We could provide such a way, e.g., by requiring that all 
non-top-level-simple-RDF-assertion triples are distinctively marked 
(triples plus a marking bit). That would be enough for RDF to get on 
with its RDF business and leave the 'nest' triples to something else, 
at least, and be secure against making silly assertion errors that 
will need to be later corrected. (I was basically assuming that this 
was going to be done by some implicit encoding in the relation label, 
hence my 'fudging' you note below.)
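
As a rough illustration of the marking-bit idea, here is a Python sketch that assumes a hypothetical `MarkedTriple` record whose `nested` field plays the role of the bit. Which triples get the bit set is a design decision; this sketch simply assumes that everything to be left to a higher layer is marked.

```python
from typing import Any, NamedTuple


class MarkedTriple(NamedTuple):
    """A [V s o] triple plus one marking bit (hypothetical layout)."""
    verb: str
    subj: Any
    obj: Any
    nested: bool = False  # True: not a plain RDF assertion, leave it alone


def rdf_assertions(stream):
    """Yield the triples a plain RDF engine may treat as asserted.

    Each triple is judged on its own bit, so no pass over the whole
    set is needed and later arrivals cannot change earlier answers.
    """
    for triple in stream:
        if not triple.nested:
            yield triple


triples = [
    MarkedTriple("dc:creator", "doc1", "Pat"),
    MarkedTriple("R", "?x", ".", nested=True),
    MarkedTriple("Kif:implies", "t2", "t4", nested=True),
]
plain = list(rdf_assertions(triples))
```

The contrast with a global marking pass is that the decision here is local to each triple.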

>and so will know how to
>interpret the nesting in the context defined by that path.  E.g., if
>Tony Blair's web page has a pointer to the Labour Party's platform (or
>whatever they call a statement of principles in Britain), saying "This
>nesting is true," and you get to the platform from Blair's web page,
>then presumably what you care about is what it is Blair endorses.  The
>fact that there might be a pointer from the Conservative Party saying
>"Every (most? at least one?) item in this nesting is false" is of no
>particular relevance.  (Of course, there will always be agents who
>have wandered into a nesting from a useless direction, given their
>goals; but finding useful places on the web to wander from is not the
>issue here.)

Not sure I follow you here. You seem to be talking about pointing 
as in URL references, but I'm talking about something more internal to 
a single document.

>
>   Second, nesting needs to be recursive, so that one can
>   describe subexpressions. That is, a nest might have other nests
>   inside it.
>
>Yes.
>
>   Third, it must be possible to somehow label a nesting.
>
>I'm not sure I understand what "label" means here.

Give a name to, as when saying that a nesting is a KIF universal 
quantification, say. The chief RDF-ish reason this is needed is 
basically so that an RDF engine can tell where to send the thing to. 
If all that an RDF engine can do is to recognise a triple as not 
asserted, then there is going to be a rather nasty traffic jam 
between RDF and anything 'on top' of it. Labels can also be used by 
the other engines, of course, and they will probably want to use them 
for all kinds of purposes.
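
A small Python sketch of label-based routing is given below. It assumes, purely for illustration, that the label is the prefix before the colon in the verb and that there is exactly one higher engine, registered under "Kif"; the `received` table and `route` function are hypothetical.

```python
# Queues standing in for the engines; "rdf" is the default destination.
received = {"Kif": [], "rdf": []}


def route(triple):
    """Send a [V s o] triple to the engine named by its verb's prefix.

    Triples with no prefix, or with a prefix nobody has registered,
    stay with the plain RDF engine.
    """
    prefix, sep, _local = triple[0].partition(":")
    target = prefix if sep and prefix in received else "rdf"
    received[target].append(triple)
    return target


first = route(("Kif:forall", "?x", "t2"))
second = route(("likes", "john", "mary"))
third = route(("dc:creator", "doc1", "Pat"))
```

Without the label, all the RDF layer could do is put every non-asserted triple in a single queue and let the higher engines sort it out among themselves.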

>
>   Seems to me that all this can be done in one fairly simple way, by
>   allowing the subject and object of RDF triples to themselves be RDF
>   triples (not reifications of triples, but actual triples.) These
>   'inner' triples are not asserted, and the 'verb' of the triple that
>   points to them provides the needed labelling.
>
>...and the needed contextual information.

Provided you have enough sense to only go along the subsequent list 
by starting at the 'top'; but we cannot make that assumption at the 
RDF level, I suspect.

>
>   The distinction between
>   subject and object provides the distinction between subnesting and
>   nesting, much in the way that LISP uses CDRs to indicate list members
>   and encodes sublists in the CAR.
>
>Do you really want to differentiate subject and object?  I thought you
>were going to say that it's the distinction between an atom (a
>resource) and a subtriple that indicated nesting.
>
>   Writing triples as [V s o], (I know this isn't the usual way round)
>   the following piece of KIF
>   (forall (?x)(implies (R ?x)(Q ?x a b)))
>   might look like this:
>   [Kif:forall  ?x [Kif:implies [R ?x .] [Q ?x [etc a b]]]]
>
>I think we still have some work to do here.  By casually changing the
>order of subject-verb-object,

I am taking the oft-repeated claim, that 'true' RDF is the triples 
graph and the SVO XML syntax is simply one possible lexicalization, 
very seriously here. :-)

> you've obscured the fact that this
>expression really looks like this:
>
>   <?x, Kif:forall, <<?x, R, .>, Kif:implies, <?x, Q, <a, etc, b>>>>
>
>But a vanilla RDF processor might ask what ?x refers to.  You try to
>ward off this possibility:
>
>   I've inserted the "Kif:" in the spirit that it would be up to
>   something 'on top' of RDF to actually interpret these nestlings;
>   RDF's job is just to not think that they belong to it and are
>   actually being asserted.  The topmost triple of the nesting *is*
>   being asserted, but since its verb starts with "Kif:", RDF is warned
>   not to try to do anything with it, and so it would not attempt to
>   interpret it as a relation called "forall" applied to two arguments.
>
>but as RDF stands now it would have to assume at least that there were
>two entities being talked about, ?x and <<?x ...> ...>.  It's not at
>all clear what entity ?x might refer to.

Well, in fact it would be quite OK to have it refer to itself, in 
this case. But I agree, I am rather fudging this issue, and assuming 
a slightly more-than-vanilla processor; but then this issue is 
already often fudged in practice, particularly by the more 
enthusiastic RDF users. Let's draw a distinction between what RDF 
would have to assume and what the RDF spec says it ought to be 
assuming. I agree that the spec would need to be reworded somewhat to 
make this legal, but I think that it would be a relatively small and 
painless change. It is rather like importing a smidgeon of RDFS into 
RDF.

Another way to go would be to require that the 'top' (asserted) 
triple of any nest must have a special RDF-reserved label which marks 
it as being a nest or a context. I thought about this way of doing 
it, and we could go that way. It's uglier, IMHO, but it has the merit 
of making a sharp distinction between the triples that belong to 
RDF's logic and those that don't. However, there is a further step 
that I think we do not want to take, which would be to say that ANY 
assertion needs a top-level 'asserting' nest around it. That would 
really break the current RDF logical interpretation, rather than 
extend it.

>   where I have inserted a dummy dot to fill out the unwanted object of
>   the inner triple; these could be omitted by convention, of course.
>   'etc' means a continuation of whatever structure it occurs in, in
>   this case a relational sentence with more than two arguments. Again,
>   it would be natural to allow things like [... a b c d] as an
>   abbreviation for [... a [etc b [etc c d]]].
>
>I have a slight preference for having it abbreviate
>[... a [etc b [etc c [etc d nil]]]]
>but it doesn't make much difference.  It would if d could itself be an
>etc, but that's not possible, if I understand etc correctly.

Right, not possible. That was how I did it originally, being an 
ex-LISPer like yourself, but my way has the merit of leaving [R a b] 
alone, rather than having to write [R a [etc b nil]], and it will 
work just as well, I think.  One has to get used to a slightly 
different style of writing recursions, is all.  Notice you wouldn't 
write ... c [etc d .], since that would just be [... c d], so in this 
style there is *never* a marker at the end of a list.
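
The two abbreviation conventions can be compared in a short Python sketch. The function names `expand` and `expand_nil` are made up for this illustration, and triples are represented as (verb, subject, object) tuples in the [V s o] order.

```python
def expand(verb, *args):
    """Marker-free style: [V a b c d] -> [V a [etc b [etc c d]]]."""
    if not args:
        raise ValueError("a triple needs at least one argument")
    if len(args) == 1:
        return (verb, args[0], ".")   # dummy dot fills the unwanted object
    if len(args) == 2:
        return (verb, args[0], args[1])  # [R a b] is left alone
    return (verb, args[0], expand("etc", *args[1:]))


def expand_nil(verb, first, *rest):
    """Nil-terminated style: [V a b c d] -> [V a [etc b [etc c [etc d nil]]]]."""
    tail = "nil"
    for arg in reversed(rest):
        tail = ("etc", arg, tail)
    return (verb, first, tail)


short = expand("R", "a", "b")
long_form = expand("R", "a", "b", "c", "d")
short_nil = expand_nil("R", "a", "b")
long_nil = expand_nil("R", "a", "b", "c", "d")
```

In the marker-free style the two-argument case needs no rewriting at all, whereas the nil-terminated style turns even [R a b] into [R a [etc b nil]].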

Pat

---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Wednesday, 6 June 2001 16:44:29 UTC