- From: pat hayes <phayes@ai.uwf.edu>
- Date: Wed, 6 Jun 2001 15:44:29 -0500
- To: Drew McDermott <drew.mcdermott@yale.edu>
- Cc: www-rdf-logic@w3.org
>    [Pat Hayes]
>    ....I rather liked "nesting", which is
>    fairly free of mathematical/logical/linguistic baggage.
>
>My vote would be for "nesting" also.
>
>    There are two substantive things that nesting needs to be able to do.
>    First, it must provide a way to distinguish triples from assertions.
>    Some triples may be asserted, but the triples in a nesting aren't (at
>    least, not directly; something else might be able to infer them, or
>    something.).
>
>I've thought about this, following Jonathan Borden's proposals, and
>decided that it's a nonissue.  An expression, or set of triples, is
>asserted if someone asserts it (e.g., includes it at the top level of
>their web page).  In ordinary logical notation, you don't need to have
>a special system of marks to distinguish the "or" expression from the
>"and" expression in (or p (and q r)); the outer expression may or may
>not be asserted, depending on the context, but its being asserted
>doesn't imply that the inner and-expression is asserted.

I agree, but ordinary logical notation has the one thing that RDF
conspicuously lacks, which is a nontrivial (recursive) syntax. RDF
does not provide any notion of 'top level'; all its triples are at
the same level. That is precisely the technical problem, seems to me:
to find a way of encoding syntactic structure in triples (easy in
itself) that also allows other triples to be asserted (needs a way to
distinguish them from the structure-coding ones). The 'standard' RDF
way of hiding unasserted content is to reify it and then refer to it,
but you and I probably agree that that route has its problems, and I
was suggesting a different way of doing it that tries to do as little
harm as possible to current RDF practices and assumptions.

>
>    [quote pulled back from a later part of Pat's message on the same
>    topic:]
>    ... this idea ... would require some
>    work to reconstitute which triples in an arbitrary set of triples
>    were being asserted and which were not.
>    That is, if some of this
>    'nested' RDF is simply rendered down into a set of isolated triples,
>    then the question of whether one of these rendered-down pieces is a
>    top-level (asserted) triple or not will depend on what other triples,
>    if any, are pointing to it.
>
>Again, I don't see it as a problem.

I also don't, but I have the impression from offline discussions with
some W3C folk that it would be seen as a problem by some people. I
just wanted to air the issue and see if it needed to be got onto the
table. In any case there are lots of ways of fixing it if it is seen
as a problem.

>First, if you render a nesting as
>a flat set of triples, you do not discard the nesting boundary,
>however represented.

True if you are sure of having all the triples. But if that boundary
is implicit in the pointer structure, then you need to be sure that
you have all the relevant triples before knowing which of them are
asserted and which are not. I am happy with this, but I think it
worries some of the RDF founders, and I can indeed see that it is an
issue when you are collecting triples together in an open-ended way.
It means that 'conjoining' more RDF information (i.e. adding more
triples) is liable to alter the *syntax* of what you already have (and
for example it might change an assertion into a mere disjunct, or even
something that is negated), and avoiding just this kind of
context-sensitivity is something that was, I believe, one of the
design goals of RDF.

If one is thinking of the next layer up, of course, then this doesn't
seem like an issue. One wouldn't expect a KIF parser, say, to go
making wild assumptions before it had all the parts of the expression
in place. But a KIF parser can tell when all the KIF pieces are in
place and when they are not; whereas the humble RDF engine underneath
may have no way to know if the next triple is an assertion for it, or
should be passed to something higher.
If it has to wait until it has only the leftover triple-scraps from
all the higher tables, it isn't going to be able to do much by itself
at all.

>In fact, the way I read the RDF spec is that you
>*have* to render any expression as a flat set of triples.  The change
>we are discussing is to amend this to "sets of triples with nesting
>boundaries."

My proposal was to encode the nesting boundaries in the triples
themselves, much as LISP S-expressions are encoded in large
collections of dotted pairs. (LISP doesn't use dotted pairs WITH
nesting boundaries.) If the nesting is described by some other
mechanism completely outside RDF, then of course it isn't RDF's
problem.

>
>Second, the existence of a pointer from some unknown place to a given
>nesting doesn't change anything, at least, not in a way we have to
>worry about.  When manipulating a nesting, an agent will always know
>the path by which it reached the nesting,

Not in my scheme. An RDF agent can pick up any of the triples and look
at it, since from RDF's perspective any bag of triples is 'flat'. That
triple might be pointed to from somewhere else, but (unless we have
two-way pointers or IPL5-style circular pointers or some other piece
of datastructuring cleverness under the hood) there is no way in
general of telling that it is being pointed to, other than by finding
all the potential pointers and checking them. In other words,
something is going to have to do what amounts to a garbage-check
marking pass over the set of triples in order to find the nesting
structure encoded in it, or even to distinguish the nested from the
un-nested, i.e. the RDF-not-asserted from the RDF-asserted. We could
provide such a way, e.g. by requiring that all
non-top-level-simple-RDF-assertion triples are distinctively marked
(triples plus a marking bit).
That would be enough for RDF to get on with its RDF business and leave
the 'nest' triples to something else, at least, and be secure against
making silly assertion errors that will need to be later corrected.
(I was basically assuming that this was going to be done by some
implicit encoding in the relation label, hence my 'fudging' you note
below.)

>and so will know how to
>interpret the nesting in the context defined by that path.  E.g., if
>Tony Blair's web page has a pointer to the Labor Party's platform (or
>whatever they call a statement of principles in Britain), saying "This
>nesting is true," and you get to the platform from Blair's web page,
>then presumably what you care about is what it is Blair endorses.  The
>fact that there might be a pointer from the Conservative Party saying
>"Every (most? at least one?) item in this nesting is false" is of no
>particular relevance.  (Of course, there will always be agents who
>have wandered into a nesting from a useless direction, given their
>goals; but finding useful places on the web to wander from is not the
>issue here.)

Not sure I follow you here. You seem to be talking about pointing as
in URL references, but I'm talking about something more internal to a
single document.

>
>    Second, nesting needs to be recursive, so that one can
>    describe subexpressions. That is, a nest might have other nests
>    inside it.
>
>Yes.
>
>    Third, it must be possible to somehow label a nesting.
>
>I'm not sure I understand what "label" means here.

Give a name to, as when saying that a nesting is a KIF universal
quantification, say. The chief RDF-ish reason this is needed is
basically so that an RDF engine can tell where to send the thing to.
If all that an RDF engine can do is to recognise a triple as not
asserted, then there is going to be a rather nasty traffic jam between
RDF and anything 'on top' of it. Labels can also be used by the other
engines, of course, and they will probably want to use them for all
kinds of purposes.
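
To make the earlier point about the marking pass concrete, here is a
minimal Python sketch. It assumes a hypothetical flat encoding that is
not part of any RDF spec: each triple in the bag has an id, triples are
written (verb, subject, object) as in [V s o], and a subject or object
slot holds either an atom or a Ref to another triple's id. A triple
counts as asserted only if nothing else in the bag points to it.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple, Union


@dataclass(frozen=True)
class Ref:
    """Pointer to another triple in the same bag (hypothetical encoding)."""
    target: str


Term = Union[str, Ref]
Triple = Tuple[str, Term, Term]  # (verb, subject, object), as in [V s o]


def asserted_ids(bag: Dict[str, Triple]) -> Set[str]:
    """Marking pass: collect every triple id that some triple points to,
    then treat the unmarked remainder as top-level (asserted)."""
    pointed_to: Set[str] = set()
    for _verb, subj, obj in bag.values():
        for term in (subj, obj):
            if isinstance(term, Ref):
                pointed_to.add(term.target)
    return set(bag) - pointed_to


# (forall (?x) (implies (R ?x) (Q ?x a b))) plus one ordinary assertion.
bag: Dict[str, Triple] = {
    "t1": ("Kif:forall", "?x", Ref("t2")),
    "t2": ("Kif:implies", Ref("t3"), Ref("t4")),
    "t3": ("R", "?x", "."),
    "t4": ("Q", "?x", Ref("t5")),
    "t5": ("etc", "a", "b"),
    "t6": ("likes", "pat", "lisp"),
}

# The same bag with the two outermost triples not yet collected.
partial = {k: v for k, v in bag.items() if k not in ("t1", "t2")}

print(sorted(asserted_ids(bag)))      # ['t1', 't6']
print(sorted(asserted_ids(partial)))  # ['t3', 't4', 't6']
```

The second result is the worry in miniature: with t1 and t2 missing,
the inner triples t3 and t4 look asserted, and adding more triples
later changes that verdict.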
>
>    Seems to me that all this can be done in one fairly simple way, by
>    allowing the subject and object of RDF triples to themselves be RDF
>    triples (not reifications of triples, but actual triples.) These
>    'inner' triples are not asserted, and the 'verb' of the triple that
>    points to them provides the needed labelling.
>
>...and the needed contextual information.

Provided you have enough sense to only go along the subsequent list by
starting at the 'top'; but we cannot make that assumption at the RDF
level, I suspect.

>
>    The distinction between
>    subject and object provides the distinction between subnesting and
>    nesting, much in the way that LISP uses CDRs to indicate list members
>    and encodes sublists in the CAR.
>
>Do you really want to differentiate subject and object?  I thought you
>were going to say that it's the distinction between an atom (a
>resource) and a subtriple that indicated nesting.
>
>    Writing triples as [V s o], (I know this isn't the usual way round)
>    the following piece of KIF
>    (forall (?x)(implies (R ?x)(Q ?x a b)))
>    might look like this:
>    [Kif:forall ?x [Kif:implies [R ?x .] [Q ?x [etc a b]]]]
>
>I think we still have some work to do here.  By casually changing the
>order of subject-verb-object,

I am taking the oft-repeated claim, that 'true' RDF is the triples
graph and the SVO XML syntax is simply one possible lexicalization,
very seriously here. :-)

>you've obscured the fact that this
>expression really looks like this:
>
>   <?x, Kif:forall, <<?x, R, .>, Kif:implies, <?x, Q, <a, etc, b>>>>
>
>But a vanilla RDF processor might ask what ?x refers to.  You try to
>ward off this possibility:
>
>    I've inserted the "Kif:" in the spirit that it would be up to
>    something 'on top' of RDF to actually interpret these nestlings;
>    RDF's job is just to not think that they belong to it and are
>    actually being asserted.
>    The topmost triple of the nesting *is*
>    being asserted, but since its verb starts with "Kif:", RDF is warned
>    not to try to do anything with it, and so it would not attempt to
>    interpret it as a relation called "forall" applied to two arguments.
>
>but as RDF stands now it would have to assume at least that there were
>two entities being talked about, ?x and <<?x ...> ...>.  It's not at
>all clear what entity ?x might refer to.

Well, in fact it would be quite OK to have it refer to itself, in this
case. But I agree, I am rather fudging this issue, and assuming a
slightly more-than-vanilla processor; but then this issue is already
often fudged in practice, particularly by the more enthusiastic RDF
users. Let's draw a distinction between what RDF would have to assume
and what the RDF spec says it ought to be assuming. I agree that the
spec would need to be reworded somewhat to make this legal, but I
think that it would be a relatively small and painless change. It is
rather like importing a smidgeon of rdfs into rdf.

Another way to go would be to require that the 'top' (asserted) triple
of any nest must have a special RDF-reserved label which marks it as
being a nest or a context. I thought about this way of doing it, and
we could go that way. It's uglier, IMHO, but it has the merit of
making a sharp distinction between the triples that belong to RDF's
logic and those that don't. However, there is a further step that I
think we do not want to take, which would be to say that ANY assertion
needs a top-level 'asserting' nest around it. That would really break
the current RDF logical interpretation, rather than extend it.

>    where I have inserted a dummy dot to fill out the unwanted object of
>    the inner triple; these could be omitted by convention, of course.
>    'etc' means a continuation of whatever structure it occurs in, in
>    this case a relational sentence with more than two arguments. Again,
>    it would be natural to allow things like [... a b c d] as an
>    abbreviation for [... a [etc b [etc c d]]].
>
>I have a slight preference for having it abbreviate
>[... a [etc b [etc c [etc d nil]]]]
>but it doesn't make much difference.  It would if d could itself be an
>etc, but that's not possible, if I understand etc correctly.

Right, not possible. That was how I did it originally, being an
ex-LISPer like yourself, but my way has the merit of leaving [R a b]
alone, rather than having to write [R a [etc b nil]], and it will work
just as well, I think. One has to get used to a slightly different
style of writing recursions, is all. Notice you wouldn't write
[... c [etc d .]] since that would just be [... c d], so in this style
there is *never* a marker at the end of a list.

Pat

---------------------------------------------------------------------
IHMC                                    (850)434 8903   home
40 South Alcaniz St.                    (850)202 4416   office
Pensacola,  FL 32501                    (850)202 4440   fax
phayes@ai.uwf.edu
http://www.coginst.uwf.edu/~phayes
Received on Wednesday, 6 June 2001 16:44:29 UTC