- From: Sean Luke <seanl@cs.umd.edu>
- Date: Wed, 22 Dec 1999 18:15:36 -0500 (EST)
- To: www-rdf-interest@w3.org
Hi everyone. My name is Sean Luke, and I am a member of the SHOE team at the University of Maryland. SHOE is an SGML- and XML-based knowledge-representation language with much the same design goals as RDF. Dan Brickley mentioned to me that there is some interest in getting RDF to do certain SHOE-like things (for example, n-ary relations and inferences), and has suggested that I post to this list with some current suggestions I have for RDF based on our research experience with SHOE. SHOE started about the same time as MCF; it predates RDF by a little bit. As such, many issues that RDF has dealt with recently have also been ones that the SHOE team grappled with a few years back. All things SHOE can be found at http://www.cs.umd.edu/projects/plus/SHOE/

So at Dan's suggestion, Jeff Heflin and I have spent a few days detailing various areas where we think RDF has weaknesses, some minor, some (IMHO) grievous, and where our design decisions might stir up some discussion about where RDF is going, if not make some positive contributions in its design. We've tallied a pretty big list of things to discuss (including a lot of little minor quibbles), but rather than just dump it all here I figure the right thing to do is post a topic or two at a time.

I thought a good place to start would be with SHOE's biggest departure from RDF, namely its general inferential semantics. But there are also a lot of other areas we think RDF could be improved, including versioning, n-ary relations, abandoning a frame-based approach, adding data types, and providing a better namespace mechanism. If anyone's interested in those, I'd be glad to post our ideas on them and see what you think.

Inferences and RDF
------------------

RDF's basic semantic design philosophy is different from SHOE's.
RDF seems to follow in the footsteps of frame languages and semantic networks; SHOE used to be a semantic network as well, but now tends to follow relational and logical languages: it provides n-ary relations and a general inference mechanism. Many of RDF's bigger special-case warts (hard-coded "container" objects, infinite sets of relations for numbering in containers, "aboutEach", "subProperty", etc.) stem from its lack of any basic general-purpose inferential rule mechanism. With even a very simplistic Horn clause mechanism, *all* of RDF's present special-case semantics (except its aboutEachPrefix feature) can be dealt with trivially.

So why is this a problem? I'd be willing to bet some serious money that as RDF progresses, the pressure to add more and more special-case warts to the language will grow, as users rapidly outgrow the present finite semantic usefulness of a language with no general-purpose inferential mechanism. RDF's semantics are presently pretty watered down, so much so that it is difficult (not impossible) to coherently argue why using RDF is better than just writing an application in XML for your schema and going with that.

In the evolution of SHOE we realized that to do anything really useful with the language, especially in a diverse environment with multiple schema that will need to be mapped to one another, we needed at least some basic semantics. The trick was doing this without increasing computational complexity beyond usefulness. We think we've done a fair job, but we'd be grateful for your opinions on it.

SHOE's present semantics are equivalent to Datalog without negation (not even stratified negation). No negation, no procedural attachment, no numerical operations, and only a single element in the consequent of the Horn clause.

Here's a *simple* example of transitivity over membership in binary relations, which as best as I can tell is not describable in RDF.
At any rate, if it is expressible in RDF, I'm sure we can come up with another that's not :-). Here we go:

Suppose someone has designed a "university-organization" schema. In it, one is able to say that they are a "member" of a given organization. Organizations are further able to say that they are "suborganizations" of other organizations. Now, I work for the PLUS laboratory, which is a suborganization of the Advanced Information Technology Laboratory, which is a suborganization of both UMIACS and the Computer Science Department, which are both suborganizations of U Maryland, which is a suborganization of the State of Maryland. It seems an obviously useful thing for an agent to discover that I'm an employee of the State of Maryland without me having to explicitly say all of these employment ground facts; I should merely have to say that I work for the PLUS lab.

The inferential statement is:

    member(?org2,?person) :- member(?org1,?person) ^ subOrganization(?org1,?org2)

This is read as: "?person is a member of ?org2 if ?person is a member of ?org1 and ?org1 is a suborganization of ?org2".

In a SHOE schema this looks like:

    <DEF-INFERENCE>
      <INF-IF>
        <RELATION NAME="member">
          <ARG POS=1 VALUE="org1" USAGE=VAR />
          <ARG POS=2 VALUE="per" USAGE=VAR />
        </RELATION>
        <RELATION NAME="subOrganization">
          <ARG POS=1 VALUE="org1" USAGE=VAR />
          <ARG POS=2 VALUE="org2" USAGE=VAR />
        </RELATION>
      </INF-IF>
      <INF-THEN>
        <RELATION NAME="member">
          <ARG POS=1 VALUE="org2" USAGE=VAR />
          <ARG POS=2 VALUE="per" USAGE=VAR />
        </RELATION>
      </INF-THEN>
    </DEF-INFERENCE>

Now, if I claim member(PLUS Lab, me), and the PLUS Lab had declared subOrganization(PLUS Lab, AITL), etc., then an agent should be able to easily determine that I work for the State of Maryland (or the CS Department, or whatever).

Of course, RDF does provide one or two special-case versions of transitivity (subProperty and subClass). But there are lots of examples, like the one above, that don't work with these special cases very well.
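
To make the example concrete, here is a minimal sketch (in Python, not SHOE) of how an agent might apply the membership rule by forward chaining until no new facts appear. It is only an illustration under stated assumptions: the function name infer_members, the tuple encoding of facts, and the individual "Sean" are all hypothetical, and none of this is how any actual SHOE tool is implemented.

```python
# Hypothetical ground facts, written as (org, person) and (sub_org, parent_org)
# pairs to match the argument order of the rule:
#   member(?org2,?person) :- member(?org1,?person) ^ subOrganization(?org1,?org2)
member = {("PLUS Lab", "Sean")}
sub_organization = {
    ("PLUS Lab", "AITL"),
    ("AITL", "UMIACS"),
    ("AITL", "CS Department"),
    ("UMIACS", "U Maryland"),
    ("CS Department", "U Maryland"),
    ("U Maryland", "State of Maryland"),
}


def infer_members(member, sub_organization):
    """Apply the membership rule repeatedly until a fixpoint is reached."""
    known = set(member)
    frontier = set(member)  # facts not yet used to derive anything
    while frontier:
        derived = {
            (org2, person)
            for (org1, person) in frontier
            for (sub, org2) in sub_organization
            if sub == org1
        }
        # Keep only genuinely new facts; stop when there are none.
        frontier = derived - known
        known |= frontier
    return known


all_members = infer_members(member, sub_organization)
print(("State of Maryland", "Sean") in all_members)  # True
```

Only one ground membership fact is stated; the other five memberships (AITL, UMIACS, the CS Department, U Maryland, and the State of Maryland) are derived. Because the rule has no negation, each pass can only add facts, so the loop is guaranteed to terminate on finite data.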

So there are of course a lot of places where inferencing comes in handy: reducing necessary ground statements (RDF presently doesn't provide basic general-purpose transitive closure, inversion, or transfers-through relationships), setting up mappings between schemas and also among versions, and extending the semantic meaning involved when a web page claims some fact is true.

The critical issue is of course how to implement such semantics without getting bogged down in a morass of computational complexity a la KIF, given the potential amount of data out there. Since they are mappable to Datalog, SHOE's inferential semantics are polynomial at worst, and we believe that they are limited enough to deploy on a large scale, especially in domains where ontologies are relatively disjoint from one another.

Still, I can think of some further restrictions that can be applied that would enforce an even more efficient inferential approach. For example, a schema X might declare itself to belong to one of three levels: one that does not provide inferences, one that disallows any inferences from outside schema including symbols declared in X, and one that permits full inferential capabilities. Revisions of schema (later versions) are permitted only to increase the semantic level or keep it the same, but not decrease it. Even more semantic expressivity might be added in this way (another level which permits stratified negation, for example).

Also, certain relations or simple (one-level, non-recursive) inferences might be declared Final, to indicate to agents that they should be simply flattened when gathered rather than inferred over and over. Certainly most if not all of RDF's current special-case inference mechanisms can be declared Final without a significant decrease in speed of data-gathering.

Sean
Received on Wednesday, 22 December 1999 18:15:51 UTC