Re: Converting SHOE to RDF: about 2/3 done; some gotchas from Sean Luke on 2000-05-13 (www-rdf-interest@w3.org from May 2000)

From: Sean Luke <seanl@cs.umd.edu>
Date: Sat, 13 May 2000 14:10:42 -0400 (EDT)
To: Stefan Decker <stefan@db.stanford.edu>
cc: www-rdf-interest@w3.org
Message-ID: <Pine.GSO.4.21.0005131312080.1589-100000@jifsan.cs.umd.edu>
Hey, Stefan!  I'm not going to be able to respond again -- gotta do some
heavy duty writing.  But here's some stuff.


On Sat, 13 May 2000, Stefan Decker wrote:

> But i want to define links between knowledge pieces and point to
> different servers on the web. I have to deal with glitches then, but i
> also have to deal with the same glitches now (eg. when i encounter a
> broken link in a HTML page).

Well, sure.  But if the correct understanding of the *semantics* of your
links depends on other web pages' statements, then when these pages go
down, you're in trouble.

It's one thing to say that pages will change, and that we have to live
with this.  It's another thing to use this as an argument for tying the
correct understanding of *my* data to the
correctness/availability/completeness of *your* data, especially when I
have no control over you and must merely "trust" you.  This is
unnecessary and we *don't* have to live with that.

As we've recently seen with the Microsoft/ILOVEYOU debacle, a trust-only
model is very weak.  Now in some situations a trust model, bad as it is,
is the only way to go.  But the distributed semantics situation is not one
of them.  RDF shouldn't have gone this route, and it's gonna bite us in
the rear in the future IMHO.

 
> >SHOE's model counters this in two ways, which Jeff is getting at I think.
> >First, it separates schema from data, which at least attempts to provide
> >*some* semblance of authority, or at least a well-defined semantic
> >language so we're very clear on what I intend to be inferred when I say
> >thing Foo.
> 
> But this is actually also possible for RDF. An RDF-Schema (or Sergey's
> UML-RDFSchema) can define any language for RDF. One is free to adopt a
> formal semantics for the language if this is useful - this is actually
> the way currently OIL goes (see http://www.ontoknowlege.org/OIL ). So
> you have the same separation of schema and data - with the freedom to
> define any Knowledge Representation Language on top of RDF - and i
> don't just count logic based formalisms as Knowledge Representation
> Languages.       ^^^^^^^^^^^^^^^^^^^^^^

Wow, that's really raising the bar high.  [In some sense this can be
parsed as "I don't count computationally tractable things as KR languages"  
:-) :-)]

Here's the distinction, I think.  Clearly when you make a statement in
RDF, you're making it with regard to a set of namespaces so that the
system can interpret it with regard to the symbols defined in those
namespaces.  But what RDF doesn't state is a contractual guarantee of some
sorts that when you say something with regard to a set of namespaces, that
the semantic meaning of your statements is limited to what can be inferred
directly from those namespaces alone.
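
To make that concrete, here is a minimal sketch in Python of what such a
guarantee might look like to an agent.  Everything in it is hypothetical:
the function `infer_types`, the schema names "univ-v1" and "other", and the
class names are invented for illustration, and neither RDF nor SHOE defines
this API.  The point is only that the agent draws inferences from the
schemas the author declared, and from nothing else.

```python
# Hypothetical illustration only: neither RDF nor SHOE defines this API.

def infer_types(instance_type, declared, all_schemas):
    """Return every class the instance belongs to, using ONLY the
    subclass rules of the schemas the author explicitly declared."""
    rules = {}
    for name in declared:
        # Rules from schemas that were not declared are never loaded.
        for sub, supers in all_schemas[name].items():
            rules.setdefault(sub, []).extend(supers)

    result, todo = set(), [instance_type]
    while todo:
        cls = todo.pop()
        if cls not in result:
            result.add(cls)
            todo.extend(rules.get(cls, ()))
    return result


# Each schema maps a class to its direct superclasses (made-up data).
ALL_SCHEMAS = {
    "univ-v1": {"Professor": ["Person"]},
    "other": {"Person": ["TaxPayer"]},  # a third party's schema
}

# What the author contractually meant: only "univ-v1" was declared.
limited = infer_types("Professor", ["univ-v1"], ALL_SCHEMAS)

# What an agent gets if it also chooses to trust the third-party schema.
everything = infer_types("Professor", ["univ-v1", "other"], ALL_SCHEMAS)
```

Under these assumptions `limited` contains only Professor and Person, while
`everything` also picks up TaxPayer, which the author never signed up for.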

Now, SHOE suffers from this problem in a different aspect.  While it
permits this guarantee with regard to RDF-like inferential stuff like
subcategorization, another schema might come along with more sophisticated
inferences (Horn clauses) and say all sorts of new inferred stuff about
your web page -- but an agent contractually understands that you didn't
mean this stuff to be inferred (though it might do so anyway if it trusts
the second schema).

Trust is an issue here -- the way SHOE is set up, schema are totally
different beasts from data, and the intention is that while data may be
legion, schema are generally fewer in number, versioned, and stable
(they're not even locked in with URIs).  The idea is that while it's fine
to say whatever you want, it's best if we agree on some basic central
rules about how to say it (whether this will in fact come to pass is still
up in the air :-).  But in RDF, schema and data are one and the same --
which I think tends to promote the idea of sticking minischema right in
with your data as you like (as was done in Jeff's example).  This is going
to create a much more fine-grained distribution of schema, which doesn't
feel like a great idea to me as I've mentioned before.


> What about Petri Nets or finite state machines (see again Sergey's UML
> model) (which are used to represent dynamic knowledge in a declarative
> way)? How would a Petri Net (and instances of Petri Nets) be defined
> in SHOE?

SHOE is at the level of RDF+RDFSchema.  I can't think of many semantic
features of RDF+RDFSchema that can't be expressed in SHOE.  But I'm not
arguing that RDF+RDFSchema isn't flexible enough (except for a lack of
n-ary relations and general inference, but...).  I'm arguing that it's
dangerously *too* flexible, at least with regard to treating schema as
first-class objects in a distributed environment.


> (RDF is not a Knowledge Representation system at all.
> RDF just provides a lightweight object model to define other
> languages on top of RDF - not necessarily typical KR languages.)

Since you've already set the bar so high :-), I guess we can't agree on
this point.  But my feeling on this is that if a language defines the
ability to state inferential rules of any kind, then it has stepped out of
the database realm and into the KR realm.  RDF+RDFSchema's subPropertyOf
and subClassOf features, by this definition, qualify it as a KR language,
however simple.
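
As a rough sketch of why those two features count as inference, here is a
small Python fixed-point loop over triples.  The predicate spellings
("type", "subClassOf", "subPropertyOf") are simplified stand-ins for the
RDF/RDFSchema terms, and the individuals, classes, and properties are made
up; this is an illustration of the idea, not an RDF implementation.

```python
# Simplified stand-ins for rdf:type, rdfs:subClassOf, rdfs:subPropertyOf.

def closure(triples):
    """Apply the two rules repeatedly until no new facts appear:
       (x type C), (C subClassOf D)   =>  (x type D)
       (x p y),    (p subPropertyOf q) =>  (x q y)"""
    facts = set(triples)
    while True:
        new = set()
        for (s, p, o) in facts:
            for (s2, p2, o2) in facts:
                if p == "type" and p2 == "subClassOf" and o == s2:
                    new.add((s, "type", o2))
                if p2 == "subPropertyOf" and p == s2:
                    new.add((s, o2, o))
        if new <= facts:  # nothing new was derived: fixed point reached
            return facts
        facts |= new


# Made-up data: what the page author actually stated.
stated = {
    ("alice", "type", "GradStudent"),
    ("GradStudent", "subClassOf", "Student"),
    ("Student", "subClassOf", "Person"),
    ("alice", "advisedBy", "bob"),
    ("advisedBy", "subPropertyOf", "knows"),
}

# Facts nobody wrote down, but which follow from the rules.
derived = closure(stated) - stated
```

Here `derived` holds three facts that appear nowhere in the stated data:
alice is a Student, alice is a Person, and alice knows bob.  A plain
database lookup over `stated` would return none of them.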

 
>  >stored internally and so can interpret statements in its system under the
>  >semantics of the full universe of facts at its disposal.  But I think that
>  >RDF does not take into consideration the uncomfortable fact that the web
>  >is not a sandbox.  People say anything they want, even trusted ones.  And
>  >they can stop saying these things at any time, and for any reason.
>  >Relying on the availability, correctness, and completeness of others'
>  >statements in order to put your own into context is pretty dangerous.

> Isn't it equally as dangerous to link to another HTML-page from
> your own page, since you can't guarantee that the other HTML-page will
> be available an hour later?

I don't think so.  Because if the remote HTML page is missing, the remote
agent *knows* this.  In this sense, HTML may not be complete, but at least
it's sound.  In RDF it's quite easy to create stuff that assumes reliance
on remote pages for semantics, but the agent _doesn't_ know this.  Put
another way, in a distributed realm it'd be tough to make a language that
had a guarantee of _completeness_ -- but RDF goes further: it's not even
sound.  This is unnecessary and unfortunate.

Sean
Received on Saturday, 13 May 2000 14:10:43 UTC