Re: RDF as a syntax for OWL (was Re: same-syntax extensions to RDF)

Couldn't resist, and I had a few night hours left :)
*If* you want nnf to be your problem for a while,
then one can write N3 rules such as


{?C a :BaseClass} => {?C a :GivenClass}.
{?C a :BaseClass} => {?C :isRewrittenAs ?C}.
{(:tilde (:tilde ?C)) a :GivenClass. ?C a :GivenClass. ?C :isRewrittenAs 
?CC} => {(:tilde (:tilde ?C)) :isRewrittenAs ?CC}.
{(:tilde (:and ?C ?D)) a :GivenClass. ?C a :GivenClass. ?D a :GivenClass. 
?C :isRewrittenAs ?CC. ?D :isRewrittenAs ?DD} => {(:tilde (:and ?C ?D)) 
:isRewrittenAs (:or (:tilde ?CC) (:tilde ?DD))}.
{(:tilde (:or ?C ?D)) a :GivenClass. ?C a :GivenClass. ?D a :GivenClass. 
?C :isRewrittenAs ?CC. ?D :isRewrittenAs ?DD} => {(:tilde (:or ?C ?D)) 
:isRewrittenAs (:and (:tilde ?CC) (:tilde ?DD))}.
{(:tilde (:some ?P ?C)) a :GivenClass. ?C a :GivenClass. ?C :isRewrittenAs 
?CC} => {(:tilde (:some ?P ?C)) :isRewrittenAs (:all ?P (:tilde ?CC))}.
{(:tilde (:all ?P ?C)) a :GivenClass. ?C a :GivenClass. ?C :isRewrittenAs 
?CC} => {(:tilde (:all ?P ?C)) :isRewrittenAs (:some ?P (:tilde ?CC))}.

and then given

:A a :BaseClass.
:B a :BaseClass.
(:tilde (:and :A :B)) a :GivenClass.
(:tilde (:tilde (:tilde (:and :A :B)))) a :GivenClass.

one can ask

?C :isRewrittenAs ?D.

and get

# Generated with http://www.agfa.com/w3c/euler/ version R4104 on 7 Jan 2005 01:25:51 GMT
@prefix log: <http://www.w3.org/2000/10/swap/log#>.

(<file:/temp/bijan.n3>.log:semantics).log:conjunction =>
{
@prefix q: <http://www.w3.org/2004/ql#>.
@prefix e: <http://www.agfa.com/w3c/euler/log-rules#>.
@prefix xsd: <http://www.w3.org/2001/XMLSchema#>.
@prefix : <http://www.agfa.com/w3c/temp/bijan#>.
@prefix log: <http://www.w3.org/2000/10/swap/log#>.
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
@prefix math: <http://www.w3.org/2000/10/swap/math#>.
@prefix owl: <http://www.w3.org/2002/07/owl#>.
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>.

{{:A a :BaseClass} e:evidence <file:/temp/bijan.n3#_48>} => {
{:A :isRewrittenAs :A} e:evidence <file:/temp/bijan.n3#_41>}. 

{{:B a :BaseClass} e:evidence <file:/temp/bijan.n3#_49>} => {
{:B :isRewrittenAs :B} e:evidence <file:/temp/bijan.n3#_41>}. 

{{(:tilde (:tilde (:tilde (:and :A :B)))) a :GivenClass} e:evidence 
<file:/temp/bijan.n3#_51>. 
 {(:tilde (:and :A :B)) a :GivenClass} e:evidence 
<file:/temp/bijan.n3#_50>. 
 {{(:tilde (:and :A :B)) a :GivenClass} e:evidence 
<file:/temp/bijan.n3#_50>. 
  {{:A a :BaseClass} e:evidence <file:/temp/bijan.n3#_48>} => {
  {:A a :GivenClass} e:evidence <file:/temp/bijan.n3#_40>}. 
  {{:B a :BaseClass} e:evidence <file:/temp/bijan.n3#_49>} => {
  {:B a :GivenClass} e:evidence <file:/temp/bijan.n3#_40>}. 
  {{:A a :BaseClass} e:evidence <file:/temp/bijan.n3#_48>} => {
  {:A :isRewrittenAs :A} e:evidence <file:/temp/bijan.n3#_41>}. 
  {{:B a :BaseClass} e:evidence <file:/temp/bijan.n3#_49>} => {
  {:B :isRewrittenAs :B} e:evidence <file:/temp/bijan.n3#_41>}} => {
 {(:tilde (:and :A :B)) :isRewrittenAs (:or (:tilde :A) (:tilde :B))} 
e:evidence <file:/temp/bijan.n3#_43>}} => {
{(:tilde (:tilde (:tilde (:and :A :B)))) :isRewrittenAs (:or (:tilde :A) 
(:tilde :B))} e:evidence <file:/temp/bijan.n3#_42>}. 

{{(:tilde (:and :A :B)) a :GivenClass} e:evidence 
<file:/temp/bijan.n3#_50>. 
 {{:A a :BaseClass} e:evidence <file:/temp/bijan.n3#_48>} => {
 {:A a :GivenClass} e:evidence <file:/temp/bijan.n3#_40>}. 
 {{:B a :BaseClass} e:evidence <file:/temp/bijan.n3#_49>} => {
 {:B a :GivenClass} e:evidence <file:/temp/bijan.n3#_40>}. 
 {{:A a :BaseClass} e:evidence <file:/temp/bijan.n3#_48>} => {
 {:A :isRewrittenAs :A} e:evidence <file:/temp/bijan.n3#_41>}. 
 {{:B a :BaseClass} e:evidence <file:/temp/bijan.n3#_49>} => {
 {:B :isRewrittenAs :B} e:evidence <file:/temp/bijan.n3#_41>}} => {
{(:tilde (:and :A :B)) :isRewrittenAs (:or (:tilde :A) (:tilde :B))} 
e:evidence <file:/temp/bijan.n3#_43>}. 

# Proof found for file:/temp/bijanC.n3 in 145 steps (14485 steps/sec) using 1 engine (11 triples)
}.

So, e.g., the 3rd solution says that ~~~(A&B) is rewritten as ~A v ~B,
and all those () are actually RDF list triples.
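
To make that last remark concrete, here is a minimal Python sketch of how 
one nested () term unfolds into rdf:first / rdf:rest triples. It is not 
part of Euler; the helper name list_triples and the _:bN blank node 
labels are made up purely for illustration, and a Python tuple stands in 
for an N3 list.

```python
from itertools import count

RDF_FIRST, RDF_REST, RDF_NIL = "rdf:first", "rdf:rest", "rdf:nil"


def list_triples(term, triples, ids):
    """Return the node for `term`, appending its list triples to `triples`.

    A Python tuple stands for an N3 () list; anything else is an atom.
    The blank node labels are hypothetical, generated from `ids`.
    """
    if not isinstance(term, tuple):
        return term
    if not term:
        return RDF_NIL
    # One blank node per list cell.
    cells = [f"_:b{next(ids)}" for _ in term]
    for i, (cell, item) in enumerate(zip(cells, term)):
        triples.append((cell, RDF_FIRST, list_triples(item, triples, ids)))
        rest = cells[i + 1] if i + 1 < len(cells) else RDF_NIL
        triples.append((cell, RDF_REST, rest))
    return cells[0]


# (:tilde (:and :A :B)) a :GivenClass.
triples = []
head = list_triples((":tilde", (":and", ":A", ":B")), triples, count())
triples.append((head, "a", ":GivenClass"))
for t in triples:
    print(*t)
```

So that single N3 statement stands for 11 triples: two per list cell 
(five cells) plus the type triple.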


-- 
Jos De Roo, AGFA http://www.agfa.com/w3c/jdroo/




Bijan Parsia <bparsia@isr.umd.edu>
Sent by: www-rdf-logic-request@w3.org
06/01/2005 20:06

 
        To:     "Geoff Chappell" <geoff@sover.net>
        cc:     <www-rdf-logic@w3.org>, (bcc: Jos De_Roo/AMDUS/MOR/Agfa-NV/BE/BAYER)
        Subject:        Re: RDF as a syntax for OWL (was Re: same-syntax extensions to RDF)



Bravo! Good for you for taking on the challenge!

On Jan 6, 2005, at 12:07 PM, Geoff Chappell wrote:
[snip]
> Given the challenge, I had to give it a try in RDF Gateway's rule
> language ;-) The results don't really rebut the ugliness claim, but do
> demonstrate that it's doable in at least one of the available
> frameworks.

FWIW, I would never claim, and haven't claimed, that it *couldn't* be 
done. It has been done before, and it's pretty clear how to do it (as 
you demonstrate).

> BTW, I'm not denying this was a bit of a pain, nor in any way trying
> to be an advocate for the forcing of fol into rdf syntax.

Granted.

> I ended up with rulebase below. With it I could convert a graph to nnf
> form with a few lines - e.g:
>
>   var ds = new datasource("inet?parsetype=auto&url=c:/kill/nnftest.rdf");
>
>   select ?p ?s ?o using #ds rulebase nnf where {[rdf:type] ?c [owl:Class]}
>     and nnf(?c ?p ?s ?o);
>
> The rules may not be 100% though I tested them with a decent number of
> cases (but no pathological ones).

Sorry about the missing bits of the specification. It *was* 4 am or so 
:)

> rulebase nnf
> {
>                infer nnf(?cin, ?p, ?s, ?o) from nnf_pos(?cin, ?cout, ?p, ?s, ?o);
>
>                infer nnf_pos(?cin, ?cout, ?p, ?s, ?o) from isAnon(?cin)

Of course, and we might consider this an extension to the task: the 
expression could be named! Even if not named, consider the 
semantics...you can't smush bnodes that identify equivalent expressions, 
or expressions of exactly the same form. Every time you see one of 
these, there might be other triples floating around :(

Aliasing problems are much worse in RDF, I warrant.

Notice how the RDF representation imports your expressions into your 
*modeling* domain. It's not merely that you can *introspect* your 
expressions, they are actually there in your domain! Kinda scary.

(Sorry, that was a bit of an aside.)

[snip as things are evolving a bit faster than I'm writing :)]

Suppose we end up with an nnf function in this style that's complete 
and correct. Let's consider some of the effects of some of the choices.

First, for analysis of the program. The comparison class would be a 
program that uses a term-like syntax (as I've been calling it). For 
example,
                 <not><not>http://foo.com/A</not></not>

Roughly the same as
                 ~~A where A is an atomic name.

So, in both cases we need an XML parser, say a SAX one. In the term 
case, we can write the nnf directly on SAX events (I'm pretty sure; 
don't see why not; maybe I'll do the exercise). So, our dependencies 
are done. We need nothing else. It'll also be pretty close to maximally 
efficient, depending on whether we use infix or pre/post fix, because 
we can avoid lookahead. It's also fairly trivial to add some simple 
syntactic validation along the way, by hand, if we'd like. A lot of the 
RDF-legal but malformed constructions are just impossible in this 
syntax.
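
Here is a rough sketch of what "nnf directly on SAX events" might look 
like, in Python with the standard xml.sax module. It is only an 
illustration under some assumptions of mine: the element names and, or 
and atom are invented (the example above only shows not around bare 
text), atoms are always written out wrapped in atom, and quantifiers are 
left out. The only state is one polarity flag, flipped on every not, so 
no lookahead and no tree is needed.

```python
import xml.sax
from xml.sax.saxutils import escape

# Under a negation, <and> and <or> swap (De Morgan).
DUAL = {"and": "or", "or": "and"}


class NnfHandler(xml.sax.ContentHandler):
    """Emit the NNF of a term-like expression while SAX events arrive."""

    def __init__(self):
        super().__init__()
        self.out = []         # output fragments, in document order
        self.negated = False  # True under an odd number of <not>s
        self.text = []        # buffered character data of the current atom

    def flush_atom(self):
        atom = "".join(self.text).strip()
        self.text = []
        if atom:
            xml_atom = f"<atom>{escape(atom)}</atom>"
            self.out.append(f"<not>{xml_atom}</not>" if self.negated else xml_atom)

    def startElement(self, name, attrs):
        self.flush_atom()
        if name == "not":
            self.negated = not self.negated
        elif name in DUAL:
            self.out.append(f"<{DUAL[name] if self.negated else name}>")
        elif name != "atom":
            raise ValueError(f"unexpected element <{name}>")

    def endElement(self, name):
        self.flush_atom()
        if name == "not":
            self.negated = not self.negated
        elif name in DUAL:
            # Polarity here equals the polarity at the matching start tag.
            self.out.append(f"</{DUAL[name] if self.negated else name}>")

    def characters(self, content):
        self.text.append(content)


def nnf(xml_text):
    handler = NnfHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return "".join(handler.out)


print(nnf("<not><not>http://foo.com/A</not></not>"))
print(nnf("<not><and><atom>A</atom><not><atom>B</atom></not></and></not>"))
```

The second call is ~(A & ~B), which should come out as ~A v B, with the 
output written as each event is seen.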

The syntax is usefully compositional. I can embed such expressions in 
larger forms and my corresponding code for dealing with those 
expressions could be (likely) called from other code. Imagine an XSLT 
stylesheet for nnf. If I extend the syntax to handle subClassOf (i.e., 
limited conditionals), I should be able to transform the left and right 
hand sides using pretty much my old stylesheet. Writing test cases is 
easy as is checking them.

If I want to separate syntax checking from my transform, I could whip 
up a W3C XML Schema or RELAX NG schema. It's going to be fairly simple 
in this case. I can then use those schemas in a range of editors to 
assist me in generating correct documents.

In the Using A Big RDF Toolkit Oriented Case, my dependencies are much 
worse. I don't just need an RDF parser, but, according to this line of 
thinking, I need a Big Beefy RDF Toolkit with Query. These kits aren't 
small, and they aren't as ubiquitous (I'd much rather write just a SAX 
parser than both a SAX and an RDF parser...though it's not *that* 
terrible to write an RDF parser, it's just a waste in this case!) Ok, 
so I *have* to load the entire document into memory before I can do 
anything else! (The last triple might be relevant to the first triple!) 
(Or into a disk based database, but I trust the obvious worseness of 
that is sufficient to rebut it.) I either have to have indexed it a 
lot, or I must touch lots of triples with *each* query. If I want to be 
careful, I probably want to delete them as I consume them, and let's 
say I'll serialize or add to *another* triplestore as I go.

So, I've turned this into a two pass parser that uses queries over a 
store.

And testing! Whoa, bnodes galore. To be safe, I might want to use a 
graph equivalence test. Does my toolkit have it? No? Off to read 
Jeremy's paper. Partitioning. Ugh. Bleah. Wow, this hurts.
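
Just to show what the test has to do, here is a naive sketch in Python. 
This is emphatically not the algorithm from the paper; it simply tries 
every renaming of blank nodes, so it is factorial in the number of 
bnodes. Graphs are plain sets of string triples, and the "_:" prefix 
convention and the function names are my assumptions for the sketch.

```python
from itertools import permutations


def is_bnode(term):
    return isinstance(term, str) and term.startswith("_:")


def bnodes(graph):
    return sorted({t for triple in graph for t in triple if is_bnode(t)})


def equivalent(g1, g2):
    """True if some bijection of blank nodes maps g1 onto g2 (brute force)."""
    g1, g2 = set(g1), set(g2)
    b1, b2 = bnodes(g1), bnodes(g2)
    if len(g1) != len(g2) or len(b1) != len(b2):
        return False
    for perm in permutations(b2):
        mapping = dict(zip(b1, perm))
        renamed = {tuple(mapping.get(t, t) for t in triple) for triple in g1}
        if renamed == g2:
            return True
    return False


# The list (:A :B) written with two different sets of bnode labels.
expected = {
    ("_:x", "rdf:first", ":A"), ("_:x", "rdf:rest", "_:y"),
    ("_:y", "rdf:first", ":B"), ("_:y", "rdf:rest", "rdf:nil"),
}
actual = {
    ("_:q", "rdf:first", ":A"), ("_:q", "rdf:rest", "_:p"),
    ("_:p", "rdf:first", ":B"), ("_:p", "rdf:rest", "rdf:nil"),
}
print(equivalent(expected, actual))
```

Even a test case as small as a two-element list can't be checked with a 
plain set comparison; with a few dozen bnodes the brute force is 
hopeless, hence the partitioning.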

Let's loosen the requirements of the problem so I can use *other* 
features of the Big Beefy Toolkit, say, the OWL API of that toolkit. Oh 
wait, it's not clear that Sesame (http://www.openrdf.org:80/) or RDF 
Gateway 
(http://www.intellidimension.com/default.rsp?topic=/pages/site/products/rdfgateway.rsp) 
*have* such APIs! So if I use them, I'm back to writing my own parser 
from triples to OWL Abstract Syntax (or rather, to an API loosely based 
on that). I switch to Jena or the OWL API...at least they have that 
built in! But that parser is still there and it has a couple of 
choices. It can take the Big Query approach (but then why'd we 
switch?!?!?), or it can parse a stream of triples.

But note that we have to wait for it to process the entire file before 
we can try to nnf *anything* (that last triple!) And it has to keep all 
the objects it constructs "pending" until that last triple. (Actually, 
I don't know if any existing system actually does that, because it's 
pretty painful. Sean punted, I believe. We punted. Peter?)

Ok, we pay that price (remember, even if the converter is implemented, 
we still have to run it!), and now we're ready to write nnf...which 
will at this point, we hope, if we're lucky, be pretty similar to the 
term-like one. Except, if we've parsed to, oh, Java objects, we don't 
have a nice external representation. Sigh. We could generate the XML 
version and then use the SAX transformer...but wouldn't that have been 
easier to have done from the start?

Anyone care to do the double negation by hand in the RDF case? It's 
pretty horrible. In the term case it's perfectly reasonable. So if I'm 
trying to *explain* this to a developer or student, it's not hard at 
all in the one case, death on toast in the other.

Let me point out that parsing from the term representation to the 
triple one is likely to be *waaaaaaaay* easier (certainly not worse 
than a general RDF parser). So, should there be a task for which the 
tripley representation is better, I can get that pretty easily! So why 
is the tripley one the *interchange syntax*?!

And I reiterate, this is before considering the various semantic 
difficulties. I just presume I'm treating RDF as plain data. I'm 
assuming that my query language isn't trying to be sound and complete 
with respect to the RDF or RDFS or (especially) the owl semantics.

For the kind of heroics you have to go through on the relentless 
triple approach, you should have correspondingly heroic benefits. What 
do I get back in trading off simplicity, modularity, reusability, time, 
space, code size, and probably a few other things?

Argh, it's 4am again :) Stupid midnight telecons!

This is not graceful degradation in a corner case. This is falling flat 
on your face in an easy and easily generalizable case.

So, I've harped on this case. I do not believe this case is an anomaly. 
Just take OWL. The pain you see here was much worse there. And it 
really is needless pain! Is the semantic web so successful that it can 
easily afford the waste of time, energy, and confusion we see here? 
Consider just the PR problem of telling XML people that in order to do 
*anything* they need an entirely new set of beefy tools.

Note that, in a sense, RDF/XML syntax is only the start of the pain. 
Triples, binary assertions, are just not the right tool for many jobs. 
Indeed, often they are a very wrong tool, a non tool, an anti tool. The 
sooner the semantic web community grasps this (as a whole) the better 
off we'll be.

Cheers,
Bijan Parsia.

P.S., Kudos, again, to Geoff for taking action.

Received on Friday, 7 January 2005 01:36:53 UTC