Re: RDF as a syntax for OWL (was Re: same-syntax extensions to RDF)

(To, I'm sure, the distress of all, I've subscribed to this list. So 
maybe my posts will start ending up well threaded :))

Hi Jeen!

Jeen Broekstra <jeen@aduna.biz>
> Peter F. Patel-Schneider wrote:
>
>>  I am trying to determine just what is being demonstrated here.
>>
>>  Are you saying that you find it easy to build a complete parser
>>  (i.e., translator) for OWL in RDF/XML?  Are you saying that you
>>  find it easy to build an incomplete parser for OWL in RDF/XML?  Are
>>  you saying that you find it easy to build a species validator for
>>  OWL in RDF/XML?
>
> I am somewhat hesitant to enter into this debate as I personally have
> not as much experience with handling OWL data as you and Bijan
> probably have, but the one thing that haunts me in this is that you
> both seem to insist on *parsing* OWL in RDF/XML.

That's a false seeming.

I quote myself: 
http://lists.w3.org/Archives/Public/www-rdf-logic/2005Jan/0006.html

"(Note that I've not even touched how painful it is for people I've
taught. We almost always end up falling back on standard logic syntax.
This is not Turtle vs. RDF/XML...it's not the awful xml serialization
alone, it's the relentless triplization.)"

And: http://lists.w3.org/Archives/Public/www-rdf-logic/2005Jan/0010.html

(Wow, I could almost quote the whole thing. But two choice bits.)

"So, write a species validator. Write a converter to and from the
abstract syntax (note which way is *much much much* easier!!!) Propose
an encoding of N3 in OWL Full that is a same-syntax semantic extension,
with model theory."

I have knowledge of at least three species validators, with a pretty 
good idea of how they were written (including contributing to one, 
seeing the code of another, and having read two or three discussions of 
the third). All of them generated triples first.

And:

"The two functions would be:

         def nnf(triplestore) #returning triplestore

and
         def nnf(term) #returning term

or, better,
         def nnf(RDF/XML) #returning RDF/XML but using a triplestore"

Clearly, I'm expecting the author in the second case to use a 
triplestore or to parse to triples first. Not to would be too horrible 
for screeds.

Also, if you read my posts, you'll see parsing is just the *simplest 
and easiest to discuss* example. It's by no means exhaustive. It's by no 
means insurmountable. But I submit that if you make *parsing* and 
*syntax checking* hard, indeed, if you force people to give up ALL THE 
TOOLS THERE ARE and have to write a bunch of code or tool chains from 
scratch, all to accomplish what would be *trivial* otherwise, that 
you've shown that RDF data is *not* "nice to work with". Certainly not 
for these tasks.

I never dreamed that it was the purpose of the semantic web to make W3C 
XML schema look good.

> And I can't help but
> wonder if abandoning this approach for validation in favor of
> *querying* (or using rules, for all I care) for 'well-formedness'
> would make everything a bit easier.

No, actually, as Peter showed.

But put it another way, do you not refute yourself? You want to parse 
to a store then *query* that store (multiple times; with code 
in between) when you could have used a simple declarative grammar or 
transparent, easy to understand code?

> Perhaps I don't understand the parameters of the task at hand too
> well, but I also have the feeling that perhaps you are not applying
> the right tools to the job. Your quoted figures for an OWL parser from
> RDF/XML seem to assume a DOM structure

? I would love to see a quote justifying this seeming. I think you see 
straw where there is, in fact, iron.

> but no RDF toolkit, i.e. you
> are directly trying to construct OWL from the XML syntax (you mention
> a 'nice internal data structure' but a graph data structure is not the
> same as an RDF toolkit).

I was not able to find this quote.

You're telling me that I should have an RDF toolkit, including a query 
engine, to *parse and validate* what is, in the end, a fairly trivial 
syntax?

I'm sorry, that just sounds *insane*. How is this making life *easier*?

> My hunch is that actually _using_ the triples through an RDF API/query
> language instead of trying to bypass it will make life easier (and no,
> I'm not claiming that it is trivial or very easy, I merely have the
> impression that it is not as fiendishly difficult as you make it out
> to be).

I'm sorry, you're wrong.

It's not impossible, of course. It's just much nastier than the 
alternative.

My first attempt was using SWI Prolog and DCGs. It cannot be sanely 
done in normal DCG style, as far as I can see. You have to maintain 
tons of state. You are tempted to plop queries in curly braces and then 
you realize you've completely subverted the formalism! COMPLETELY! And 
then you are afraid all your Prolog friends will think you're touched in 
the head.

> To take Bijan's example of checking that a class expression such as:
>
>      <owl:Restriction>
>          <owl:onProperty rdf:resource="P"/>
>          <owl:someValuesFrom rdf:resource="C"/>
>      </owl:Restriction>
>
> is 'well-formed', i.e. is exactly formulated as such and has no extra
> or missing triples, is simply a matter of doing some queries.
>
> construct distinct *

Construct constructs. So this is a cheat. Peter also pointed this out.

Plus, I've never seen construct distinct before. It's hardly 
widespread. I seriously doubt that there is a production system 
available using it that's been remotely narrowly, much less widely, 
deployed.

> from {R} rdf:type {owl:Restriction};
>           owl:onProperty {Prop};
>           owl:someValuesFrom {Val}
>
> retrieves a subgraph that you can check (using any RDF toolkit's
> utility methods) quite easily for the existence/omission of triples.

This doesn't do the job. And if it did, it would still brutally suck 
next to a schema.

For example, what about error reporting? How about plugging into an 
editor and enforcing correctness or autocompletion? You have to build, 
likely yourself, an entire infrastructure that doesn't work with 
anything else. Why?

Remember, we're not talking about the possible, we're talking about the 
pleasant.
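
To make the comparison concrete, here is a minimal sketch of what the
"check the subgraph for the existence/omission of triples" step looks
like when written by hand. It assumes the graph is just a Python
collection of (subject, predicate, object) string triples; the function
name check_some_restriction and the prefixed-name strings are
illustrative, not any toolkit's API. It only looks at triples whose
subject is the restriction node, so it says nothing about links pointing
at that node from elsewhere, and it is a simplification of the real
species rules.

```python
def check_some_restriction(triples, node):
    """Return a list of problems with `node` read as a someValuesFrom restriction.

    `triples` is any iterable of (subject, predicate, object) strings.
    An empty list means the node has exactly the three expected triples.
    """
    expected = ("rdf:type", "owl:onProperty", "owl:someValuesFrom")
    counts = {}
    problems = []
    for s, p, o in triples:
        if s != node:
            continue  # only triples hanging off the restriction node
        counts[p] = counts.get(p, 0) + 1
        if p not in expected:
            problems.append(f"unexpected triple: {s} {p} {o}")
        elif p == "rdf:type" and o != "owl:Restriction":
            problems.append(f"unexpected type: {o}")
    for p in expected:
        n = counts.get(p, 0)
        if n == 0:
            problems.append(f"missing {p}")
        elif n > 1:
            problems.append(f"duplicate {p}")
    return problems


good = [
    ("_:r", "rdf:type", "owl:Restriction"),
    ("_:r", "owl:onProperty", "P"),
    ("_:r", "owl:someValuesFrom", "C"),
]
# Swap the filler triple for a different restriction kind.
bad = good[:2] + [("_:r", "owl:allValuesFrom", "D")]

print(check_some_restriction(good, "_:r"))
print(check_some_restriction(bad, "_:r"))
```

Even this toy version has to invent its own error messages and its own
notion of "the triples that belong to this expression", which is the
infrastructure a grammar or schema would otherwise hand you.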

> Granted, many query languages in the current set of RDF tools perhaps
> still miss the expressiveness to make this as painless as it might be
> (I'm thinking of explicit support for containers and collections,
> here, which many tools still miss, and aggregation functions such as
> count(), min(), etc.), but I still have the feeling this would be a
> good approach.

I respectfully submit that your feeling is totally wrong. Please, just 
examine some of the code. It's *all* open source. It's all *easy* to 
find. Why on earth are you speculating like this?

> If you have experience to the contrary, it would be interesting to
> learn at what point you found the RDF toolkit/API/query language that
> you worked with lacking.

It's the wrong wrong wrong tool for the job. I wouldn't use a 
relational database to parse C. Would you? *Why*? Why would you *even 
consider it*? (There is the introspector project, but it has slightly 
different aims, and I still think it's misguided. See: 
http://introspector.sourceforge.net/)

And remember, parsing is only the start! What Sandro wants is 
impossible! What we got with OWL was *super hard* (where there is a 
much simpler alternative).

Plus, remember, your team doesn't *get to use* bigger structures! So 
you *can't* parse to some nice internal, OWL-like structure (see the OWL 
API, KRSS, logic toolkits), and then do your manipulations! So my 
negation normal form challenge stands. You must read from a triplestore 
and write to a triplestore. You must handle arbitrary OWL class 
expressions. For the record, over terms this is typically no more than a 
few dozen lines of code. But I predict that over triples it will be 
*nasty*.

For those who don't know what negation normal form is, well, first, I 
believe you've demonstrated sufficient lack of experience building 
semantic web tools that you start off in a bit of a hole, and second, 
it's very simple. Remember that OWL (Lite too! Just we decided to 
torment generations by eliminating the explicit constructors!) has 
negation. So let's take a very simple transformation, double negation:

	(not (not C)) <=> C

With negation normal form, you drive *ALL* the negations as "deep" into 
the formula (note how this metaphor loses its force with a triple 
approach :() as they can go, so that the only negations appear on class 
names. So, things like
	(not (and C D))
become
	(or (not C) (not D))

And so forth.
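
For comparison, here is a minimal sketch of the term-based version, the
"def nnf(term) #returning term" case. It assumes class expressions are
nested Python tuples such as ("not", ("and", "C", "D")) with plain
strings as class names; that encoding and the constructor names "some"
and "all" (for someValuesFrom and allValuesFrom) are illustrative
choices, not anything fixed by OWL's syntaxes. Cardinality restrictions
and the other constructors are left out.

```python
def nnf(term):
    """Push negations inward until they sit only on class names."""
    if isinstance(term, str):  # atomic class name
        return term
    op = term[0]
    if op in ("and", "or"):
        return (op,) + tuple(nnf(t) for t in term[1:])
    if op in ("some", "all"):  # (some P C) / (all P C)
        return (op, term[1], nnf(term[2]))
    if op == "not":
        inner = term[1]
        if isinstance(inner, str):
            return term  # negation already on a class name
        iop = inner[0]
        if iop == "not":  # (not (not C)) <=> C
            return nnf(inner[1])
        if iop in ("and", "or"):  # De Morgan: flip the connective
            dual = "or" if iop == "and" else "and"
            return (dual,) + tuple(nnf(("not", t)) for t in inner[1:])
        if iop in ("some", "all"):  # (not (some P C)) <=> (all P (not C))
            dual = "all" if iop == "some" else "some"
            return (dual, inner[1], nnf(("not", inner[2])))
    raise ValueError(f"unknown constructor: {term!r}")


print(nnf(("not", ("not", "C"))))
print(nnf(("not", ("and", "C", "D"))))
```

The whole thing is structural recursion on the shape of the term, which
is exactly what becomes awkward once the term is scattered across
triples.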

Hey, you don't have to write the function! Just explain as much as I 
explained using triples alone, whatever your favorite syntax.

(And remember! you don't get to consider expressions in isolation like 
that! After all, there could be a link!)

I hope this clarified things for you.

Cheers,
Bijan Parsia.

Received on Wednesday, 5 January 2005 14:59:53 UTC