Re: RDF as a syntax for OWL (was Re: same-syntax extensions to RDF)

hi Bijan,

Bijan Parsia wrote:

> Jeen Broekstra <jeen@aduna.biz>
> 
>> Peter F. Patel-Schneider wrote:
>>
>>>  I am trying to determine just what is being demonstrated here.
>>>
>>>  Are you saying that you find it easy to build a complete parser
>>>  (i.e., translator) for OWL in RDF/XML?  Are you saying that you
>>>  find it easy to build an incomplete parser for OWL in RDF/XML?  Are
>>>  you saying that you find it easy to build a species validator for
>>>  OWL in RDF/XML?
>>
>>
>> I am somewhat hesitant to enter into this debate as I personally have
>> not as much experience with handling OWL data as you and Bijan
>> probably have, but the one thing that haunts me in this is that you
>> both seem to insist on *parsing* OWL in RDF/XML.
> 
> That's a false seeming.

[snip quotations]

> Clearly, I'm expecting the author in the second case to use a 
> triplestore or to parse to triples first. Not to would be too horrible 
> for screeds.

Fair enough. Peter indicated in his answer that what I proposed was more 
or less along the lines of what he actually implemented, as well.

> Also, if you read my posts, you'll see parsing is just the *simplest and 
> easiest to discuss* example. It's by no means exhaustive. It's by no means 
> insurmountable. But I submit that if you make *parsing* and *syntax 
> checking* hard, indeed, if you force people to give up ALL THE TOOLS 
> THERE ARE and have to write a bunch of code or tool chains from scratch, 
> all to accomplish what would be *trivial* otherwise, that you've shown 
> that RDF data is *not* "nice to work with". Certainly not for these tasks.

Then perhaps my problem with all this is that you seem to think that 
somehow users/ontology builders/application developers should solve 
these problems (and I agree, for them it is "not nice"). My impression 
was very much that for the kind of tasks you are talking about, 
users/ontology builders/application developers expect *ready made tools*.

The problem in my opinion is therefore not that the RDF model sucks for 
these tasks, but that there are no tools available to do it for you 
(actually, for at least some of the things you mentioned, these tools 
_are_ actually available of course).

>> And I can't help but
>> wonder if abandoning this approach for validation in favor of
>> *querying* (or using rules, for all I care) for 'well-formedness'
>> would make everything a bit easier.
> 
> 
> No, actually, as Peter showed.

I haven't actually seen that yet. But I'm willing to believe that it is 
still quite difficult.

> But put it another way, do you not refute yourself? You want to parse to 
> a store then *query* that store (multiple times; with code inbetween) 
> when you could have used a simple declarative grammar or transparent, 
> easy to understand code?

I do not see a contradiction here. Applying an RDF toolkit in this 
fashion is not rocket science. Taking the Sesame library as an example: 
creating a repository in-memory and uploading the triples to it is about 
5 lines of code. The actual query is 1 line. Processing the result is of 
course task-dependent, but not insurmountable.
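
To give an idea of the amount of code involved, here is a minimal sketch 
in Python. Note that this is not the Sesame API: the `TripleStore` class, 
its `add_all` and `match` methods, and the `ex:` names are hypothetical 
stand-ins for whatever in-memory repository and query facility your 
toolkit offers.

```python
# Toy in-memory triple store; illustrative only, NOT the Sesame API.
from typing import Iterable, Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TripleStore:
    """Hypothetical stand-in for an RDF toolkit's in-memory repository."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add_all(self, triples: Iterable[Triple]) -> None:
        """Upload a batch of triples to the store."""
        self._triples.update(triples)

    def match(self, s: Optional[str] = None, p: Optional[str] = None,
              o: Optional[str] = None) -> Iterator[Triple]:
        """Yield every triple matching the pattern; None is a wildcard."""
        for t in self._triples:
            if ((s is None or t[0] == s)
                    and (p is None or t[1] == p)
                    and (o is None or t[2] == o)):
                yield t


# Create the store and upload the triples.
store = TripleStore()
store.add_all([
    ("_:r", "rdf:type", "owl:Restriction"),
    ("_:r", "owl:onProperty", "ex:P"),
    ("_:r", "owl:someValuesFrom", "ex:C"),
])

# The actual "query": every node typed as an owl:Restriction.
restrictions = sorted(
    s for s, _, _ in store.match(p="rdf:type", o="owl:Restriction")
)
print(restrictions)
```

A real toolkit would add parsing from RDF/XML and a proper query 
language on top, but the shape of the client code is about the same.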

And I feel you're comparing apples and oranges. In using a declarative 
grammar you are applying a ready-made tool for that specific purpose: 
XML Schema validation only works if you apply a (duh...) XML Schema 
Validator toolkit. Validating an OWL ontology will require using an OWL 
validator. I'm not arguing that simply doing queries is enough; I'm 
arguing that using such tools for implementing a validator that others 
can use to actually validate is not unreasonable.

>> Perhaps I don't understand the parameters of the task at hand too
>> well, but I also have the feeling that perhaps you are not applying
>> the right tools to the job. Your quoted figures for an OWL parser from
>> RDF/XML seem to assume a DOM structure
> 
> 
> ? I would love to see a quote justifying this seeming. I think you see 
> straw where there is, in fact, iron.

I'm not following what you mean by this. My remark was aimed at Peter's 
email in which he presented an OWL parser. The RDF/XML version of that 
used a DOM structure, if I understood correctly.

>> but no RDF toolkit, i.e. you
>> are directly trying to construct OWL from the XML syntax (you mention
>> a 'nice internal data structure' but a graph data structure is not the
>> same as an RDF toolkit).
> 
> 
> I was not able to find this quote.

From Peter's message: "In summary, taking an RDF graph (totally parsed 
and in a nice internal data structure) [...]". Last paragraph.

> You're telling me that I should have an RDF toolkit, including a query 
> engine, to *parse and validate* what is, in the end, a fairly trivial 
> syntax?
> 
> I'm sorry, that just sounds *insane*. How is this making life *easier*.

It makes it easier in that this gives you direct control over the actual 
graph structure. I do not see why this should be insane.

Of course, the alternative is that you use an alternative representation 
of OWL and manipulate that (with whichever tools are good at 
manipulating that particular representation). Fine with me. But if you 
do this at the triple level, it does not seem unreasonable to me to use 
an RDF framework, which gives you all sorts of nice utilities and query 
languages to manipulate the triple set with.

>> My hunch is that actually _using_ the triples through an RDF API/query
>> language instead of trying to bypass it will make life easier (and no,
>> I'm not claiming that it is trivial or very easy, I merely have the
>> impression that it is not as fiendishly difficult as you make it out
>> to be).
> 
> 
> I'm sorry, you're wrong.

That is always possible, of course. Though I believe that you 
overestimate the number of use cases for which this holds.

> It's not impossible, of course. It's just much nastier than the 
> alternative.

Fine. Then use the alternative. No one is _forcing_ you to use triples; 
there are other representations for OWL, and they are all interchangeable.

> My first attempt was using SWI Prolog and DCGs. It cannot be sanely done 
> in normal DCG style, as far as I can see. You have to maintain tons of 
> state. You are tempted to plop queries in curly braces and then you 
> realize you've completely subverted the formalism! COMPLETELY! And then 
> you are afraid all your Prolog friends will think you're touched in the head.
> 
>> To take Bijan's example of checking that a class expression such as:
>>
>>      <owl:Restriction>
>>          <owl:onProperty rdf:resource="P"/>
>>          <owl:someValuesFrom rdf:resource="C"/>
>>      </owl:Restriction>
>>
>> is 'well-formed', i.e. is exactly formulated as such and has no extra
>> or missing triples, is simply a matter of doing some queries.
>>
>> construct distinct *
> 
> Construct constructs. So this is a cheat. Peter also pointed this out.

I don't understand why this would be a cheat. It is a valid RDF query 
that can be used in an RDF toolkit. Why is this a cheat?

> Plus, I've never seen construct distinct before. It's hardly widespread. 
> I seriously doubt that there is a production system available using it
> that's been remotely narrowly, much less widely, deployed.

Oh I don't know.

Or actually I do know. Sesame implements it. Has done so for about a 
year now (the CONSTRUCT clause was originally introduced in the SeRQL 
query language back then). I haven't counted the users who use this 
particular construction of course, but I've done a fair number of 
projects myself in which this is applied (mainly for graph 
transformations and such).

That aside, the notion of CONSTRUCT DISTINCT in this context is hardly 
the point, it was just an example.

>> from {R} rdf:type {owl:Restriction};
>>           owl:onProperty {Prop};
>>           owl:someValuesFrom {Val}
>>
>> retrieves a subgraph that you can check (using any RDF toolkit's
>> utility methods) quite easily for the existence/omission of triples.
> 
> This doesn't do the job. And if it did, it would still brutally suck 
> next to a schema.

You seem to forget that even a schema has to be implemented before it 
works.
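
To make the check on the retrieved subgraph concrete, here is a rough 
sketch in Python. It is an illustration under assumptions, not a full 
validator: the graph is a plain set of (subject, predicate, object) 
tuples, `check_restriction` is a hypothetical helper, and it only looks 
at triples that have the restriction node as their subject (it does not 
check whether other parts of the graph link to that node).

```python
from typing import List, Set, Tuple

Triple = Tuple[str, str, str]

# Predicates that a someValuesFrom restriction node must carry exactly once.
REQUIRED = ("owl:onProperty", "owl:someValuesFrom")
TYPE_TRIPLE = ("rdf:type", "owl:Restriction")


def check_restriction(graph: Set[Triple], node: str) -> List[str]:
    """Return a list of problems; an empty list means no problem was found."""
    problems: List[str] = []
    # All (predicate, object) pairs with `node` as subject, in a stable order.
    outgoing = sorted((p, o) for s, p, o in graph if s == node)

    if TYPE_TRIPLE not in outgoing:
        problems.append("not typed as owl:Restriction")

    # Missing or duplicated required triples.
    for pred in REQUIRED:
        count = sum(1 for p, _ in outgoing if p == pred)
        if count != 1:
            problems.append(f"expected exactly one {pred}, found {count}")

    # Extra triples that do not belong to the construct.
    for p, o in outgoing:
        if p not in REQUIRED and (p, o) != TYPE_TRIPLE:
            problems.append(f"unexpected triple: {p} {o}")

    return problems


good = {
    ("_:r", "rdf:type", "owl:Restriction"),
    ("_:r", "owl:onProperty", "ex:P"),
    ("_:r", "owl:someValuesFrom", "ex:C"),
}
bad = {
    ("_:r", "rdf:type", "owl:Restriction"),
    ("_:r", "owl:onProperty", "ex:P"),
    ("_:r", "rdfs:label", "oops"),
}

print(check_restriction(good, "_:r"))
for problem in check_restriction(bad, "_:r"):
    print(problem)
```

The returned list doubles as a crude form of error reporting; a real 
validator would of course need to cover every OWL construct, not just 
this one.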

> For example, what about error reporting? How about plugging into an 
> editor and enforcing correctness or autocompletion? You have to build, 
> likely yourself, an entire infrastructure that doesn't work with 
> anything else. Why?

This is nonsense. If such tools become more freely available then _that_ 
is what you use. For XML validation, no one implements his own XML Schema 
interpreter, right? Of course not, there are tools for that. Same for 
OWL. How are triples an issue here?

> Remember, we're not talking about the possible, we're talking about the 
> pleasant.

If your point is that reinventing this tool wheel every time is a pain, 
then yes, of course, I agree.

>> Granted, many query languages in the current set of RDF tools perhaps
>> still miss the expressiveness to make this as painless as it might be
>> (I'm thinking of explicit support for containers and collections,
>> here, which many tools still miss, and aggregation functions such as
>> count(), min(), etc.), but I still have the feeling this would be a
>> good approach.
> 
> 
> I respectfully submit that your feeling is totally wrong. Please, just 
> examine some of the code. It's *all* open source. It's all *easy* to 
> find. Why on earth are you speculating like this?

I'm speculating because I do not have the time to look at this code. I 
do know a little bit about OWL however, and I do know quite a bit about 
RDF frameworks and query languages, and what they can (and cannot) do.

>> If you have experience to the contrary, it would be interesting to
>> learn at what point you found the RDF toolkit/API/query language that
>> you worked with lacking.
> 
> 
> It's the wrong wrong wrong tool for the job. I wouldn't use a relational 
> database to parse C. Would you? *Why*? Why would you *even consider it*? 

Apples and oranges. I'm sorry but this metaphor just does not apply. 
Your complaint is that you can't manipulate OWL ontologies through RDF 
triples because it is a pain. I submitted that perhaps if you made 
better use of the capabilities of RDF frameworks it would be less of a 
pain. Regardless of whether that particular assertion turns out to be 
true or not, the link between triples and RDF frameworks should be 
obvious. I see no obvious link between the relational data model and C 
parsing.

> (There is the introspector project, but it has slightly different aims, 
> and I still think it's misguided. See: 
> http://introspector.sourceforge.net/)

Thanks, I'll have a look at that.

> And remember, parsing is only the start! What sandro wants is 
> impossible! What we got with owl was *super hard* (where there is a 
> much simpler alternative).

Ah. I jumped into the middle of this discussion, and have not actually 
considered Sandro's proposals (I was merely triggered by some of what 
you and Peter were saying). Sorry if that has led to topic drift.

> Plus, remember, your team doesn't *get to use* bigger structures! So you
> *can't* parse to some nice internal, OWL like structure (see owl api, 
> KRSS, logic toolkits), and then do your manipulations!

Why on earth not? That's what these tools are for! What exactly are you 
trying to prove then with this exercise?

> So my negation 
> normal form challenge stands. You must read from a triplestore and write 
> to a triplestore. You must handle arbitrary OWL Class expressions. For 
> the record, this is typically no more than a few dozen lines of code. 
> But I predict that it will be *nasty*.
> 
> For those who don't know what negation normal form is, well, first, I 
> believe you've demonstrated sufficient lack of experience building 
> semantic web tools that you start off in a bit of a hole

Umm... Right.

> , and second, 
> it's very simple. Remember that OWL (Lite too! Just we decided to
> torment generations by eliminating the explicit constructors!) has 
> negation. So let's take a very simple transformation, double negation:
> 
>     (not (not C)) <=> C
> 
> With negation normal form, you drive *ALL* the negations as "deep" into 
> the formula (note how this metaphor loses its force with a triple 
> approach :() as they can go, so that the only negations appear on class 
> names. So, things like
>     (not (and C D))
> become
>     (or (not C) (not D))
> 
> And so forth.
> 
> Hey, you don't have to write the function! Just explain as much as I 
> explained using triples alone, whatever your favorite syntax.
> 
> (And remember! you don't get to consider expressions in isolation like 
> that! After all, there could be a link!)
> 
> I hope this clarified things for you.

Not quite. Your exercise seems contrived to me; I'm not quite sure what 
you are out to prove. NNF converters are possible, but why would you 
want to do this at the triple level, and without using any of the tools 
that are _designed_ to work on top of that?
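
For reference, this is roughly what the transformation looks like on a 
nested expression structure rather than on raw triples. It is a minimal 
sketch in Python under stated assumptions: class expressions are 
represented as nested tuples (a representation chosen for this 
illustration, not taken from any OWL API), the function name `nnf` is 
hypothetical, and only `not`, `and` and `or` are handled (restrictions 
are left out).

```python
from typing import Tuple, Union

# A class expression is either a class name (str) or a tuple:
#   ("not", e), ("and", e1, e2, ...), ("or", e1, e2, ...)
Expr = Union[str, Tuple]


def nnf(expr: Expr) -> Expr:
    """Push negations inward until they apply only to class names."""
    if isinstance(expr, str):
        return expr
    op, *args = expr
    if op in ("and", "or"):
        return (op, *[nnf(a) for a in args])
    if op == "not":
        (inner,) = args
        if isinstance(inner, str):
            return expr  # negation on a class name: already in normal form
        inner_op, *inner_args = inner
        if inner_op == "not":
            # Double negation: (not (not C)) <=> C
            return nnf(inner_args[0])
        if inner_op in ("and", "or"):
            # De Morgan: swap and/or, negate each operand, keep normalising.
            dual = "or" if inner_op == "and" else "and"
            return (dual, *[nnf(("not", a)) for a in inner_args])
    raise ValueError(f"unsupported expression: {expr!r}")


print(nnf(("not", ("not", "C"))))
print(nnf(("not", ("and", "C", "D"))))
```

Getting from a triple set to such a structure and back is exactly the 
job I would expect a ready-made OWL tool to do for you.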

Jeen

Received on Wednesday, 5 January 2005 18:13:14 UTC