Re: owl.owl (was: OWL 2 -- Call for Implementations, new Drafts) from Bijan Parsia on 2009-07-22 (public-owl-dev@w3.org from July to September 2009)

From: Bijan Parsia <bparsia@cs.man.ac.uk>
Date: Wed, 22 Jul 2009 01:00:21 +0100
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-owl-dev@w3.org
Message-Id: <60E038E3-79A4-407E-A89C-F7AC71A68A9C@cs.man.ac.uk>

On 21 Jul 2009, at 23:22, Holger Knublauch wrote:

> Sorry for picking up this old thread. There have been some other  
> messages on the OWL comments mailing list (see [1]), and Bijan  
> suggested [2] I return to this list. This is mainly a response to  
> Bijan's message.
>
> The topic is whether there should be an owl.owl file with RDF  
> triples for the OWL (2) vocabularies, maintained by the OWL working  
> group. I suggest doing this, and I am glad that my request is being  
> (albeit, by some, reluctantly) moved forward.
>
>
> Bijan Parsia wrote:
> > If you follow the link I in:
> >	http://lists.w3.org/Archives/Public/public-owl-wg/2009Jul/0030.html
> > You'll see that the idea of importing SWRL.owl is exactly what I  
> argue
> > against, so it's a bit odd to appeal to it as an exemplar.
>
> I have been reading your message [3] but don't think it contains any  
> arguments apart from that importing SWRL would make ontologies  
> become OWL Full.

I don't mean that reference to point to all the arguments I would  
mobilize against this practice, just as an indicator that this is an  
opinion of long standing.

In that particular thread, you'll see that people who depend on a  
(static) OWL file at the w3c are rather stuck. This is, itself, a  
problem, esp. when

> The other arguments just seem to repeat that importing SWRL is a bad  
> idea because you don't think this is a good practice.

It's bad practice because (not exhaustive):
	1) It pollutes the user defined vocabulary with the logical  
vocabulary which has many negative ramifications: e.g., it clutters  
the interface (sometimes severely); it can break reasoners and other  
manipulators (which can, of course, be solved by recognizing and  
filtering them out, but, uh, that means special case code anyway);  
there can be interactions between my domain modelling and the  
"language" modeling (even at simple levels, for example, do I want my  
ontology stats to include individuals or classes from owl.owl?)

	2) It technically makes every document OWL Full in fairly strong  
ways. Yeah, I think it's a problem that if I'm not, myself, using OWL  
Full functionality that I'm forced into that extra expressivity. I  
want to pay only for what I use (as a modeller)

	3) owl.owl cannot provide a reasonably complete description of even  
the syntax of OWL, even if it uses all of OWL Full. At least, not in  
any sane way; certainly not how it is now. OWL is just not tuned for  
describing syntax constraints (contrast with XML Schema). Nothing  
wrong with that, but right tool for the job, etc.
	People get confused about that. I don't think we should encourage  
that confusion.

	4) It conflates specification with implementation. This is a problem  
I have with GRDDL (and even with using schema languages for  
specification, although that's harder to resist).

	5) It (tends to) confuse syntax with semantics. Which I think is bad  
for users and newbie implementors.

	6) Generic editing of such things tends to be very bad experience for  
users, especially in GUIs. Editing swrl statements as abox stuff is a  
freaking nightmare in a Protege like interface.

Fortunately, it doesn't seem to be popular anymore. I see far fewer  
people advocating it than I did back in the early 2000s.

> Importing SWRL shares some of the advantages of importing the OWL  
> namespace. For example, users of editing tools can see the  
> definitions of the SWRL built-ins, use auto-complete, tool tip texts  
> and any other infrastructure that they will expect.

I accept that in your infrastructure you are set up so this works out  
ok. This makes owl.owl useful for you. But I imagine (and hope) that  
you are somewhat careful about built in vocabulary *anyway*.

For example, in swoop, at one point we just let people follow  
hyperlinks to the owl and rdf vocabulary. This triggered swoop to load  
those files. Almost always, that wasn't what was wanted (now there's  
three ontologies loaded when you really just are working on one). So  
we special cased all the builtin vocabulary to display help text.  
Human help text (derived from the owl reference).
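
A minimal sketch of that kind of special-casing, written in Python rather than taken from Swoop's actual code. The names `BUILTIN_NAMESPACES`, `HELP_TEXT`, and `follow_link` are hypothetical, and the help strings are placeholders for text a real tool would derive from the OWL reference.

```python
# Hypothetical sketch: namespaces whose terms count as built-in vocabulary.
BUILTIN_NAMESPACES = (
    "http://www.w3.org/2002/07/owl#",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "http://www.w3.org/2000/01/rdf-schema#",
)

# Placeholder help strings (assumption: a real tool would ship fuller text).
HELP_TEXT = {
    "http://www.w3.org/2002/07/owl#Class": "The class of OWL classes.",
    "http://www.w3.org/2000/01/rdf-schema#subClassOf":
        "Relates a class to one of its superclasses.",
}


def follow_link(iri, load_ontology):
    """Show help for built-in terms; load an ontology only for other IRIs.

    `load_ontology` is whatever callable the tool uses to fetch and open
    an ontology; it is never invoked for built-in vocabulary.
    """
    if iri.startswith(BUILTIN_NAMESPACES):
        return HELP_TEXT.get(iri, "Built-in vocabulary term (no help text yet).")
    return load_ontology(iri)
```

With this in place, clicking owl:Class yields a help string and the workspace still holds only the ontology being edited. The point of the sketch is that the branch on `BUILTIN_NAMESPACES` is special-case code either way.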

Built-in vocabulary is special.

> Whether this makes the files OWL Full is from my point of view  
> completely irrelevant.

Because you already special case it in certain ways. Which is fine.  
But it doesn't make it fundamentally different from other  
implementation approaches.

But it's not just OWL Fullness. For example, having the vocabulary  
imported along with its axioms would inherently demodularize the ontology.  
(Esp. for purely syntactic modularization algorithms.) Justifications  
would become way more complicated. Etc.

It's not just reasoning. And it's not just OWL Fullness per se. I can  
have a perfectly modular OWL Full file.

> If inference engines have problems with that, then they can happily  
> ignore those triples and imports.

So they have to special case it.

> This is easily supported by APIs such as Jena, because you just need  
> to remove a sub-Graph from a UnionGraph. But IMHO it is cleaner to  
> operate on well-defined terms that are backed by real URIs and  
> helpful background information like rdfs:labels, rdfs:comments etc.

Cleaner? I'm not sure how that's the case. But I guess this is an  
aesthetic judgement and our tastes can differ. But then this isn't a  
technical argument.

OWL is well defined and defined well by the specifications which, er,  
define it. owl.owl cannot *define* OWL.

RDF is well defined and defined well by the specs which define it.  
Even in RDF, the built-in vocabulary is treated specially (i.e., with  
additional semantic conditions).

Traditionally, e.g., in most logic, programming languages, what have  
you, orthogonality is valued and often what is appealed to when people  
say "cleanliness". owl.owl mishes things up. Sometimes, such mishing is  
considered neat (e.g., Lisp and Prolog's treatment of code as data and  
the ability to write metacircular interpreters), but these things are  
also tricky (it helps that with Lisp and Prolog they are actually  
expressive enough to define themselves; real implementations, even if  
bootstrapping, tend to separate things out).

As a user, I prefer the environment to have special knowledge of the  
language. Furthermore, I prefer that implementation details don't leak  
into my workaday situations. Obviously, by being a bit clever and with  
a bit of code, you can make this happen and still use owl.owl as you  
describe. But that's hardly the "for free" you've described.

> > I'm well aware of how systems use such files (as is clear by my
> > reference to SWI Prolog), but think that that use is by and large
> > misguided and sometimes harmful. The implementation burden reduction
> > is generally quite minimal, IMHO.
>
> In my experience the implementation burden without an OWL.owl file  
> would be immense

Well, we have to specify what we're implementing to be clear what the  
burden is. Some numbers might help.

People successfully build systems without it. Indeed, I would hazard  
that most OWL systems don't use it, even those based on RDF stores.  
The ones I know offhand are TBC and SWI Prolog.

Given the inherent incompleteness of such a file due to the expressive  
limitations of OWL or RDFS, it's hard to see how it would, in  
principle, be a big win. Compare with the XML Schema for OWL/XML  
which, while still not expressive enough to cover all of OWL's  
syntactic constraints, gets much closer. Compare with a BNF (for  
syntax).

Now, for the sort of application you support, the way you've  
implemented it, and the way you use the file, I rather suspect that an  
owl.owl file is helpful. But I think that's rather idiosyncratic. The  
SWRL tab is some evidence of how it can bring problems (because, for  
example, people believe it's the only way to implement SWRL support,  
or the best way, or they must use the central file even when buggy,  
etc.).

> and would significantly slow down any support for OWL 2 in our tools.

Well, heck, if it makes it *easier for you*, that's a relevant factor.  
Programming is heavily influenced by the comfort of the programmer in  
question. This is why I'm not fighting this per se. I want more OWL  
tools rather than fewer.

That's orthogonal to what I think about the general technical points  
and what I would recommend.

> So I do not share your view and claim the opposite (for our use  
> cases).

You don't think in some cases it's harmful? I don't see how that's  
supported by it working in your case. Similarly, it can work for you  
and still be misguided (all things considered).

Obviously it can be made to work...TBC and SWI Prolog are examples.

> > In any case, there's no need for a
> > central "canonical" version of the file in order for you to use this
> > implementation technique. Nothing stops TopBraid from using this  
> sort
> > of mechanism internally.
>
> This is what I would need to do if the work group would not want to  
> deliver such a file in a central place.

And why is this a problem? You also have to implement code that the  
working group does not deliver.

I get nervous about things people take as parallel specs. That hurts  
interop.

> Fortunately, I am not the only person with this request, and I  
> appreciate that Michael Schneider has already taken very good first  
> steps. I'd be happy to help with testing if this vocabulary moves  
> along.

Similarly, it seems to be perfectly reasonable for e.g., you and  
Michael to develop such a thing and open source it. Why does it need  
standardization esp. as it's really ersatz standardization?

(Again, the WG is going to do something here. I just don't see any  
good arguments for it.)

> > Indeed, I hope you cache your copy of owl.owl
> > instead of hitting the W3C server each time! (I would be shocked if
> > you didn't cache, but not everyone is conscientious.)
>
> Sure, we are using the cached file embedded in the Jena API.

That's good. Lack of caching of things like DTDs is an ongoing problem  
for the W3C. (Which is another reason I think such things are bad  
practice.)

(e.g., http://hsivonen.iki.fi/no-dtd/)
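
For illustration only, a small Python sketch of the caching being asked for: look on local disk first and go to the network only on a miss. The function name `cached_fetch` and the on-disk layout are assumptions made for this example, not a description of how Jena or TopBraid do it.

```python
import hashlib
import os


def cached_fetch(url, cache_dir, fetch):
    """Return the bytes for `url`, calling `fetch(url)` only on a cache miss.

    `fetch` is injected so the caller decides how the network is reached;
    the cache file name is a hash of the URL (an arbitrary choice here).
    """
    os.makedirs(cache_dir, exist_ok=True)
    name = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = os.path.join(cache_dir, name)
    if os.path.exists(path):
        # Cache hit: the remote server is not contacted at all.
        with open(path, "rb") as handle:
            return handle.read()
    data = fetch(url)
    with open(path, "wb") as handle:
        handle.write(data)
    return data
```

A caller could pass something like `lambda u: urllib.request.urlopen(u).read()` as `fetch`. A tool that ships its own copy of the file, as described above, never needs the miss branch at all.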

> > I don't find the linked data argument compelling as fundamental
> > enabling technology doesn't need to use distributed extensibility
> > mechanisms (unlike, for example, ad hoc vocabularies). This is not  
> an
> > uncommon view, nor is it is in any way in tension with the growth of
> > the web or the semantic web.
>
> My main point (and this is probably why I want to bring this thread  
> back to life) is that there is IMHO no fundamental difference  
> between the OWL vocabulary and any other ontology, such as SWRL,  
> FOAF or SIOC. OWL is an RDF vocabulary for defining classes and  
> properties.

Er...of course there is. (And swrl is on the OWL side.) I mean, the  
distinction between logical, builtin vocabulary and user vocabulary is  
a basic feature. Even with relatively uniform syntax (like triples)  
there are key distinctions both from a user and from a tool perspective.

And having some, key, vocabulary special cased is pretty harmless to  
an overall picture. After all, we do *standardize* stuff.  
Standardization is, almost essentially, a centralizing endeavor. We  
pick stuff we *want* to be standardized, and part of the  
infrastructure. So, there's no need to use mechanisms which are  
designed to accommodate a rather different situation.

> The instances of the OWL ontology are (mostly) classes and  
> properties. FOAF is a vocabulary for describing people. Many  
> Semantic Web tools such as TopBraid only require very minimal hard- 
> coding against specific ontologies, and in our case this is mostly  
> against RDF Schema. So the tool has some special handling of the  
> RDFS metaclasses, and - with the OWL system ontology - this is  
> sufficient for much of OWL support as well.

Not very much. As I've pointed out above.

I think you underestimate the amount of effort you've put in to making  
this work for driving TBC's interface. It would be interesting to  
analyze the source code to figure this out. If you'd like to  
experiment a bit with that I'd be happy to discuss that further. I.e.,  
how do we answer the questions:
1) What is owl.owl good for?
2) What infrastructure is necessary for it to be good for it?

> The beauty of having the OWL system ontology is that we (and other  
> tool developers) don't need to worry about all the "exotic" features  
> such as owl:ReflexiveProperty - as long as we know that this is a  
> property metaclass then the tool knows what to do with it. If users  
> want OWL 2 support, they would just add those triples. If they  
> prefer to stay in OWL 1, they would not.

If your support needs are truly that minimal, then it's hardly a  
burden to add it manually. So I'm confused again.
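
To make "add it manually" concrete, here is a hedged Python sketch of hard-coding the property metaclasses instead of reading them from a file. The set name `PROPERTY_METACLASSES` and the function `is_property_metaclass` are hypothetical, and the set lists only a handful of terms for illustration, not a complete inventory.

```python
OWL = "http://www.w3.org/2002/07/owl#"

# Hard-coded property metaclasses (illustrative subset, not exhaustive).
PROPERTY_METACLASSES = frozenset(
    OWL + local_name
    for local_name in (
        "ObjectProperty",
        "DatatypeProperty",
        "AnnotationProperty",
        "FunctionalProperty",
        "SymmetricProperty",
        "TransitiveProperty",
        # OWL 2 additions:
        "ReflexiveProperty",
        "IrreflexiveProperty",
        "AsymmetricProperty",
    )
)


def is_property_metaclass(iri):
    """True if `iri` names one of the hard-coded property metaclasses."""
    return iri in PROPERTY_METACLASSES
```

If "this is a property metaclass" is all the tool needs to know, the OWL 2 additions amount to three extra strings in a table like this.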

> Of course, the OWL vocabulary is fundamentally different from the  
> point of view of OWL tools such as tableau inference engines.

Take the ontology statistics example. Should every count of the number  
of classes in an ontology include rdfs:Class? As a user and someone  
who studies users, I can't see that that's helpful at all.
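
A small Python sketch of the statistics problem, using plain tuples as triples so that no RDF library is assumed. The example data and the function names `declared_classes` and `user_classes` are made up for illustration; the three vocabulary triples stand in for what an imported system ontology would contribute.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
OWL_CLASS = "http://www.w3.org/2002/07/owl#Class"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
BUILTIN_NAMESPACES = (
    "http://www.w3.org/2002/07/owl#",
    "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "http://www.w3.org/2000/01/rdf-schema#",
)

# Hypothetical merged graph: two user classes plus imported vocabulary triples.
TRIPLES = [
    ("http://example.org/Person", RDF_TYPE, OWL_CLASS),
    ("http://example.org/Pet", RDF_TYPE, OWL_CLASS),
    (RDFS_CLASS, RDF_TYPE, RDFS_CLASS),
    (OWL_CLASS, RDF_TYPE, RDFS_CLASS),
    ("http://www.w3.org/2002/07/owl#Thing", RDF_TYPE, OWL_CLASS),
]


def declared_classes(triples):
    """Every subject typed as owl:Class or rdfs:Class, built-ins included."""
    return {s for s, p, o in triples
            if p == RDF_TYPE and o in (OWL_CLASS, RDFS_CLASS)}


def user_classes(triples):
    """The same set with built-in vocabulary filtered out (the special case)."""
    return {c for c in declared_classes(triples)
            if not c.startswith(BUILTIN_NAMESPACES)}


print(len(declared_classes(TRIPLES)), len(user_classes(TRIPLES)))
```

The naive count reports five classes for an ontology in which the user wrote two. Getting the useful number requires the filter, which is exactly the special-casing at issue.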

> But as OWL RL nicely illustrates, even these things can often be  
> generalized further and reduced to generic rules that only require  
> RDF support and nothing else.

? OWL RL definitely cannot be implemented in RDF alone. You definitely  
need a rule language. Which is quite a substantial leap over RDF alone.

> My request for an OWL.owl file is in exactly the same spirit.

In the OWL RL document, there are some informative rules provided as a  
guide for implementors. I personally wouldn't recommend them as such  
for a production system. Consider the entailment rules in the RDF  
semantics document---they are *terrible* as an implementation if taken  
blindly. Different rule engines have different characteristics anyway.
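
As a rough illustration of both points, here is a deliberately naive Python fixpoint over two subclass rules of the kind listed in the OWL 2 RL tables (named cax-sco and scm-sco there). The prefixed-string triples, the example data, and the function name `naive_closure` are assumptions for this sketch. It is not how a production rule engine would evaluate the rules.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"


def naive_closure(triples):
    """Apply two subclass rules repeatedly until nothing new is derived.

    Rule A: (c1 subClassOf c2), (x type c1)        -> (x type c2)
    Rule B: (c1 subClassOf c2), (c2 subClassOf c3) -> (c1 subClassOf c3)
    Returns the closed triple set and the number of passes it took.
    """
    closure = set(triples)
    passes = 0
    while True:
        passes += 1
        subclass = [(s, o) for s, p, o in closure if p == SUBCLASS_OF]
        typed = [(s, o) for s, p, o in closure if p == RDF_TYPE]
        derived = set()
        for c1, c2 in subclass:
            for x, c in typed:
                if c == c1:
                    derived.add((x, RDF_TYPE, c2))
            for c3, c4 in subclass:
                if c2 == c3:
                    derived.add((c1, SUBCLASS_OF, c4))
        if derived <= closure:
            return closure, passes
        closure |= derived


DATA = [
    ("ex:Dog", SUBCLASS_OF, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS_OF, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
]
closure, passes = naive_closure(DATA)
```

The triples alone do not entail that ex:rex is an ex:Animal; the loop that joins them, i.e. a rule engine of some kind, does. The loop also re-joins every triple on every pass, which is the sort of blind evaluation that makes the informative rules a poor implementation strategy.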

> > This is clearly a fairly strong technical disagreement, one which we
> > are unlikely to come to agreement on.
>
> I hope not. I don't see a strong technical disagreement here, just  
> different use cases.

Well, I hope our technical disagreement is relatively harmless. You  
certainly can do what I consider generically bad practice and do it  
with great success. As long as you don't serialize stuff with  
owl:imports owl.owl, we can largely interoperate without having to  
resolve our technical disagreements. Even then, I could work around  
that (by stripping them out).
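
The workaround can be sketched in a few lines of Python, again with plain tuples standing in for triples. The assumption here is that the import target is the OWL namespace document IRI, with or without a trailing `#`; the function name `strip_owl_import` is hypothetical.

```python
OWL_IMPORTS = "http://www.w3.org/2002/07/owl#imports"
OWL_DOCUMENT = "http://www.w3.org/2002/07/owl"


def strip_owl_import(triples):
    """Drop owl:imports statements that point at the OWL vocabulary itself.

    Imports of any other ontology are left untouched.
    """
    return [
        (s, p, o)
        for s, p, o in triples
        if not (p == OWL_IMPORTS and o.rstrip("#") == OWL_DOCUMENT)
    ]
```

A consumer would run this over a parsed document before resolving imports, so the receiving tool never pulls the vocabulary file into the user's ontology.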

We'll definitely offer different advice to new implementors...who will  
probably ignore us anyway :)

Cheers,
Bijan.
Received on Wednesday, 22 July 2009 00:01:38 UTC