RE: ISSUE-65 (excess vocab): REPORTED: excessive duplication of vocabulary

Hello,

Not being able to parse an ontology separately (without looking at imports) introduces many complications. I'll give you some
examples.


1. In many cases, ontologies are not stored as files, but are stored in databases. I don't mean here that the RDF triples are stored
in a database. For example, KAON2 allows you to turn any existing relational database into a bunch of ABox assertions by providing
some mapping rules. Thus, a database-backed ontology in KAON2 does not contain any typing triples explicitly at all. Similar features
can be found in other ontology tools such as Ontobroker. These ontologies are thus "virtual", in the sense that they don't reside in
a file, but are generated at runtime by ontology management tools.

(BTW, this is one example of why we should start thinking of ontologies more in terms of an object model than something that is saved
in a file somewhere.)

Imagine now that you have an OWL RDF ontology O that imports a database-backed ontology O'. How do you disambiguate types in O then?
Vocabulary typing is not supported by the structural specification; hence, there is no way for O to ask O' a question of the form
"what is the type of URI p?"


2. Imagine that an OWL RDF ontology O imports an ontology O' written in functional-style syntax. Again, there is no notion of typing
at the level of the structural specification; hence, O cannot ask O' about types of objects.


3. Not including vocabulary typing in the structural specification allows us to keep punning in the structural specification. I
agree that punning may be undesirable at the RDF level; however, there is really no need to prevent it at the level of the
structural specification.


4. You might argue that we should extend the structural specification with some notion of typing; however, I still believe that
parsing ontologies would be unnecessarily complex. Imagine if an ontology O imports an ontology O' and vice versa; hence, we now have a
cyclic dependency between O and O'. But then, this means that you can't parse either ontology before the other one.

If both O and O' are in OWL RDF, you'd need to first load all the triples into memory, resolve all typing issues, and then parse
each ontology separately. But what if O' is written in some syntax other than RDF? Then, you might need to first parse each ontology
into some intermediate format, resolve all typing problems, and only then generate the actual axioms.
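
To make the extra machinery concrete, here is a rough Python sketch of the two-phase approach described above. It is only an illustration under my own assumptions: the names import_closure and pooled_types, the dictionary "imports" (ontology name to the names it imports), and the dictionary "declared" (ontology name to the property types it declares) are all hypothetical and do not come from any specification or tool.

```python
def import_closure(root, imports):
    """Return every ontology reachable from `root`, tolerating import cycles."""
    seen = set()
    order = []
    stack = [root]
    while stack:
        name = stack.pop()
        if name in seen:
            continue  # already visited; this is what breaks the O <-> O' cycle
        seen.add(name)
        order.append(name)
        stack.extend(imports.get(name, []))
    return order


def pooled_types(root, imports, declared):
    """Phase one: gather the typing information of the whole import closure.

    Only after this pool exists can any single ontology in the closure be
    turned into axioms (phase two, not shown).
    """
    pool = {}
    for name in import_closure(root, imports):
        pool.update(declared.get(name, {}))
    return pool


# O and O' import each other; only O' says what kind of property p is.
imports = {"O": ["O'"], "O'": ["O"]}
declared = {"O": {}, "O'": {"p": "object"}}
types_for_o = pooled_types("O", imports, declared)
```

Note that O cannot be processed until the declarations of O' have been fetched in some form, whatever syntax or storage O' happens to use.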



My question is whether all of this is really worth it. Being able to parse an ontology by simply looking at the triples in that
ontology seems like a common-sense thing to do. It would provide much more freedom to implementations. Finally, it is not really
difficult to achieve: the only thing we need is to make sure that each ontology contains typing triples for each URI used as a
property. I can't see why this would cause any problems in practice.
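
By contrast, a purely local check needs nothing but the triples of the ontology at hand. The following is a minimal Python sketch, assuming triples are given as plain (subject, predicate, object) tuples of strings; the function names and the ex: URIs are made up for illustration, and skipping predicates from the RDF, RDFS and OWL namespaces is my own simplification.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"
RDF_TYPE = RDF + "type"

# Typing triples that settle what kind of property a URI denotes.
PROPERTY_KINDS = {
    OWL + "ObjectProperty": "object",
    OWL + "DatatypeProperty": "data",
}


def local_property_types(triples):
    """Map each property URI to its kind, using only the given triples."""
    kinds = {}
    for s, p, o in triples:
        if p == RDF_TYPE and o in PROPERTY_KINDS:
            kinds[s] = PROPERTY_KINDS[o]
    return kinds


def untyped_properties(triples):
    """Return predicates used in `triples` that lack a local typing triple."""
    kinds = local_property_types(triples)
    builtin = (RDF, RDFS, OWL)  # built-in vocabulary needs no declaration
    return sorted({
        p for _, p, _ in triples
        if p not in kinds and not p.startswith(builtin)
    })


triples = [
    ("ex:hasParent", RDF_TYPE, OWL + "ObjectProperty"),
    ("ex:a", "ex:hasParent", "ex:b"),
    ("ex:a", "ex:age", "42"),
]
missing = untyped_properties(triples)
```

Here ex:age is reported as lacking a typing triple, and no imported ontology had to be consulted to find that out.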

Regards,

	Boris

 
> -----Original Message-----
> From: public-owl-wg-request@w3.org [mailto:public-owl-wg-request@w3.org] On Behalf Of Alan Ruttenberg
> Sent: 21 November 2007 10:58
> To: Boris Motik
> Cc: 'OWL Working Group WG'
> Subject: Re: ISSUE-65 (excess vocab): REPORTED: excessive duplication of vocabulary
> 
> 
> 
> On Nov 21, 2007, at 4:31 AM, Boris Motik wrote:
> 
> > This makes parsing of OWL RDF really difficult: you can't process
> > an ontology by itself, but you need
> > to look at the imported ontologies as well.
> >
> > Even worse, what if the imported ontology O' is not in OWL RDF but
> > in some other format? (For example, KAON2 allows a file ontology to
> > import an ontology that resides in a relational database.) Parsing
> > is now next to impossible. Thus, to allow parsing an ontology O by
> > looking only at the triples in O, we introduced the typed vocabulary.
> 
> Hi Boris,
> 
> One will need to read each imported ontology at least once in order
> to reason over it. Why can the ontologies not be read in two passes,
> instead of one. Online files can be cached locally for the second
> pass. Indeed the information necessary to parse can be cached in the
> common case that any ontology doesn't change, by indexing it against
> the md5 of the file.
> 
> I realize that this costs potentially 2n in the worst case, but in
> the common case, with appropriate caches it will be quite quick...
> 
> -Alan
> 

Received on Wednesday, 21 November 2007 11:26:27 UTC