Re: ISSUE-65 (excess vocab): REPORTED: excessive duplication of vocabulary

On Nov 21, 2007, at 6:25 AM, Boris Motik wrote:

> Hello,
>
> Not being able to parse an ontology separately (without looking at  
> imports) introduces many complications. I'll give you some examples.
>
> 1. In many cases, ontologies are not stored as files, but are  
> stored in databases. I don't mean here that the RDF triples are stored
> in a database. For example, KAON2 allows you to turn any existing  
> relational database into a bunch of ABox assertions by providing
> some mapping rules. Thus, a database-backed ontology in KAON2 does  
> not contain any typing triples explicitly at all. Similar features
> can be found in other ontology tools such as Ontobroker. These  
> ontologies are thus "virtual", in the sense that they don't reside in
> a file, but are generated at runtime by ontology management tools.

This is true, but then the mapping rules effectively carry all the  
typing information, no? And all we need is the typing information for  
properties.

> (BTW, this is one example why we should start thinking of  
> ontologies more in terms of an object model than something that is  
> saved
> in a file somewhere.)

I don't think of them as being in a file somewhere (I use a triple  
store for most of my work), and I don't mind object models as long as  
they don't get in the way. However, I note that in the last few years
I've had to unlearn a lot about object models, as they impose more
restrictions on what can be said than OWL does, which is one of the
reasons I like OWL.

> Imagine now that you have an OWL RDF ontology O that imports a  
> database-backed ontology O'. How do you disambiguate types on O then?
> Vocabulary typing is not supported by the structural specification;  
> hence, there is no way for O to ask O' a question of the form
> "what is the type of URI p?"

There is no such protocol currently. However, a few things come to mind:

1) The whole area of mapping databases to OWL/RDF is somewhat in flux
and not yet well defined. If we really intend to support this in OWL
1.1, shouldn't it be on the same footing as the functional-syntax and
RDF mappings? It seems to me that it is the same sort of thing.

2) Perhaps we need a protocol to do exactly what you suggest.

3) The cost of supporting this case is an extra burden on everyone
who does not have this requirement. How do we balance these?

4) We can put property typing back into the functional syntax.

> 2. Imagine that an OWL RDF ontology O imports an ontology O'  
> written in functional-style syntax. Again, there is no notion of  
> typing
> at the level of the structural specification; hence, O cannot ask  
> O' about types of objects.

There used to be. It was removed, and it can be put back. As you  
point out, there doesn't seem to be an articulated use case for  
punning object versus data properties (annotation properties allow  
both), so putting back in ObjectProperty, DatatypeProperty, and  
AnnotationProperty axioms wouldn't be problematic.

> 3. Not including vocabulary typing into the structural  
> specification allows us to keep punning in the structural  
> specification. I
> agree that punning may be undesirable at the RDF level; however,  
> there is really no need to prevent it at the level of the
> structural specification.

I am of the view that if it can't be said in RDF, then it makes no  
sense to say it in the structural specification. My feeling is that  
any feature that is not available in RDF syntax will not be widely  
adopted, and given how hard it has been to bring OWL to a wide
audience thus far, I'm not in favor of things that make adoption
harder. If there
is disagreement about this design principle then we need to bring it  
up and resolve it in the group. To my mind this would be a rather  
large change to OWL and its place on the Semantic Web. Each time this  
has come up in the past, I've seen it resolved in favor of allowing  
RDF to express what is needed (sometimes I've been involved in the  
deliberations).

> 4. You might argue that we should extend the structural  
> specification with some notion of typing; however, I still believe  
> that
> parsing ontologies would be unnecessarily complex. Imagine if an  
> ontology O imports an ontology O' and vice versa; hence, you now have a
> cyclic dependency between O and O'. But then, this means that you
> can't parse either ontology before the other one.

Yes, I have just argued that we should add back property type axioms.
To resolve the ambiguity in question (whether a property is data or
object), we scan both ontologies for *only* the property typing
axioms. This would be easy to do in both RDF and the functional
syntax. For each property we have one of three cases: there is no
typing axiom (OWL Full); there is one typing axiom (record it for the
second pass); there are two typing axioms (OWL Full). What is complex
about this?
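The scan described above can be sketched in a few lines (a minimal illustration, not tied to any real OWL parser; the triple tuples and the `rdf:type`/`owl:*` strings are simplified assumptions):

```python
# Classify each property URI by how many typing axioms the combined
# ontologies declare for it, per the three cases described above.
# Triples are simplified (subject, predicate, object) string tuples.

RDF_TYPE = "rdf:type"
PROPERTY_TYPES = {"owl:ObjectProperty", "owl:DatatypeProperty"}

def classify_properties(triples):
    """Single scan: return {property URI: type or 'OWL Full'}."""
    declared = {}  # property URI -> set of declared property types
    for s, p, o in triples:
        if p == RDF_TYPE and o in PROPERTY_TYPES:
            declared.setdefault(s, set()).add(o)
    result = {}
    for prop, types in declared.items():
        if len(types) == 1:
            result[prop] = types.pop()   # unambiguous: record for second pass
        else:
            result[prop] = "OWL Full"    # two typing axioms -> OWL Full
    return result

# Properties that are used but never declared fall into the
# "no typing axiom" (OWL Full) case; they simply never appear
# in the returned mapping.
```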

> If both O and O' are in OWL RDF, you'd need to first load all the  
> triples into memory, resolve all typing issues, and then parse
> each ontology separately.

No, you need only scan the triples for type assertions. If you want
to use little memory, you can use a disk hash to store the mapping of
property names to property types. Why would you have to load
everything into memory?
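A disk-backed hash of this kind comes almost for free in most environments; here is a hedged Python sketch using the standard-library dbm module (the file location and property URIs are invented for illustration):

```python
import dbm
import os
import tempfile

# Store the property-name -> property-type mapping on disk rather than
# in memory, so only the triples currently being scanned need RAM.
path = os.path.join(tempfile.mkdtemp(), "prop-types")

# First pass: record each typing assertion as it streams by.
with dbm.open(path, "c") as types:
    types["ex:age"] = "owl:DatatypeProperty"
    types["ex:knows"] = "owl:ObjectProperty"

# Second pass: look up types without holding the ontology in memory.
# (dbm returns values as bytes.)
with dbm.open(path, "r") as types:
    assert types["ex:age"] == b"owl:DatatypeProperty"
```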

> But what if O' is written in some syntax other than RDF? Then, you  
> might need to first parse each ontology
> into some intermediate format, resolve all typing problems, and  
> only then generate the actual axioms.

The only syntaxes we have specified are the functional syntax and
RDF/XML. If we have a solution for those, isn't our job done? If we
need to make things work for other syntaxes, shouldn't they, or some
property of them, be included in the specification?

> My question is whether all of this is really worth it. Being able  
> to parse an ontology by simply looking at the triples in that  
> ontology seems like a common-sense thing to do.

All things being equal, which they are not.

> It would provide much more freedom to implementations.

I'm still not seeing a lot more freedom. I am seeing a desire to
handle ontologies that are provided via a mapping to relational
databases. If we need that, then let's put it in as a requirement and
solve it.

> Finally, it is not really difficult to achieve: the only thing we  
> need is to make sure that each ontology contains typing triples for  
> each URI used as a property. I can't see why this would cause any  
> problems in practice.

I have already given examples: when we modularize ontologies by
splitting them into pieces, we have to repeat typing statements in
every file that uses a module defining a property. This repetition is
error prone, forcing one to edit many files to make a change rather
than a single one, and it raises the real possibility (I have had it
happen) of those type statements getting out of sync. An example of
such a change would be the shift of a datatype property to an
annotation property, or vice versa.
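The out-of-sync hazard described above is easy to check for mechanically. A hedged sketch (the module names, property URI, and declared types are all invented) that flags properties whose duplicated typing declarations disagree across files:

```python
# Each module repeats typing triples for the properties it uses.
# If the defining module changes (say, datatype -> annotation property)
# and a using module is not edited, the declarations now conflict.
modules = {
    "core.owl":      {"ex:note": "owl:AnnotationProperty"},  # updated here...
    "extension.owl": {"ex:note": "owl:DatatypeProperty"},    # ...but not here
}

def find_conflicts(modules):
    """Return {property: {declared type: [files declaring it]}} for
    properties declared with more than one type across modules."""
    seen = {}
    for fname, decls in modules.items():
        for prop, ptype in decls.items():
            seen.setdefault(prop, {}).setdefault(ptype, []).append(fname)
    return {p: t for p, t in seen.items() if len(t) > 1}

conflicts = find_conflicts(modules)
```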

Best,
Alan


>
> Regards,
>
> 	Boris
>
>
>> -----Original Message-----
>> From: public-owl-wg-request@w3.org [mailto:public-owl-wg- 
>> request@w3.org] On Behalf Of Alan Ruttenberg
>> Sent: 21 November 2007 10:58
>> To: Boris Motik
>> Cc: 'OWL Working Group WG'
>> Subject: Re: ISSUE-65 (excess vocab): REPORTED: excessive  
>> duplication of vocabulary
>>
>>
>>
>> On Nov 21, 2007, at 4:31 AM, Boris Motik wrote:
>>
>>> This makes parsing of OWL RDF really difficult: you can't process
>>> an ontology by itself, but you need
>>> to look at the imported ontologies as well.
>>>
>>> Even worse, what if the imported ontology O' is not in OWL RDF but
>>> in some other format? (For example, KAON2 allows a file ontology to
>>> import an ontology that resides in a relational database.) Parsing
>>> is now next to impossible. Thus, to allow parsing an ontology O by
>>> looking only at the triples in O, we introduced the typed  
>>> vocabulary.
>>
>> Hi Boris,
>>
>> One will need to read each imported ontology at least once in order
>> to reason over it. Why can the ontologies not be read in two passes,
>> instead of one? Online files can be cached locally for the second
>> pass. Indeed the information necessary to parse can be cached in the
>> common case that any ontology doesn't change, by indexing it against
>> the md5 of the file.
>>
>> I realize that this costs potentially 2n in the worst case, but in
>> the common case, with appropriate caches it will be quite quick...
>>
>> -Alan
>>
>
>

Received on Saturday, 24 November 2007 08:01:22 UTC