[OEP] [SE] Logical vs Indexing - multiple Types from Bernard Vatant on 2005-02-08 (public-swbp-wg@w3.org from February 2005)

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Tue, 8 Feb 2005 10:10:11 +0100
To: "SWBPD" <public-swbp-wg@w3.org>
Message-ID: <GOEIKOOAMJONEFCANOKCKEKNFIAA.bernard.vatant@mondeca.com>
This is a follow-up to a debate that started last week on the Protégé list [1].

To sum it up, the starting point was a demand from Protégé users that ontology editors
correctly handle multiple "rdf:type" declarations for the same instance, in other words:
- Allow the declaration of multiple types through the GUI, and further editing of such
instances
- Correctly handle multiple rdf:type declarations in imported RDF files

So far, Protégé has allowed importing OWL files with multiple rdf:type declarations, but
could neither edit such instances nor create them through the GUI. Dealing with an
instance of multiple classes in a GUI dynamically constructed from the properties of its
classes is not straightforward (we have the same issue in Mondeca ITM).

My first reaction was that it's certainly *not* a very good idea to push people to create
multiple rdf:type declarations for the same owl:Individual directly through ontology
editing. Beyond the technical difficulties of implementation, using multiple types, at
least in a single-source ontology, often seems to stem from a modeling confusion between
"logical types", which use formal declarations of named classes and various restrictions,
and what I tentatively call "indexing types", which are defined by any property value (and
could be made explicit using "hasValue" restrictions). Logical types are intended to
express logical structures and constraints and to support processing, inference, software
configuration, etc., whereas indexing types are intended for display, navigation, and
search. In other words, logical types represent the AI view of typing, whereas indexing
types represent the librarian/documentalist view. Indexing types can be organized through
various "schemes", as is currently being discussed in the SKOS framework, and through
hierarchies that do not express logical subsumption, but the various flavors of
broader-narrower relations used by thesauri, so-called taxonomies and the like.
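
To make the distinction concrete, here is a minimal sketch in plain Python (no RDF
library). It is only an illustration under assumptions: the "ex:" names, the sample
data and the "ex:indexedAs" property are hypothetical and come from no specification.
Each individual carries a single logical type through rdf:type, while its indexing
types are ordinary property values that can be grouped for navigation and search, much
as a "hasValue" restriction would group them.

```python
# Triples are plain (subject, predicate, object) tuples; all "ex:" names are hypothetical.
RDF_TYPE = "rdf:type"
INDEXED_AS = "ex:indexedAs"  # assumed indexing property, not taken from any spec

triples = [
    ("ex:Hamlet", RDF_TYPE, "ex:Play"),                      # one logical type
    ("ex:Hamlet", INDEXED_AS, "ex:Tragedy"),                 # indexing types
    ("ex:Hamlet", INDEXED_AS, "ex:ElizabethanLiterature"),
    ("ex:Macbeth", RDF_TYPE, "ex:Play"),
    ("ex:Macbeth", INDEXED_AS, "ex:Tragedy"),
]


def values_of(subject, predicate, data):
    """Return the sorted objects of (subject, predicate, ?) in data."""
    return sorted({o for s, p, o in data if s == subject and p == predicate})


def members_by_value(predicate, value, data):
    """Return the sorted subjects having the given property value.

    This mimics what a class defined by a hasValue restriction would contain.
    """
    return sorted({s for s, p, o in data if p == predicate and o == value})


# Logical view: what kind of thing is it (used for constraints and inference)?
print(values_of("ex:Hamlet", RDF_TYPE, triples))
# Indexing view: under which headings should it be found (display, search)?
print(values_of("ex:Hamlet", INDEXED_AS, triples))
print(members_by_value(INDEXED_AS, "ex:Tragedy", triples))
```

In this sketch, adding a new heading to an individual never touches its rdf:type, so
the class-driven part of an editing GUI would not be affected.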

Both logical and indexing types are needed in most information architectures. But there is
no clear-cut way to use indexing types in RDF-OWL, nor any clear reference to them in the
specs (because, IMO, the authors of the specifications came more from the AI world than
from the librarian one). As a result, there is a trend in modeling practice to handle them
using a lot of unnecessary logical types (the extreme case being the total and thoughtless
refactoring of thesauri into ontologies), leading both to overcrowded ontologies and to a
frequent need for multiple types. My thesis [2] is that the need for multiple types arises
more for indexing types than for logical ones, and therefore good practice should lead to
very few (if any) cases of multiple logical types (at least in single-source ontologies;
multiple types appearing when ontologies are merged is of course a tricky question which
cannot be avoided).
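
As a rough illustration of the merging case, the following Python sketch lists the
individuals that end up with more than one rdf:type once two sources are combined. The
source names, prefixes and data are hypothetical, and the check is only a starting
point for review, not a recommendation on how to resolve such cases.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"


def multiply_typed(data):
    """Map each subject with more than one rdf:type to its sorted list of types."""
    types = defaultdict(set)
    for s, p, o in data:
        if p == RDF_TYPE:
            types[s].add(o)
    return {s: sorted(t) for s, t in types.items() if len(t) > 1}


# Two hypothetical single-source ontologies, each using one type per individual.
source_a = [("ex:Mondeca", RDF_TYPE, "a:Company")]
source_b = [
    ("ex:Mondeca", RDF_TYPE, "b:SoftwareVendor"),
    ("ex:Protege", RDF_TYPE, "b:Tool"),
]

merged = source_a + source_b
print(multiply_typed(merged))
```

Each source taken alone yields an empty result; the multiple types only appear after
the merge.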

So I figure this group could produce some reflections and maybe recommendations, from both
the modeling (OEP) and software engineering (SE) viewpoints, on the following points:

- Logical types vs indexing types: why, when, and how to use either one.
- Ways to use indexing types in practice for display, navigation, sorting and query.
- Use and abuse of multiple types.
- Best practices in software engineering for dealing with indexing types and multiple types.

I've started to pull these questions together in a short paper, and will publish it as
soon as I can turn it into something consistent and readable. I volunteer to turn it into
a proper draft note if there is any interest in it from either OEP or SE, or both (even
though I am not formally a member of either of those TFs so far).

Thanks for your interest

Bernard

[1] http://comments.gmane.org/gmane.comp.misc.ontology.protege.owl/8859
[2] Mike has asked on the Protégé list whether I had references to any literature on those
issues to support my thesis, and I'm afraid I have not found proper sources so far. I keep
searching; any pointers are welcome (pro or con).


**********************************************************************************

Bernard Vatant
Senior Consultant
Knowledge Engineering
bernard.vatant@mondeca.com

"Making Sense of Content" :  http://www.mondeca.com
"Everything is a Subject" :  http://universimmedia.blogspot.com

**********************************************************************************
Received on Tuesday, 8 February 2005 09:10:28 UTC