Re: [OEP] [SE] Logical vs Indexing - multiple Types from Alan Rector on 2005-02-08 (public-swbp-wg@w3.org from February 2005)

From: Alan Rector <rector@cs.man.ac.uk>
Date: Tue, 08 Feb 2005 13:48:16 +0000
To: Bernard Vatant <bernard.vatant@mondeca.com>
CC: SWBPD <public-swbp-wg@w3.org>
Message-ID: <4208C320.EFD7DF72@cs.man.ac.uk>
Bernard

This is an area where we have both experience and strong views.

I think you actually need to distinguish three different sorts of Types:

-    Asserted types.  We would agree that any primitive should have only one asserted type.
See "Modularisation of domain ontologies written in description logics and OWL".

-    Inferred types.  One of the major reasons for using a classifier is to manage complex
multi-hierarchies of defined types.  These can be used for indexing but may also be used for
other purposes.

-    Indexing classifications/Types.  For us, these are things such as the Medical Subject
Headings (MeSH) from the library world, but also things like the International Classification
of Diseases (ICD), the Clinical Procedure Terminology, etc. These are typically constructed
along "broader than"/"narrower than" lines and often have various idiosyncratic features, e.g.
in MeSH the same string is found at the end of numerous paths, so the path is not an
identifier, whereas in ICD the identifier is the path.   For classification types we would
advocate indirect mapping rather than direct modelling, i.e. providing pointers to/from the
ontology via annotation properties: a) because the internal structure of the classification
type hierarchy typically follows different principles which, if imported into the ontology
itself, cause confusion at best and contradictions at worst; and b) because it provides a hook
for secondary reasoning. Using the inferred types as a framework for indexing classifications
works very well.  For this reason, I am not entirely happy with the phrase "indexing types".
I would prefer "classification types", but I don't know how that fits in the library world.

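The three sorts of types above can be sketched roughly as follows. This is only a toy
illustration in plain Python, not a description logic classifier and not how any particular
tool works. The names Entry, DEFINED_TYPES, INDEX_ANNOTATIONS, the class names, and the
scheme names and codes ("SchemeA", "A-01", etc.) are all made up for the example. Each entry
has exactly one asserted type, defined types are inferred from property values, and
classification codes are attached to the inferred types by annotation rather than modelled
as classes.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical ontology entry with exactly one asserted type."""
    name: str
    asserted_type: str
    properties: dict = field(default_factory=dict)


# Defined types (hypothetical): an asserted parent plus required property
# values, similar in spirit to hasValue restrictions.
DEFINED_TYPES = {
    "HeartDisorder": ("Disorder", {"has_location": "Heart"}),
    "InflammatoryDisorder": ("Disorder", {"has_process": "Inflammation"}),
}

# Indirect mapping (hypothetical codes): classification schemes are linked
# to ontology types as annotations; their own hierarchies are not imported.
INDEX_ANNOTATIONS = {
    "HeartDisorder": {"SchemeA": ["A-01"]},
    "InflammatoryDisorder": {"SchemeA": ["A-02"], "SchemeB": ["B-07"]},
}


def inferred_types(entry):
    """Toy stand-in for a classifier: return every defined type whose
    asserted parent and required property values match the entry."""
    matches = []
    for type_name, (parent, required) in DEFINED_TYPES.items():
        same_parent = entry.asserted_type == parent
        has_values = all(entry.properties.get(k) == v for k, v in required.items())
        if same_parent and has_values:
            matches.append(type_name)
    return sorted(matches)


def index_codes(entry):
    """Collect classification codes by following annotations on inferred types."""
    codes = {}
    for type_name in inferred_types(entry):
        for scheme, values in INDEX_ANNOTATIONS.get(type_name, {}).items():
            codes.setdefault(scheme, []).extend(values)
    return codes


pericarditis = Entry(
    "Pericarditis",
    "Disorder",
    {"has_location": "Heart", "has_process": "Inflammation"},
)
print(inferred_types(pericarditis))
print(index_codes(pericarditis))
```

In this sketch the entry ends up under two inferred types without ever carrying more than
one asserted type, and the indexing codes are reached only through the annotation table, so
the classification scheme's own broader/narrower structure never enters the ontology.
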
Regards

Alan


Bernard Vatant wrote:

> This is the follow-up of a debate which started last week on Protégé List[1]
>
> To sum it up, the starting point was a demand from Protégé users to have ontology editors
> correctly handle multiple "rdf:type" declarations for the same instance, in other words:
> - Allow declaration of multiple types through the GUI, and further editing of such
> instances
> - Correctly handle multiple rdf:type declarations in imported RDF files
>
> So far, Protégé has allowed importing OWL files with multiple rdf:type declarations, but
> could neither edit them nor create them through the GUI. Dealing with an instance of
> multiple classes in a GUI dynamically constructed from class properties is not
> straightforward (we have the same issue in Mondeca ITM).
>
> My first reaction was that it's certainly *not* a very good idea to push people to create
> multiple rdf:type for the same owl:Individual directly through ontology editing. Beyond
> technical difficulties of implementation, using multiple types, at least in a single
> source ontology, often seems to proceed from modeling confusion between "logical types",
> using formal declaration of named classes and various restrictions, and what I
> tentatively call "indexing types", which are defined by any property value (and could be
> made explicit using "hasValue" restrictions). Logical types are intended for expression of
> logical structures, constraints, support processing and inference, software configuration,
> etc., whereas indexing types are intended for display, navigation, and search. In other
> words, logical types represent the AI view of typing, whereas indexing types represent the
> librarian/documentalist view. Indexing types can be organized through various "schemes",
> as is currently discussed in the SKOS framework, and hierarchies that do not express
> logical subsumption, but various flavors of broader-narrower used by Thesauri, so-called
> Taxonomies and the like ...
>
> Both logical and indexing types are needed in most information architectures, but since
> there is no clear-cut way to use indexing types in RDF-OWL, nor any clear reference to
> them in the specs (because, IMO, the authors of the specification came more from the AI
> world than from the librarian one), there is a trend in modeling practice to handle them
> using a lot of unnecessary logical types (the extreme case being total and thoughtless
> refactoring of thesauri into ontologies), leading to both crammed ontologies and a frequent
> need for multiple types. My thesis [2] is that the need for multiple types is more for
> indexing types than for logical ones, and therefore good practice should lead to very few
> (if any) cases of multiple logical types (at least in single-source ontologies, the
> question of multiple types appearing when merging ontologies being of course a tricky one
> which cannot be avoided).
>
> So I figure this group could produce some reflections and maybe recommendations, from both
> modeling (OEP) and software engineering (SE) viewpoints, on the following points:
>
> - Logical types vs indexing types : Why, When, How to use either one.
> - Ways to practically use indexing types for display, navigation, sorting and query.
> - Use and abuse of multiple types
> - Best practices in software engineering to deal with indexing types, and multiple types
>
> I've started to pull those questions together in a short paper. I will publish it as soon
> as I can turn them into something consistent and readable, and I volunteer to turn them into
> a proper draft note if there is any interest in it from either OEP or SE, or both (even if
> I am not formally a member of either of those TFs so far).
>
> Thanks for your interest
>
> Bernard
>
> [1] http://comments.gmane.org/gmane.comp.misc.ontology.protege.owl/8859
> [2] Mike has asked on the Protégé list whether I had references to any literature on those
> issues to support my thesis, and I'm afraid I've not found proper sources so far. I keep
> searching; any pointers welcome (pro or con).
>
> **********************************************************************************
>
> Bernard Vatant
> Senior Consultant
> Knowledge Engineering
> bernard.vatant@mondeca.com
>
> "Making Sense of Content" :  http://www.mondeca.com
> "Everything is a Subject" :  http://universimmedia.blogspot.com
>
> **********************************************************************************

--
Alan L Rector
Professor of Medical Informatics
Department of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL: +44-161-275-6188/6149/7183
FAX: +44-161-275-6236/6204
Room: 2.88a, Kilburn Building
email: rector@cs.man.ac.uk
web: www.cs.man.ac.uk/mig
        www.opengalen.org
        www.clinical-escience.org
        www.co-ode.org
Received on Tuesday, 8 February 2005 13:46:50 UTC