RE: [OEP] [SE] Logical vs Indexing - multiple Types from Phil Tetlow on 2005-02-08 (public-swbp-wg@w3.org from February 2005)

From: Phil Tetlow <philip.tetlow@uk.ibm.com>
Date: Tue, 8 Feb 2005 12:46:03 -0500
To: "Bernard Vatant" <bernard.vatant@mondeca.com>, "SWBPD" <public-swbp-wg@w3.org>
Cc: rector@cs.man.ac.uk
Message-ID: <OFFF04BE4E.A4FD9331-ON80256FA2.005F0E53-85257051.007CD072@uk.ibm.com>
Bernard,

FWIW...

This is a very interesting issue and one that I think may well be at odds
with some SE communities.

I tend to agree with Alan that the notion of 'secondary reasoning' based on
classification schemes appears more relevant. I further consider that we
should be careful when using the term 'index'. Some schools of thought
would consider this to be a physical implementation of classification
rather than a logical one, as I believe you originally intended?

Also, I think it may well be worth mentioning that some new methodology
ideas are starting to emerge around the concept of 'Aspects' (e.g. security
is a particular nonfunctional aspect of a domain). Aspects are really,
in my understanding, methods of decomposing and understanding a problem via
multiple categorisation schemes. Using such an approach, one could argue
that there is no real notion of 'primary', 'secondary' or any other order
of classification. Instead all such schemes are equal until viewed via
very specific contexts, at which point their order becomes implicit.

Indeed, perhaps there may be some mileage in adopting the term 'Aspect' as a
label for such overlaying classification schemes? So, in the case of ICD,
one could realistically think about the following aspects:
      1. Clinical
      2. Terminal
      3. Tropical etc?

Just an idea....

Phil Tetlow
Senior Consultant
IBM Business Consulting Services
Mobile. (+44) 7740 923328


"Bernard Vatant" <bernard.vatant@mondeca.com>
Sent by: public-swbp-wg-request@w3.org
08/02/2005 10:34

To: "SWBPD" <public-swbp-wg@w3.org>
cc:
Subject: RE: [OEP] [SE] Logical vs Indexing - multiple Types

Alan

I see we are basically in agreement. OK for the distinction between
asserted vs inferred types. In my mind they were both under "logical
types". OK also for the notion of "secondary reasoning" based on
classification schemes. I had exactly the same kind of remark from my boss
a few hours ago :)

As for the vocabulary ... indexing or classification? I've used both, and
I'm agnostic on the word itself. French people in the library world don't
like to use "classification" in this context and prefer "indexation"
(which does not exist in English, right?). In any case, we have to
communicate correctly with the library community, whatever its natural
language :))

Best

Bernard

-----Original Message-----
From: Alan Rector [mailto:rector@cs.man.ac.uk]
Sent: Tuesday 8 February 2005 14:48
To: Bernard Vatant
Cc: SWBPD
Subject: Re: [OEP] [SE] Logical vs Indexing - multiple Types


Bernard
This is an area where we have both experience and strong views.
I think you actually need to distinguish three different sorts of Types:
-    Asserted types.  We would agree that any primitive should have only
     one asserted type. See Modularisation of domain ontologies written in
     description logics and OWL.
-    Inferred types.  One of the major reasons for using a classifier is to
     manage complex multi-hierarchies of defined types.  These can be used
     for indexing but may also be used for other purposes.
-    Indexing classifications/Types.  For us these are things such as the
     Medical Subject Headings (MeSH) from the library world, but also
     things like the International Classification of Diseases, the Clinical
     Procedure Terminology, etc. These typically are constructed around
     "broader than"/"narrower than" lines and often have various
     idiosyncratic features, e.g. in MeSH the same string is found at the
     end of numerous paths, so the path is not an identifier, whereas in
     ICD the identifier is the path.
For classification types we would advocate indirect mapping rather than
direct modelling, i.e. providing pointers to/from the ontology via
annotation properties: a) because the internal structure of the
classification type hierarchy typically follows different principles that,
if imported into the ontology itself, cause confusion at best and
contradictions at worst; and b) because it provides a hook for secondary
reasoning.  Using the inferred types as a framework for indexing
classifications works very well.  For this reason, I am not entirely happy
with the phrase "indexing types".  I would prefer "classification types"
but I don't know how that fits in the library world.
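[Editorial sketch] The indirect mapping described above could be pictured
roughly as follows. This is only an illustration in Python under stated
assumptions, not Alan's implementation: the names OntologyClass and
classification_codes, the scheme label "ICD", and the code "X01" are all
hypothetical, and the single-parent hierarchy stands in for whatever
hierarchy a classifier would actually infer.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    """A class in the ontology (hypothetical structure for illustration)."""
    name: str
    parent: OntologyClass | None = None  # stand-in for the classified hierarchy
    # Annotation-style pointers into external schemes: scheme name -> codes.
    # The external scheme's own hierarchy is deliberately NOT imported.
    annotations: dict[str, set[str]] = field(default_factory=dict)

    def ancestors(self):
        """Yield the parents of this class, nearest first."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


def classification_codes(cls: OntologyClass, scheme: str) -> set[str]:
    """Return codes annotated on cls or, failing that, on its nearest ancestor."""
    for node in (cls, *cls.ancestors()):
        codes = node.annotations.get(scheme)
        if codes:
            return set(codes)
    return set()


# Hypothetical data: "X01" is a made-up code, not a real ICD entry.
disease = OntologyClass("Disease")
infection = OntologyClass("Infection", disease, {"ICD": {"X01"}})
lung_infection = OntologyClass("LungInfection", infection)

print(classification_codes(lung_infection, "ICD"))
print(classification_codes(disease, "ICD"))
```

The point of the design is that the ontology hierarchy supplies the
framework, while the classification scheme is reached only through pointers,
so its "broader than"/"narrower than" structure never has to be reconciled
with logical subsumption.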
Regards
Alan

Bernard Vatant wrote:
This is the follow-up of a debate which started last week on the Protégé
list [1].
To sum it up, the starting point was a demand from Protégé users to have
ontology editors correctly handle multiple "rdf:type" declarations for the
same instance, IOW:
- Allow declaration of multiple types through the GUI, and further editing
  of such instances
- Handle correctly multiple rdf:type declarations in imported RDF files
So far, Protégé has allowed importing OWL files with multiple rdf:type
declarations, but could neither edit them nor create them through the GUI.
Dealing with an instance of multiple classes in a GUI dynamically
constructed from class properties is not obvious (we have the same issue
in Mondeca ITM).
My first reaction was that it's certainly *not* a very good idea to push
people to create multiple rdf:type for the same owl:Individual directly
through ontology editing. Beyond technical difficulties of implementation,
using multiple types, at least in a single source ontology, often seems to
proceed from modeling confusion between "logical types", using formal
declaration of named classes and various restrictions, and what I
tentatively call "indexing types", which are defined by any property value
(and could be made explicit using "hasValue" restrictions). Logical types
are intended for expression of logical structures and constraints, and to
support processing and inference, software configuration, etc ... whereas
indexing types are intended for display, navigation, and search. In other
words, logical types represent the AI view of typing, whereas indexing
types represent the librarian/documentalist view. Indexing types can be
organized through various "schemes", as is currently discussed in the SKOS
framework, and through hierarchies that do not express logical subsumption
but various flavors of broader-narrower used by Thesauri, so-called
Taxonomies and the like ...
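[Editorial sketch] One way to picture the distinction drawn above, in
Python and under stated assumptions: each individual carries a single
logical type, while indexing types are plain property values grouped by
scheme and used only to build navigation or search indexes. The names
Individual and build_index, the scheme "subject", and the sample data are
all hypothetical and are not taken from Protégé, Mondeca ITM, or SKOS.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Individual:
    """An instance with one logical type and any number of indexing terms."""
    name: str
    logical_type: str  # the single named class used for constraints and inference
    # Indexing types as property values: scheme name -> terms in that scheme.
    index_terms: dict = field(default_factory=dict)


def build_index(individuals, scheme):
    """Group individual names by indexing term within one scheme."""
    index = defaultdict(set)
    for ind in individuals:
        for term in ind.index_terms.get(scheme, ()):
            index[term].add(ind.name)
    return dict(index)


# Hypothetical data: two documents of the same logical type, indexed differently.
docs = [
    Individual("doc1", "Report", {"subject": {"Ontologies", "Libraries"}}),
    Individual("doc2", "Report", {"subject": {"Ontologies"}}),
]

subject_index = build_index(docs, "subject")
print(sorted(subject_index["Ontologies"]))
print(sorted(subject_index["Libraries"]))
```

Here "doc1" appears under two subject terms without acquiring a second
rdf:type-like logical type, which is the separation argued for above.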
Both logical and indexing types are needed in most information
architectures, but since there is no clear-cut way to use indexing types in
RDF-OWL, nor any clear reference to them in the specs (because, IMO, the
authors of the specification came more from the AI world than from the
librarian one), there is a trend in modeling practice to handle them using
a lot of unnecessary logical types (the extreme case being total and
thoughtless refactoring of thesauri into ontologies), leading to both
crammed ontologies and a frequent need for multiple types. My thesis [2] is
that the need for multiple types is more for indexing types than for
logical ones, and therefore good practice should lead to very few (if any)
cases of multiple logical types (at least in single-source ontologies, the
question of multiple types appearing when merging ontologies being of
course a tricky one which cannot be avoided).
So I figure this group could produce some reflections and maybe
recommendations, from both modeling (OEP) and software engineering (SE)
viewpoints, on the following points:
- Logical types vs indexing types: why, when, and how to use either one.
- Ways to practically use indexing types for display, navigation, sorting
  and query.
- Use and abuse of multiple types.
- Best practices in software engineering to deal with indexing types and
  multiple types.
I've started to pull those questions together in a short paper, will
publish it as soon as I can turn them into something consistent and
readable, and volunteer to turn them into a proper draft note if there is
any interest in it from either OEP or SE, or both (even if I am not
formally a member of either of those TFs so far).
Thanks for your interest
Bernard
[1] http://comments.gmane.org/gmane.comp.misc.ontology.protege.owl/8859
[2] Mike has asked on the Protégé list whether I had references to any
literature on those issues to support my thesis, and I'm afraid I've not
found proper sources so far. I keep searching; any pointers welcome (pro
or con).
**********************************************************************************

Bernard Vatant
Senior Consultant
Knowledge Engineering
bernard.vatant@mondeca.com
"Making Sense of Content" :  http://www.mondeca.com
"Everything is a Subject" :  http://universimmedia.blogspot.com
**********************************************************************************

--
Alan L Rector
Professor of Medical Informatics
Department of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL: +44-161-275-6188/6149/7183
FAX: +44-161-275-6236/6204
Room: 2.88a, Kilburn Building
email: rector@cs.man.ac.uk
web: www.cs.man.ac.uk/mig
        www.opengalen.org
        www.clinical-escience.org
        www.co-ode.org
Received on Tuesday, 8 February 2005 17:42:23 UTC