- From: Phil Tetlow <philip.tetlow@uk.ibm.com>
- Date: Tue, 8 Feb 2005 12:46:03 -0500
- To: "Bernard Vatant" <bernard.vatant@mondeca.com>, "SWBPD" <public-swbp-wg@w3.org>
- Cc: rector@cs.man.ac.uk
Bernard,

FWIW... This is a very interesting issue and one that I think may well be at odds with some SE communities. I tend to agree with Alan that the notion of 'secondary reasoning' based on classification schemes appears more relevant. I further consider that we should be careful when using the term 'index'. Some schools of thought would consider this to be a physical implementation of classification rather than a logical one, as I believe you originally intended?

Also, I think it may well be worth mentioning that some new methodology ideas are starting to emerge around the concept of 'Aspects' (e.g. security is a particular nonfunctional aspect of a domain). Aspects are really, in my understanding, methods of decomposing and understanding a problem via multiple categorisation schemes. Using such an approach, one could argue that there is no real notion of 'primary', 'secondary' or any other order of classification. Instead all such schemes are equal until viewed via very specific contexts, at which point their order becomes implicit. Indeed, perhaps there may be some mileage in adopting the term 'Aspect' as a label for such overlaying classification schemes?

So, in the case of ICD, one could realistically think about the following aspects:

1. Clinical
2. Terminal
3. Tropical
etc?

Just an idea....

Phil Tetlow
Senior Consultant
IBM Business Consulting Services
Mobile. (+44) 7740 923328

"Bernard Vatant" <bernard.vatant@mondeca.com>
Sent by: public-swbp-wg-request@w3.org
08/02/2005 10:34
To: "SWBPD" <public-swbp-wg@w3.org>
cc:
Subject: RE: [OEP] [SE] Logical vs Indexing - multiple Types

Alan

I see we are basically in agreement. OK for the distinction between asserted vs inferred types. In my mind they were both under "logical types". OK also for the notion of "secondary reasoning" based on classification schemes. I had exactly the same kind of remark from my boss a few hours ago :)

As for the vocabulary ... indexing or classification?
I've used both, and I'm agnostic on the word itself. French people in the library world don't like to use "classification" in French in this context and prefer "indexation" (which does not exist in English, right?). In any case, we have to communicate correctly with the library community, whatever its natural language :))

Best

Bernard

-----Original Message-----
From: Alan Rector [mailto:rector@cs.man.ac.uk]
Sent: Tuesday, 8 February 2005 14:48
To: Bernard Vatant
Cc: SWBPD
Subject: Re: [OEP] [SE] Logical vs Indexing - multiple Types

Bernard

This is an area where we have both experience and strong views. I think you actually need to distinguish three different sorts of Types

- Asserted types. We would agree that any primitive should have only one asserted type. See "Modularisation of domain ontologies written in description logics and OWL"

- Inferred types. One of the major reasons for using a classifier is to manage complex multi-hierarchies of defined types. These can be used for indexing but may also be used for other purposes.

- Indexing classifications/Types. For us, things such as the Medical Subject Headings (MeSH) from the library world but also things like the International Classification of Diseases, the Clinical Procedure Terminology, etc. These typically are constructed along "broader than"/"narrower than" lines and often have various idiosyncratic features, e.g. in MeSH the same string is found at the end of numerous paths, so the path is not an identifier, whereas in ICD the identifier is the path.

For classification types we would advocate indirect mapping rather than direct modelling - i.e. providing pointers to/from the ontology via annotation properties - a) because the internal structure of the classification type hierarchy typically follows different principles that, if imported into the ontology itself, cause confusion at best and contradictions at worst; and b) because it provides a hook for secondary reasoning.
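As a rough illustration of the indirect mapping described above, the sketch below keeps the ontology's own hierarchy separate from an external classification and links the two only through annotation-style pointers. It is plain Python rather than OWL, and everything in it is an assumption made for illustration: the class names, the scheme label "EXAMPLE-SCHEME", the codes "X1" and "X1.2", and the helper functions are hypothetical and are not taken from ICD, MeSH, or any tool mentioned in this thread.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class OntologyClass:
    """A class in the ontology proper, with a single asserted parent."""
    name: str
    parent: OntologyClass | None = None
    # Annotation-style pointers: scheme name -> set of external codes.
    # They carry no logical meaning inside the ontology itself.
    annotations: dict[str, set[str]] = field(default_factory=dict)

    def ancestors(self):
        """Yield the asserted parents, nearest first."""
        node = self.parent
        while node is not None:
            yield node
            node = node.parent


def codes_for(cls, scheme):
    """Codes pointed to by the class itself or by any of its ancestors."""
    found = set(cls.annotations.get(scheme, set()))
    for ancestor in cls.ancestors():
        found |= ancestor.annotations.get(scheme, set())
    return found


def classes_for(code, scheme, classes):
    """Reverse lookup: names of ontology classes pointing at an external code."""
    return [c.name for c in classes if code in c.annotations.get(scheme, set())]


SCHEME = "EXAMPLE-SCHEME"  # hypothetical classification scheme
disease = OntologyClass("Disease")
infection = OntologyClass("Infection", parent=disease,
                          annotations={SCHEME: {"X1"}})
pneumonia = OntologyClass("BacterialPneumonia", parent=infection,
                          annotations={SCHEME: {"X1.2"}})

print(codes_for(pneumonia, SCHEME))
print(classes_for("X1.2", SCHEME, [disease, infection, pneumonia]))
```

The external scheme's own broader/narrower structure is never imported here; only the pointers are stored, so walking the ontology hierarchy to collect codes is one possible reading of the "hook for secondary reasoning".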
Using the inferred types as a framework for indexing classifications works very well. For this reason, I am not entirely happy with the phrase "indexing types". I would prefer "classification types" but I don't know how that fits in the library world.

Regards

Alan

Bernard Vatant wrote:

This is the follow-up of a debate which started last week on the Protégé list [1]

To sum it up, the starting point was a demand from Protégé users to have ontology editors correctly handle multiple "rdf:type" declarations for the same instance, IOW:

- Allow declaration of multiple types through the GUI, and further editing of such instances
- Correctly handle multiple rdf:type declarations in imported RDF files

So far, Protégé has allowed importing OWL files with multiple rdf:type declarations, but could neither edit them nor create them through the GUI. Dealing with an instance of multiple classes in a GUI dynamically constructed from class properties is not obvious (we have the same issue in Mondeca ITM).

My first reaction was that it's certainly *not* a very good idea to push people to create multiple rdf:type declarations for the same owl:Individual directly through ontology editing. Beyond technical difficulties of implementation, using multiple types, at least in a single-source ontology, often seems to proceed from a modeling confusion between "logical types", which use formal declarations of named classes and various restrictions, and what I tentatively call "indexing types", which are defined by any property value (and could be made explicit using "hasValue" restrictions).

Logical types are intended for expression of logical structures, constraints, support of processing and inference, software configuration, etc ... whereas indexing types are intended for display, navigation, and search. In other words, logical types represent the AI view of typing, whereas indexing types represent the librarian/documentalist view.
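To make the distinction concrete, here is a minimal sketch in plain Python (not RDF/OWL) in which each individual carries exactly one logical type, while indexing types are derived from property values in the spirit of a "hasValue" restriction. The names `Individual`, `indexing_types`, `build_index` and the sample documents are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Individual:
    name: str
    logical_type: str  # exactly one asserted logical type
    # property name -> list of values, e.g. {"subject": ["ontologies"]}
    properties: dict = field(default_factory=dict)


def indexing_types(individual, facets):
    """Derive indexing types from property values (one per property/value pair)."""
    return {
        f"{prop}={value}"
        for prop in facets
        for value in individual.properties.get(prop, [])
    }


def build_index(individuals, facets):
    """Map each indexing type to the names of the individuals it applies to."""
    index = {}
    for individual in individuals:
        for key in indexing_types(individual, facets):
            index.setdefault(key, []).append(individual.name)
    return index


doc1 = Individual("doc1", "Report",
                  {"subject": ["ontologies", "libraries"], "language": ["fr"]})
doc2 = Individual("doc2", "Article",
                  {"subject": ["ontologies"], "language": ["en"]})

index = build_index([doc1, doc2], facets=["subject", "language"])
print(index["subject=ontologies"])
```

In this sketch an individual can fall under any number of indexing types without ever acquiring a second logical type, which is the separation argued for above; the index serves display, navigation, and search only.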
Indexing types can be organized through various "schemes", as is currently discussed in the SKOS framework, and through hierarchies that do not express logical subsumption but various flavors of broader-narrower used by thesauri, so-called taxonomies and the like ...

Both logical and indexing types are needed in most information architectures. But since there is no clear-cut way to use indexing types in RDF-OWL, nor any clear reference to them in the specs (because, IMO, the authors of the specification came more from the AI world than from the librarian one), there is a trend in modeling practice to handle them using a lot of unnecessary logical types (the extreme case being total and thoughtless refactoring of thesauri into ontologies), leading both to crammed ontologies and to a frequent need for multiple types.

My thesis is [2] that the need for multiple types is more for indexing types than for logical ones, and therefore good practice should lead to very few (if any) cases of multiple logical types (at least in single-source ontologies, the question of multiple types appearing when merging ontologies being of course a tricky one which cannot be avoided).

So I figure this group could produce some reflections and maybe recommendations, from both modeling (OEP) and software engineering (SE) viewpoints, on the following points:

- Logical types vs indexing types: why, when, and how to use either one.
- Ways to practically use indexing types for display, navigation, sorting and query.
- Use and abuse of multiple types
- Best practices in software engineering to deal with indexing types and multiple types

I've started to tweak those questions together in a short paper, will publish it as soon as I can turn them into something consistent and readable, and volunteer to turn them into a proper draft note if there is any interest in it from either OEP or SE, or both (even if I am not formally a member of either of those TFs so far).
Thanks for your interest

Bernard

[1] http://comments.gmane.org/gmane.comp.misc.ontology.protege.owl/8859
[2] Mike asked on the Protégé list whether I had references to any literature on those issues to support my thesis, and I'm afraid I have not found proper sources so far. I keep searching; any pointers welcome (pro or con).

**********************************************************************************
Bernard Vatant
Senior Consultant
Knowledge Engineering
bernard.vatant@mondeca.com
"Making Sense of Content" : http://www.mondeca.com
"Everything is a Subject" : http://universimmedia.blogspot.com
**********************************************************************************

--
Alan L Rector
Professor of Medical Informatics
Department of Computer Science
University of Manchester
Manchester M13 9PL, UK
TEL: +44-161-275-6188/6149/7183
FAX: +44-161-275-6236/6204
Room: 2.88a, Kilburn Building
email: rector@cs.man.ac.uk
web: www.cs.man.ac.uk/mig
     www.opengalen.org
     www.clinical-escience.org
     www.co-ode.org
Received on Tuesday, 8 February 2005 17:42:23 UTC