Re: SKOS vocabulary as 'simplified view' of OWL ontology - use case : geographical entities

Hi all

[Re-posting this message which was sent previously in a non-text 
list-unfriendly format. Got somehow "lost in configuration" when 
shifting recently to Mozilla Thunderbird ... sorry for the noise, and 
thanks to Tom who sent me a kind notice about it]

SKOS concepts are likely to somehow 'match' similar individuals in other 
KOS, and in particular in OWL ontologies. This question has already been 
discussed with no clear answer on what should be a 
sound/recommended/best practice. I'm currently working on a typical use 
case which will maybe help to illustrate this difficult issue, namely 
geographical-administrative entities. The project involves the French 
National Institute for Statistics and Economic Studies (INSEE) [1], 
which is the official provider of facts and figures concerning the 
country and its administrative subdivisions. The project aims to define 
an RDF representation of the entities involved, informally but precisely 
defined by INSEE in its 'Code Officiel Géographique' [2]. The 
administrative structure of France being quite complex and multi-hierarchical, 
capturing it in a KOS is quite a challenge, and seems to need all the 
expressive power of OWL. The ontology of those entities relies on a 
backbone of 'administrative subdivision' relationships, and a bunch of 
constraints over those, such as: a subdivision of a 'department' is an 
instance of 'arrondissement', with a 'chef-lieu' (instance of 'city') 
which is either the unique department 'prefecture' or a 
'sous-prefecture', etc. The ontology also has to be extensible to 
similar entities in other countries, supporting e.g. the European 
nomenclature of NUTS [3].
Such constraints are useful to control ontology integrity, but many 
'light-semantic' applications, such as search engines, will actually 
need only a simplified view of this ontology, with thesaurus-like 
relationships between entities, used for semantic expansion of search. 
For such uses, a SKOS representation of geographical entities and their 
hierarchy would be good enough, and such a representation could be 
proposed as a 'simplified view' of the ontology.
So the question is : what should be the (recommended) practices to 
provide such a simplification? Two main options :

    1. Don't do that! Different uses, different semantics, different 
representations. Entities in the SKOS representation should be defined 
independently of the entities in the OWL ontology, with different URIs 
supporting different semantics. A city is not a concept, having an 
individual both of type skos:Concept and a:Geo-entity is not a good idea.
    2. Do it for semantic integration : same individual, one URI. The 
OWL representation and the SKOS representation will not be used by the 
same applications anyway, so there is no practical risk in having 
individuals being declared of type skos:Concept in thesaurus-like 
vocabularies (SKOS), and of type a:Geo-entity in ontology-like 
vocabularies (OWL). Having a single URI would be useful in an 
integrated environment using both indexing and search of documents 
indexed on geo-entities, and semantic query and inference on these entities.
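
To make the two options concrete, here is a minimal sketch in plain 
Python (no RDF library), with triples written as tuples. The 
example.org URIs, the choice of Paris as the entity, and the helper 
names are all made up for illustration; only skos:Concept and 
a:Geo-entity come from the discussion above.

```python
# Triples as plain (subject, predicate, object) tuples.
# All resource URIs below are hypothetical, for illustration only.
RDF_TYPE = "rdf:type"
SKOS_CONCEPT = "skos:Concept"
GEO_ENTITY = "a:Geo-entity"


def option1_separate_uris():
    """Option 1: one URI per representation, each with its own semantics."""
    return {
        ("http://example.org/onto/Paris", RDF_TYPE, GEO_ENTITY),
        ("http://example.org/thesaurus/Paris", RDF_TYPE, SKOS_CONCEPT),
    }


def option2_single_uri():
    """Option 2: a single URI typed both as geo-entity and as concept."""
    return {
        ("http://example.org/id/Paris", RDF_TYPE, GEO_ENTITY),
        ("http://example.org/id/Paris", RDF_TYPE, SKOS_CONCEPT),
    }


def types_by_subject(triples):
    """Group the rdf:type values of each subject."""
    result = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            result.setdefault(s, set()).add(o)
    return result
```

With option 1 the two graphs share no subject, so any link between them 
has to be asserted separately; with option 2 merging the graphs gives 
one subject carrying both types.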

Option 1 is safer, but raises the issue of semantic integration. How 
will I assert that this SKOS concept and that OWL entity are somehow 
representing the 'same' individual, and what is the meaning of this 
'same-ness'? I won't push again 'hubjects' here, although I could ;-) . 
Option 2 is my favorite these days, following the arguments pushed 
lately by Pat Hayes [4]. But what I wonder is to what extent the 
'simple' SKOS classes and properties should be tied to the 'complex' OWL 
classes and properties, for instance should we

     * Declare a:Geo-entity as a subclass of skos:Concept ?
     * Declare a:subdivisionOf as a subproperty of skos:broader ?
     * Declare a:neighbor as a subproperty of skos:related ?

Such declarations could be useful for OWL-to-SKOS migration, but are 
likely, if included in the OWL framework, to bring unsuspected and weird 
entailments ...
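
As a rough illustration of what such declarations entail, here is a 
small sketch in plain Python that applies only the two standard RDFS 
rules for rdfs:subPropertyOf and rdfs:subClassOf to the three candidate 
declarations. The ex: individuals and the function name are 
hypothetical, and a real OWL reasoner would of course derive more than 
this.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"


def rdfs_closure(triples):
    """Apply the RDFS subPropertyOf and subClassOf rules until nothing new appears."""
    inferred = set(triples)
    while True:
        new = set()
        for s, p, o in inferred:
            for s2, p2, o2 in inferred:
                # (s p o) and (p subPropertyOf q)  =>  (s q o)
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))
                # (s type C) and (C subClassOf D)  =>  (s type D)
                if p2 == SUBCLASS and p == RDF_TYPE and s2 == o:
                    new.add((s, RDF_TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new


# The three candidate declarations from the list above.
schema = {
    ("a:Geo-entity", SUBCLASS, "skos:Concept"),
    ("a:subdivisionOf", SUBPROP, "skos:broader"),
    ("a:neighbor", SUBPROP, "skos:related"),
}

# Hypothetical instance data, for illustration only.
data = {
    ("ex:Paris", RDF_TYPE, "a:Geo-entity"),
    ("ex:Paris", "a:subdivisionOf", "ex:Ile-de-France"),
    ("ex:Paris", "a:neighbor", "ex:Hauts-de-Seine"),
}

asserted = schema | data
entailed = rdfs_closure(asserted) - asserted
for triple in sorted(entailed):
    print(triple)
```

Every a:Geo-entity becomes a skos:Concept and every subdivision or 
neighbor link becomes a thesaurus link, whether or not it was meant to 
be exposed in the simplified view.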

Any ideas/suggestions on this are welcome.

Bernard

[1] http://www.insee.fr/
[2] http://www.insee.fr/fr/nom_def_met/nomenclatures/cog/index.asp (in French)
[3] http://en.wikipedia.org/wiki/Nomenclature_of_Territorial_Units_for_Statistics
[4] http://lists.w3.org/Archives/Public/public-swbp-wg/2006Jan/0139.html

-- 

Bernard Vatant

Knowledge Engineering

Mondeca
3, cité Nollez 75018 Paris France

Tel. +33 (0) 871 488 459

Mail: bernard.vatant@mondeca.com

Web: www.mondeca.com

Blog : universimmedia.blogspot.com

Received on Friday, 17 March 2006 11:39:31 UTC