RE: ISSUE-160: Allowing collections in semantic relationships

Dear all,

As far as Eurovoc is concerned, the following modelling could be handled as an extension.

Eurovoc
- It is one thesaurus (a concept scheme).
- It has several micro-thesauri (also concept schemes),
  - each ranging over a subset of the concepts in the overall thesaurus;
  - for the concepts in a micro-thesaurus, no associations hold other than those present in the overall thesaurus.
- The micro-thesauri are grouped (actually partitioned) into domains.

Structural extensions of SKOS required to model this:
- a construct/constraint to validate that a micro-thesaurus ranges over a subset of the concepts of the overall thesaurus
  (example: the Concept group)
- a Domain as a superstructure of micro-thesauri
  (could be a "super" Concept group)
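
To make the two constraints above concrete, here is a minimal Python sketch of how they might be checked. It is only an illustration under my own assumptions: the identifiers (c1, mt_a, domain_x, ...) and the function names are hypothetical, not real Eurovoc codes or part of any SKOS tool, and concept schemes are reduced to plain sets.

```python
# Hypothetical toy data; identifiers are illustrative, not real Eurovoc codes.
thesaurus_concepts = {"c1", "c2", "c3", "c4"}
micro_thesauri = {
    "mt_a": {"c1", "c2"},
    "mt_b": {"c3", "c4"},
}
domains = {
    "domain_x": {"mt_a"},
    "domain_y": {"mt_b"},
}


def micro_thesauri_outside_scheme(all_concepts, micro):
    """Return, per micro-thesaurus, the concepts missing from the overall thesaurus."""
    return {
        name: concepts - all_concepts
        for name, concepts in micro.items()
        if not concepts <= all_concepts
    }


def domains_partition_micro_thesauri(domains, micro):
    """True if every micro-thesaurus belongs to exactly one domain."""
    assigned = [mt for members in domains.values() for mt in members]
    return len(assigned) == len(set(assigned)) and set(assigned) == set(micro)


violations = micro_thesauri_outside_scheme(thesaurus_concepts, micro_thesauri)
is_partition = domains_partition_micro_thesauri(domains, micro_thesauri)
```

With the toy data, `violations` is empty and `is_partition` is true. The rule that no extra associations hold inside a micro-thesaurus is not checked here; it would need the relation triples as well.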

Thesauri typically also contain thesaurus arrays.
The difference, as I understand it, between a thesaurus array and a concept group is that the concepts in an array are siblings,
whereas the concepts in a concept group do not have this constraint.
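
A possible way to express that distinction as a check, again only as a sketch: I assume here that "siblings" means sharing at least one direct broader concept, and the concept names and the `broader` mapping are hypothetical.

```python
# Hypothetical broader links: concept -> set of its direct broader concepts.
broader = {
    "apple": {"fruit"},
    "pear": {"fruit"},
    "carrot": {"vegetable"},
}


def are_siblings(members, broader):
    """True if all members share at least one direct broader concept."""
    parent_sets = [broader.get(m, set()) for m in members]
    return bool(parent_sets) and bool(set.intersection(*parent_sets))


# A thesaurus array would have to pass this check; a concept group would not.
array_ok = are_siblings(["apple", "pear"], broader)
group_only = not are_siblings(["apple", "carrot"], broader)
```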

Thanks for any comments.

Johan De Smedt

-----Original Message-----
From: public-esw-thes-request@w3.org [mailto:public-esw-thes-request@w3.org] On Behalf Of Dupriez Christophe
Sent: Tuesday, 16 December, 2008 14:44
To: Aida Slavic; Thomas Baker
Cc: Antoine Isaac; public-swd-wg@w3.org; public-esw-thes@w3.org
Subject: Re: ISSUE-160: Allowing collections in semantic relationships


Hi Aida and Thomas,

First, I see that my very first answer to Aida did not go through:

--- Reaction to Aida's first message:

I strongly support Aida's position. We need a standard that correctly represents the prominent features of what we have been doing since the 80s. At the very least: Eurovoc, which is a very good example of ISO 5964, and MeSH, which is the de facto standard for all life sciences.

In a way, I would say that the ISO standards (monolingual and multilingual thesauri), which have always been a reference for the whole profession, plus MeSH, which is the most successful big thesaurus, are the MINIMUM that SKOS must cover.

Personally, I am happy with the concept of Collection to represent an arbitrary ("purpose"-oriented) subset within a Scheme. For example, in a business system, "userLangage" can be the collection, within the scheme "language", of the languages supported for interacting with users.

Looking at MeSH, there is an entity which looks like what you sometimes call a Collection: the Descriptor. A Descriptor is a group of Concepts (in both the MeSH and the SKOS sense) that are "blurred" together for indexing and retrieval purposes.
http://www.nlm.nih.gov/mesh/concept_structure.html
http://www.nlm.nih.gov/mesh/redefine.html
http://www.nlm.nih.gov/mesh/2009/download/xml_data_elements.html

Descriptors are placed in a classification tree (broader/narrower hierarchies for indexing/retrieval purposes, not for "reasoning" purposes). Descriptors and their hierarchies are retrieval tools (for humans), not reasoning tools (for machines).

SKOS would definitely benefit from a structured effort that takes the ISO standards and MeSH and then looks at their direct, simple and future-proof representation in SKOS. We must build on past practical experience.

I would like to state here what is, for me, the major difference between SKOS and OWL... SKOS is meant to provide control data for a tool which links users and applications (terms, translations, synonyms, indexing/retrieval hierarchies, classifications linking users to concepts). OWL is meant to provide control data for software application decisions (logical relations between concepts).

If this is true, SKOS must provide the necessary data to "drive" the users from their representation of the world to the concepts managed by the computer application (and vice-versa: to expose the application in a meaningful way for users).

I work in a Poison Centre, where those considerations are judged in the context of vital/urgent retrieval and analysis of information. We have used thesauri for decades and we are looking to SKOS to make them future-proof.

--- Following Thomas's message:

I agree with you "in theory". The practical problem I have is that, unlike UniMARC and other library initiatives of the past, it is very difficult to find groups working on a DCMI profile for a given need. Also, the grammar of DC field content is not specified as precisely as it is in MARC+ISBD.

I am working with medical articles (Medline XML is the de facto standard), music records (not for sale, for selection by conductors), music scores and regular documents. I wanted to align my DC use with existing profiles but I did not find any group working on this. In the end, I made my own and I will adapt to any future standard using XSLT crosswalks. It is also not so difficult to change field names in DSpace applications.

With SKOS, we are looking to define a sizeable and consistent nucleus (able to cope with known needs) that can be enriched with RDF if one wants to address unforeseen needs. I used SKOS as the data model for an application integrated into DSpace and I am rather happy with it for now (live production will start in the coming weeks). It imports ConceptSchemes from SQL views, tab-delimited files and XML, and exports them to XML and through a Java API. I still have to add RIO to import/export RDF triples. But I have an XSD for an XML representation of a SKOS data structure (which is something one might also want to standardize). The XML files can be edited with JAXE, for instance. Supporting RDF will allow my users to use Protege/SKOS.

Have a nice day,

Christophe

--- On Tue 16.12.08, Thomas Baker <baker@sub.uni-goettingen.de> wrote:

> From: Thomas Baker <baker@sub.uni-goettingen.de>
> Subject: Re: ISSUE-160: Allowing collections in semantic relationships
> To: "Dupriez Christophe" <christophe_dupriez@yahoo.fr>
> Cc: "Aida Slavic" <aida@acorweb.net>, "Antoine Isaac" <aisaac@few.vu.nl>, "public-swd-wg@w3.org" <public-swd-wg@w3.org>, "public-esw-thes@w3.org" <public-esw-thes@w3.org>
> Date: Tuesday, 16 December 2008, 12:14
> Hi Christophe,
> 
> On Tue, Dec 16, 2008 at 09:59:56AM +0000, Dupriez Christophe wrote:
> > MARC is very complex, OK. Dublin Core has provided a lowest
> > common denominator for exchanges between human users. But
> > Dublin Core has forgotten many of MARC qualities (semantical
> > precision for instance) and has not really benefitted from
> > the knowledge of MARC pitfalls (semantical adequation of
> > data for foreseen real purposes). Dublin Core is correct for
> > "information discovery" but is now used for "information
> > management" which is a painful problem.
> 
> I wanted to point out that "Dublin Core" is more than a set
> of fifteen elements used with string values (a usage which
> is now referred to as "Simple Dublin Core").
> 
> The fifteen elements are part of a larger vocabulary "DCMI
> Metadata Terms" [1] which, as RDF properties and classes,
> are just as extensible as properties and classes in SKOS.
> A "Dublin Core application profile" [2] uses properties
> from RDF vocabularies, as needed, to address specific real
> purposes.  Most of the properties in DCMI Metadata Terms also
> have formally defined ranges -- more for purposes of machine
> processing than for exchanges between human users.
> 
> There is an interesting parallel between the design trade-offs
> described by Antoine with respect to the specificity or generic
> nature of SKOS and the specificity of the RDF vocabularies
> defined around the fifteen-element Dublin Core.  I do not
> believe there is a "perfect" balance between simplicity and
> complexity; rather, the solution lies in providing mechanisms
> for principled extensibility.
> 
> I'm not sure if this addresses your point about "semantical
> adequation of data", but the extensibility of the
> vocabularies plus the notion of mixed-vocabulary profiles
> means that profiles can be designed to be as complex or
> management-oriented as needed.
> 
> Tom (who also works with DCMI)
> 
> [1] http://dublincore.org/documents/dcmi-terms/ (see also
>     http://yoyodesign.org/doc/dcmi/dcmi-terms/ in French)
> [2] http://dublincore.org/documents/2008/01/14/singapore-framework/
> 
> -- 
> Tom Baker <tbaker@tbaker.de>


      

Received on Tuesday, 16 December 2008 14:06:33 UTC