Re: Unicity of prefLabel. Was: W3C SKOS Reference Last Call (3 days left in comment period)

From: Magnus Knuth <magnus.knuth@imise.uni-leipzig.de>
Date: Fri, 06 Feb 2009 17:35:42 +0100
Message-ID: <498C66DE.8090800@imise.uni-leipzig.de>
To: Christophe Dupriez <christophe.dupriez@destin.be>
CC: public-esw-thes@w3.org

Hello Christophe,

Thanks for your view on that point.
In my opinion, user-side disambiguation does not justify **demanding** 
this rule in SKOS.

First, SKOS is concept-based, in contrast to thesauri, which is why it 
--kicks thesauri's asses-- is smarter. Furthermore, thesauri are not the 
only use case for SKOS. Distinguishing concepts does not necessarily 
rely on unique prefLabels, since there are other possibilities too. You 
are right that using prefLabel non-uniquely demands a mechanism to 
specify the "user-side" disambiguation, which will be an application 
task and should be solved by the application developer. But for now, 
with the unicity rule stated only informally, enforcing the rule is an 
application task too.
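
As a rough illustration of what "enforcing the rule in the application" 
could look like, here is a minimal Python sketch. The data layout (plain 
(concept, label, language) tuples for one concept scheme) and the function 
name find_pref_label_clashes are assumptions made for this example only; 
they are not part of SKOS or of any library.

```python
from collections import defaultdict

# Hypothetical in-memory data: (concept, label, language) tuples standing in
# for skos:prefLabel statements within a single concept scheme.
PREF_LABELS = [
    ("ex:football_(sport)", "football", "en"),
    ("ex:football_(ball)", "football", "en"),
    ("ex:sepsis", "sepsis", "en"),
]


def find_pref_label_clashes(pref_labels):
    """Map each (label, language) shared by two or more concepts to those concepts."""
    by_label = defaultdict(set)
    for concept, label, lang in pref_labels:
        by_label[(label, lang)].add(concept)
    return {
        key: sorted(concepts)
        for key, concepts in by_label.items()
        if len(concepts) > 1
    }


clashes = find_pref_label_clashes(PREF_LABELS)
for (label, lang), concepts in clashes.items():
    print(f'"{label}"@{lang} is the prefLabel of: {", ".join(concepts)}')
```

An application could treat such clashes as errors (strict unicity) or as a 
cue to show extra disambiguation data, depending on its own policy.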

Second, if a concept C is preferably denoted by an ambiguous term T, it 
is worth stating this fact as it is (C skos:prefLabel "T") rather than 
working around it and stating that C is preferably denoted by an 
artificial construct (C skos:prefLabel "T_(D)"). Using such a construct 
makes it more difficult (or even impossible) to locate that preferred 
label somewhere else, e.g. in a text. Therefore I would agree on 
separating the disambiguation information from the prefLabel, e.g. using 
a skos:disambiguationNote datatype property, to preserve compatibility 
with term-based systems and add a little meaning.
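
To make the separation concrete, here is a minimal Python sketch. Note that 
skos:disambiguationNote is only the property suggested in this thread, not 
an existing SKOS term, and the Concept class, display_label and 
find_mentions are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Concept:
    uri: str
    pref_label: str  # the plain term, as in C skos:prefLabel "T"
    # Assumption: models the skos:disambiguationNote suggested in this thread.
    disambiguation_note: Optional[str] = None


def display_label(concept: Concept) -> str:
    """Build the user-facing label; the qualifier is added only for display."""
    if concept.disambiguation_note:
        return f"{concept.pref_label} ({concept.disambiguation_note})"
    return concept.pref_label


def find_mentions(concepts: List[Concept], text: str) -> List[str]:
    """Return the URIs of concepts whose plain prefLabel occurs in the text."""
    lowered = text.lower()
    return [c.uri for c in concepts if c.pref_label.lower() in lowered]


concepts = [
    Concept("ex:football_(sport)", "football", "sport"),
    Concept("ex:football_(ball)", "football", "sport equipment"),
]
labels = [display_label(c) for c in concepts]
mentions = find_mentions(concepts, "He kicked the football over the fence.")
```

Because the stored prefLabel stays "football", a simple text match still 
finds both candidate concepts, while the pick list shown to the user can 
carry the qualifier.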

Concerning the "S" in SKOS: having this rule does not simplify SKOS at all.

I do agree that it is doubtful whether the generation of disambiguation 
data can be automated satisfactorily, because, as you said, many problems 
remain unsolved. Especially when knowledge bases evolve (adding 
information, merging with other KBs), earlier disambiguation 
information tends to become obsolete. Making clear what a concept means 
might only be possible through a natural language definition. Why should 
"teamgeist" [1] not be a "football (sport)"? It is a football and you 
do sport with it. You might not be able to tell the difference as long as 
you don't know there is a "football (sport equipment)".

Greetinx

Magnus

[1] http://en.wikipedia.org/wiki/Adidas_Teamgeist


Christophe Dupriez wrote:
> Hi Magnus,
>
> I have some difficulties with your proposal.
>
> If SKOS is what I believe it is (a data structure to support 
> applications in linking users' needs to computerized resources), you 
> need:
> 1) for the computer: precise identification (the unambiguous "about" 
> of each concept)
> 2) for the user: a precise way to choose the right concept.
>
> The data must make it possible to build choices so the user can 
> identify the concept (s)he desires. If you have:
> "Do you need:
> * football
> * football
> * football"
> the user gets nowhere.
>
> With thesauri (which are term-based, not concept-based), it was usual 
> to add disambiguation information in parentheses. This is 
> straightforward and clear:
> * football (sport equipment)
> * football (sport)
> * football (social phenomenon)
>
> If one wanted to automate the generation of disambiguation 
> information, it was also possible to use the scheme name: the 
> PrefLabel then had to be unique within a (micro-)thesaurus. This is 
> approximately what SKOS proposes, and it is not as good as "human 
> written" disambiguated labels.
>
> With your proposal, we may be left with nothing: if the unicity 
> rule is relaxed, we need to describe a (simple?) mechanism to specify 
> the "user side" disambiguation data. From my point of view, we have to 
> keep the first "S" of "SKOS".
>
> Further considerations about disambiguation:
> * users tend to be limited in their capacity for abstraction and in 
> their knowledge of specialized concepts.
> * classification creators tend to use top concepts linked to readily 
> known needs and to existing scientific domains, and bottom concepts 
> rooted in the "daily" reality of the practitioners. The problem is with 
> the middle of the hierarchy, where "artificial" concepts created 
> for classification purposes often lie.
> * disambiguation must provide the user with something he can 
> discriminate (to choose a "football", you must know the difference 
> between "equipment", "sport" and "social phenomenon").
> * Selecting disambiguation information is therefore more difficult 
> than it seems and I doubt perfection can be obtained automatically.
>
> My suggestions:
> * Keep the unicity rule for now;
> * Transform the unicity rule on prefLabel into unicity of 
> prefLabel+disambiguation information
> * Design a way to automate disambiguation based on very simple rules 
> (for instance "SchemeLabel" or "TopConceptLabel" or "BroadInSchemeLabel")
> * Provide "manual" ways to disambiguate:
>    ** "contextLabel" to specify a piece of text to disambiguate an 
> ambiguous PrefLabel
>    ** or a new semantic relation "uniqueUnder" which would link an 
> ambiguously labelled concept to one of its broader concepts well known 
> by most users
>
> Have a nice day!
>
> Christophe
>
> Magnus Knuth wrote:
>>
>> Hello,
>>
>> I know I am a little late.
>>
>> We have been working with SKOS for a while now, using it so far mainly 
>> for topic hierarchies. I just read the Reference and Primer once again 
>> and found some points to mention.
>>
>>  * in the SKOS Primer (2008-08-29) Chap. 2.2.1 the last sentence should 
>> read "..., it is therefore recommended that no two concepts in the 
>> same KOS be given the same preferred lexical label --in any two given 
>> languages-- __in any given language__." (might be a typo)
>>
>> Well, we cannot apply this recommendation, since we sometimes have 
>> different concepts with identical prefLabels, e.g. ex:football_(sport) 
>> and ex:football_(ball) both having skos:prefLabel "football", i.e. we 
>> don't want to disambiguate these concepts by their prefLabel but by 
>> further characteristics such as their definition or altLabel.
>> The same would apply when you collect alternative definitions for a 
>> term, resulting in different concepts, e.g. the medical term "sepsis" 
>> is defined in several ways, from "inflammatory infection" through 
>> "blood poisoning" to "infection with organ dysfunction", and medics 
>> use it in these various ways.
>>
>> So please leave it as an informal recommendation.
>>
>>  * in the SKOS RDF (2008-05-xx and 2008-08-29) #prefLabel the comment 
>> should therefore read "No two concepts in the same concept scheme 
>> --may-- __should__ have the same value for skos:prefLabel in a given 
>> language." (according to RFC 2119)
>>
>> Kind regards
>>
>> Magnus.
>>
>>
>> Alistair Miles wrote:
>>> Dear all,
>>>
>>> There are only 3 days left in the Last Call comment period for the
>>> Simple Knowledge Organization System SKOS Reference. For more
>>> information see the Last Call announcement at:
>>>
>>>   http://lists.w3.org/Archives/Public/public-esw-thes/2008Sep/0001.html
>>> Thanks to those who have already provided valuable feedback.
>>>
>>> Kind regards,
>>>
>>> Alistair.
>>>
>>>   
>
Received on Friday, 6 February 2009 16:36:24 GMT
