
Re: Unicity of prefLabel. Was: W3C SKOS Reference Last Call (3 days left in comment period)

From: Christophe Dupriez <christophe.dupriez@destin.be>
Date: Fri, 06 Feb 2009 18:27:20 +0100
Message-ID: <498C72F8.40304@destin.be>
To: Magnus Knuth <magnus.knuth@imise.uni-leipzig.de>, SKOS <public-esw-thes@w3.org>, "public-swd-wg@w3.org" <public-swd-wg@w3.org>
Thanks Magnus! (copy to SKOS community)

Concept based vs. term based is a long-standing discussion: indexers 
probably prefer the first; linguists, lexicographers, etc. the second...
I am from the indexing world.

So what is the community's opinion about skos:disambiguationNote?
I am rather enthusiastic about it and ready to implement in my stuff... 
like broadInScheme, narrowInScheme, relatedInScheme...

Anything has to be done to put this as candidate modifications for next 

One point remaining to discuss:
In a given scheme, for a given language, should unicity be checked for 
(prefLabel+disambiguationNote)?
Do we agree?
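
To make the question concrete, here is a minimal sketch of such a check. It assumes concepts are held as plain Python dictionaries; the field names, the "ex:topics" scheme and the two "ex:sepsis_*" URIs are made up for illustration, and disambiguationNote is only the property proposed in this thread, not part of SKOS.

```python
from collections import defaultdict

# Hypothetical records: the field names are illustrative, and
# disambiguationNote is only the property proposed in this thread.
CONCEPTS = [
    {"uri": "ex:football_(sport)", "scheme": "ex:topics", "lang": "en",
     "prefLabel": "football", "disambiguationNote": "sport"},
    {"uri": "ex:football_(ball)", "scheme": "ex:topics", "lang": "en",
     "prefLabel": "football", "disambiguationNote": "sport equipment"},
    {"uri": "ex:sepsis_a", "scheme": "ex:topics", "lang": "en",
     "prefLabel": "sepsis"},
    {"uri": "ex:sepsis_b", "scheme": "ex:topics", "lang": "en",
     "prefLabel": "sepsis"},
]


def find_clashes(concepts):
    """Return {key: [uris]} for every key shared by more than one concept.

    The key is (scheme, language, prefLabel, disambiguationNote); a missing
    note is treated as an empty string.
    """
    groups = defaultdict(list)
    for concept in concepts:
        key = (
            concept["scheme"],
            concept["lang"],
            concept["prefLabel"],
            concept.get("disambiguationNote", ""),
        )
        groups[key].append(concept["uri"])
    return {key: uris for key, uris in groups.items() if len(uris) > 1}


for key, uris in find_clashes(CONCEPTS).items():
    print(key, "is shared by", uris)
```

With this key the two "football" concepts pass because their notes differ, while the two "sepsis" concepts without a note are reported as a clash.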

Have a nice week-end!


Magnus Knuth wrote:
> Hello Christophe,
> thanks for your view concerning that point.
> In my opinion user side disambiguation does not justify **demanding** 
> this rule in SKOS.
> First, SKOS is concept based in contrast to thesauri, which is why it 
> --kicks thesauris asses-- is smarter. Furthermore thesauri are not the 
> only use case for SKOS. Distinguishing concepts does not necessarily 
> rely on unique prefLabel, since there are other possibilities too. You 
> are right, using the prefLabel non uniquely demands a mechanism to 
> specify the "user side" disambiguation, which will be an application 
> task and should be solved by the application developer. But as for 
> now, having stated the unicity rule informally, enforcing this rule is 
> an application task too.
> Second, if a concept C is preferably denoted by an ambiguous term T, 
> it is worth stating this fact as it is (C skos:prefLabel "T") and not 
> to work around it by stating that it would preferably be denoted by an 
> artificial construct (C skos:prefLabel "T_(D)"). Using such a 
> construct makes it more difficult (or even impossible) to locate that 
> preferred label somewhere else, e.g. in a text. Therefore I would 
> agree on separating the disambiguation information from the prefLabel, 
> e.g. using a skos:disambiguationNote datatype property, to preserve 
> compatibility with term based systems and add a little meaning.
> Concerning the "S" in SKOS: having this rule does not simplify SKOS 
> at all.
> I do agree it is doubtful that the generation of disambiguation data 
> can be automated satisfactorily because, as you said, many problems 
> remain unsolved. Especially when knowledge bases evolve (adding 
> information, merging with other KBs), earlier disambiguation 
> information tends to become obsolete. Making clear what a concept 
> means might only be possible by a natural language definition. Why 
> should "teamgeist" [1] not be a "football (sport)"? It is a football 
> and you do sport with it; you might not be able to distinguish 
> as long as you don't know there is a "football (sport equipment)".
> Greetinx
> Magnus
> [1] http://en.wikipedia.org/wiki/Adidas_Teamgeist
> Christophe Dupriez wrote:
>> Hi Magnus,
>> I have some difficulties with your proposal.
>> If SKOS is what I believe it is (a data structure to support 
>> applications when linking users' needs to computerized resources), 
>> you need:
>> 1) for the computer: precise identification (the unambiguous "about" 
>> of each concept)
>> 2) for the user: a precise way to choose the right concept.
>> The data must allow building choices so the user can identify the 
>> concept (s)he desires. If you have:
>> "Do you need:
>> * football
>> * football
>> * football"
>> the user is nowhere.
>> With thesauri (which are term based, not concept based), it was usual 
>> to add disambiguation information in parentheses. This is 
>> straightforward and clear:
>> * football (sport equipment)
>> * football (sport)
>> * football (social phenomenon)
>> If one wanted to automate the generation of disambiguation 
>> information, it was also possible to use the scheme name: the 
>> PrefLabel had then to be unique within a (micro-)thesaurus. This is 
>> approximately what SKOS proposes and it is not as good as "human 
>> written" disambiguated labels.
>> With your proposal, we may remain with nothing left: if the unicity 
>> rule is relaxed, we need to describe a (simple?) mechanism to specify 
>> the "user side" disambiguation data. From my point of view, we have 
>> to keep the first "S" of "SKOS".
>> Further considerations about disambiguation:
>> * users tend to be limited in their ability to abstract and in 
>> their knowledge of specialized concepts.
>> * classification creators tend to use top concepts linked to readily 
>> known needs and to existing scientific domains, and bottom concepts 
>> rooted in the "daily" reality of the practitioners. The problem is 
>> with the middle of the hierarchy, where "artificial" concepts 
>> created for classification purposes often lie.
>> * disambiguation must provide the user with something he can 
>> discriminate (to choose a "football", you must know the difference 
>> between "equipment", "sport" and "social phenomenon").
>> * Selecting disambiguation information is therefore more difficult 
>> than it seems and I doubt perfection can be obtained automatically.
>> My suggestions:
>> * Keep the unicity rule for now;
>> * Transform the unicity rule on prefLabel into unicity of 
>> prefLabel+disambiguation information
>> * Design a way to automate disambiguation based on very simple rules 
>> (for instance "SchemeLabel" or "TopConceptLabel" or 
>> "BroadInSchemeLabel")
>> * Provide "manual" ways to disambiguate:
>>    ** "contextLabel" to specify a piece of text to disambiguate an 
>> ambiguous PrefLabel
>>    ** or a new semantic relation "uniqueUnder" which would link an 
>> ambiguously labelled concept to one of its broader concepts well 
>> known by most users
>> Have a nice day!
>> Christophe
>> Magnus Knuth wrote:
>>> Hello,
>>> I know I am a little late.
>>> We have been working with SKOS for a while now, so far using it mainly for 
>>> topic hierarchies. I just read the Reference and Primer once again 
>>> and found some points to mention.
>>>  * in the SKOS Primer (2008-08-29), Chap. 2.2.1, the last sentence should 
>>> read "..., it is therefore recommended that no two concepts in the 
>>> same KOS be given the same preferred lexical label --in any two 
>>> given languages-- __in any given language__." (might be a typo)
>>> Well, we cannot apply this recommendation, since we sometimes have 
>>> different concepts with identical prefLabel, e.g. 
>>> ex:football_(sport) and ex:football_(ball) both having 
>>> skos:prefLabel "football", i.e. we don't want to disambiguate these 
>>> concepts by their prefLabel but by further characteristics such as their 
>>> definition or altLabel.
>>> The same would apply when you collect alternative definitions for a 
>>> term, resulting in different concepts: e.g. the medical term "sepsis" 
>>> is defined in several variants, from "inflammatory infection" through 
>>> "blood poisoning" to "infection with organ dysfunction", and medics 
>>> use it in these various ways.
>>> So please leave it an informal recommendation.
>>>  * in the SKOS RDF (2008-05-xx and 2008-08-29) #prefLabel the 
>>> comment should therefore read "No two concepts in the same concept 
>>> scheme --may-- __should__ have the same value for skos:prefLabel in 
>>> a given language." (according to RFC 2119)
>>> Kind regards
>>> Magnus.
>>> Alistair Miles wrote:
>>>> Dear all,
>>>> There are only 3 days left in the Last Call comment period for the
>>>> Simple Knowledge Organization System SKOS Reference. For more
>>>> information see the Last Call announcement at:
>>>> http://lists.w3.org/Archives/Public/public-esw-thes/2008Sep/0001.html
>>>> Thanks to those who have already provided valuable feedback.
>>>> Kind regards,
>>>> Alistair.
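
The quoted suggestion to automate disambiguation with very simple rules ("SchemeLabel", "TopConceptLabel", "BroadInSchemeLabel"), with a manual "contextLabel" taking precedence, could look roughly like the sketch below. The dictionary keys mirror the names used in the thread but are hypothetical, none of them is a SKOS property, and the fallback order is an assumption.

```python
# Hypothetical field names, loosely following the labels named in the thread.
# The fallback order (manual text first, scheme name last) is an assumption.
QUALIFIER_FIELDS = ("contextLabel", "broadInSchemeLabel",
                    "topConceptLabel", "schemeLabel")


def display_label(concept):
    """Return 'prefLabel (qualifier)' using the first non-empty qualifier."""
    for field in QUALIFIER_FIELDS:
        qualifier = concept.get(field)
        if qualifier:
            return f"{concept['prefLabel']} ({qualifier})"
    return concept["prefLabel"]  # nothing available to disambiguate with


choices = [
    {"prefLabel": "football", "contextLabel": "sport equipment",
     "schemeLabel": "Topics"},
    {"prefLabel": "football", "broadInSchemeLabel": "sport",
     "schemeLabel": "Topics"},
    {"prefLabel": "football", "schemeLabel": "Topics"},
]
for choice in choices:
    print("*", display_label(choice))
```

The prefLabel itself stays the bare term "football", so it can still be located in running text; the parenthesised qualifier is only assembled for display.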

Received on Friday, 6 February 2009 17:24:56 UTC
