Re: How to use notations from classification schemes in SKOS from Alistair Miles on 2006-02-15 (public-esw-thes@w3.org from February 2006)

From: Alistair Miles <a.j.miles@rl.ac.uk>
Date: Wed, 15 Feb 2006 17:12:53 +0000
To: public-esw-thes@w3.org
Message-ID: <43F36115.40406@rl.ac.uk>

Hi all,

Regarding the question of scope for SKOS Core ... that's why I'm keen 
for us to work on a use cases and requirements document :)

My personal feeling at the moment is that the basic motivating use cases 
for SKOS Core should all involve information retrieval using controlled, 
structured vocabularies. So thesauri, classification schemes, subject 
heading systems and taxonomies are in scope. Folksonomies are in a grey 
area. Terminologies and glossaries are out of scope (which doesn't mean 
that SKOS Core can't be used, it just means that there is no necessity 
for SKOS Core to support them).

I don't think it makes sense to divide thesauri from classification 
schemes, because the underlying mathematical and computational models 
describing retrieval systems that use them are extremely similar and can 
easily be generalised.

I don't know enough about subject heading systems yet, but I don't want 
to rule them out, if their fundamental purpose is the same (i.e. 
indexing and retrieval).

Regarding use of skos:prefLabel, the original intention was to use this 
property only to give lexical labels that are in fact words or 
collocations of words from some natural language. I.e. skos:prefLabel 
should always be used with a literal that has a language tag. Therefore 
I would suggest that, for classification schemes, captions be given via 
the skos:prefLabel property, even where two concepts in the same scheme 
have the same caption (see the note on integrity constraints below). The 
notation should be given via some other property (that's why 
skos:notation would be useful).
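
As a rough illustration of that split, here is a minimal Python sketch in which the caption is held as a language-tagged preferred label and the notation lives in a separate field. The `Concept` class, its field names, the URIs and the sample notations and captions are all made up for the example; none of them come from SKOS Core, and skos:notation itself is only a proposal at this point.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical in-memory stand-in for a skos:Concept."""
    uri: str
    notation: str  # would map to a property such as the proposed skos:notation
    pref_labels: dict = field(default_factory=dict)  # language tag -> caption


# Two classes in the same scheme that share a caption but differ in notation.
a = Concept("http://example.org/class/1", "A.1", {"en": "General works"})
b = Concept("http://example.org/class/2", "B.1", {"en": "General works"})

same_caption = a.pref_labels["en"] == b.pref_labels["en"]
distinct_notation = a.notation != b.notation
print(same_caption, distinct_notation)
```

The point of the sketch is only that the notation, not the caption, is what tells the two classes apart.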

Note that some thesauri use both notations and preferred terms (see e.g. 
BS8723-2).

In the draft 'SKOS Core Integrity Testing and Quality Assurance for 
Instance Data' [1] I structured the tests in a very deliberate way ...

The 'constraint' that no resource may have more than one preferred label 
per language is expressed by test B1 (Preferred Lexical Label 
Cardinality). This is part of test group B (Labelling Integrity Tests).
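
A minimal sketch of what a B1-style check might look like, assuming the graph is simplified to a list of (subject, predicate, (text, lang)) tuples. The function name `b1_violations` and the data layout are assumptions made for illustration; the draft does not prescribe an implementation.

```python
from collections import Counter

PREF_LABEL = "skos:prefLabel"


def b1_violations(triples):
    """Return sorted (resource, lang) pairs with more than one preferred label.

    `triples` is an iterable of (subject, predicate, (text, lang)) tuples,
    a simplified stand-in for an RDF graph (hypothetical layout).
    """
    # An RDF graph is a set of triples, so drop exact duplicates first.
    counts = Counter(
        (s, obj[1]) for s, p, obj in set(triples) if p == PREF_LABEL
    )
    return sorted(key for key, n in counts.items() if n > 1)


graph = [
    ("ex:c1", PREF_LABEL, ("Animals", "en")),
    ("ex:c1", PREF_LABEL, ("Fauna", "en")),    # second English prefLabel
    ("ex:c1", PREF_LABEL, ("Animaux", "fr")),  # different language, allowed
    ("ex:c2", PREF_LABEL, ("Plants", "en")),
]
print(b1_violations(graph))
```

Because the check only looks at one resource at a time, it can be run on a fragment of a scheme.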

The 'constraint' that no two resources in a scheme may have the same 
preferred label in any given language is expressed by test C2 (Preferred 
Lexical Label Uniqueness in Scheme). This is part of test group C 
(Controlled Vocabulary Labelling Integrity Tests).
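
A comparable sketch for a C2-style check, under the assumption that scheme membership is stated with skos:inScheme and that the graph is again a list of simple tuples. The function name `c2_violations` and the data layout are hypothetical.

```python
from collections import defaultdict

PREF_LABEL = "skos:prefLabel"
IN_SCHEME = "skos:inScheme"


def c2_violations(triples):
    """Return {(scheme, text, lang): [resources]} for labels shared in a scheme."""
    schemes = defaultdict(set)  # resource -> schemes it belongs to
    labels = defaultdict(set)   # resource -> {(text, lang)}
    for s, p, o in triples:
        if p == IN_SCHEME:
            schemes[s].add(o)
        elif p == PREF_LABEL:
            labels[s].add(o)

    users = defaultdict(set)    # (scheme, text, lang) -> resources using it
    for res, labs in labels.items():
        for scheme in schemes.get(res, ()):
            for text, lang in labs:
                users[(scheme, text, lang)].add(res)
    return {key: sorted(res) for key, res in users.items() if len(res) > 1}


graph = [
    ("ex:c1", IN_SCHEME, "ex:s"),
    ("ex:c2", IN_SCHEME, "ex:s"),
    ("ex:c3", IN_SCHEME, "ex:t"),  # same caption, but in another scheme
    ("ex:c1", PREF_LABEL, ("General works", "en")),
    ("ex:c2", PREF_LABEL, ("General works", "en")),
    ("ex:c3", PREF_LABEL, ("General works", "en")),
]
print(c2_violations(graph))
```

Unlike the per-resource check, this one compares resources against each other, so its result is only trustworthy when the graph holds the whole scheme.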

Test groups A (Semantic Relation Integrity Tests) and B (Labelling 
Integrity Tests) are intended to capture semantics that are part of the 
interpretation of the SKOS Core Vocabulary, and can be applied in all 
contexts (i.e. for any use of SKOS Core). They are also tests that can 
meaningfully be applied in an open-world situation - i.e. they can be 
meaningfully applied to a graph that represents a fragment of a concept 
scheme.

The 'Basic Integrity Test Case' consists of groups A and B.

Test groups C (Controlled Vocabulary Labelling Integrity Tests) and D 
(Scheme Structural Tests) are intended to capture semantics that are 
appropriate to some (but not all) uses of SKOS Core - e.g. the 
representation of a thesaurus.  I.e. they are optional, and are only 
useful in certain contexts. They are also only meaningful when applied 
to a graph that represents the whole of a concept scheme.

The 'Thesaurus Compatibility Test Case' consists of groups A, B, C and D.

This design is intended to handle the situation where some types of 
'concept scheme' legitimately allow two 'concepts' to have the same 
preferred label, whereas other types of 'concept scheme' don't allow this.
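
To make the layering concrete, here is a small sketch of how the two test cases might be composed from the four groups. The `TEST_CASES` table and `run_test_case` function are hypothetical, and the per-group checks are stubs standing in for the real tests.

```python
# Which test groups make up each test case (from the draft's structure).
TEST_CASES = {
    "basic": ("A", "B"),
    "thesaurus": ("A", "B", "C", "D"),
}


def run_test_case(name, graph, group_checks):
    """Run each group's check on `graph`; return {group: list of violations}."""
    return {group: group_checks[group](graph) for group in TEST_CASES[name]}


# Stub checks for a classification scheme in which two classes share a caption.
checks = {
    "A": lambda g: [],
    "B": lambda g: [],
    "C": lambda g: ["duplicate caption 'General works'@en"],
    "D": lambda g: [],
}

basic = run_test_case("basic", [], checks)
thesaurus = run_test_case("thesaurus", [], checks)
print(basic)
print(thesaurus)
```

With these stubs, the scheme reports no violations under the basic test case and one under group C of the thesaurus test case, which is the situation the design is meant to allow for.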

Cheers,

Al.

[1] http://isegserv.itd.rl.ac.uk/cvs-public/~checkout~/skos/drafts/integrity.html?rev=1.7


Stella Dextre Clarke wrote:
> Lars
>>> Therefore I would repeat Andy's warning that the notation in a
>>> classification scheme corresponds to the preferred term (preflabel)
>>> in a thesaurus. The captions do not correspond exactly to
>>> non-preferred terms, so I'm not sure whether it is a good idea to
>>> treat them as altlabel. I suppose it depends on how they are
>>> interpreted by the user, or user application.
>> Well, they are alternatives, at least in a way. What name 
>> would you give them?
> I call them "captions". Some people call them "class headings". They do
> not behave in the same way as non-preferred terms because, as Leonard
> has pointed out, sometimes they need to be interpreted in the light of
> their parent class, and perhaps grandparent class. In other words, they
> are often incomplete as names or labels.
>>  
>>> Now, how to handle it in SKOS... back on my old hobby horse: if you
>>> try and make one model work for several different types of
>>> application, you may have to make the model quite complicated and
>>> you can end up with confusion.
>> If I understand you correctly, you're actually saying that we 
>> need a new ontology for describing classification schemes. In 
>> that case we need to redefine the skos quest, since "SKOS is 
>> an area of work developing specifications and standards to 
>> support the use of knowledge organisation systems (KOS) such 
>> as thesauri, *classification schemes*, subject heading lists, 
>> taxonomies, other types of controlled vocabulary, and perhaps 
>> also terminologies and glossaries, within the framework of 
>> the Semantic Web." [2] (emphasis mine).
> That "mission quest" was adopted only recently. Previously the scope had
> been narrower. I argued against the change at the time, but not enough
> people were persuaded. And I can see there is a benefit in having one
> scheme to cover all vocabulary types. But on the other hand, you lose
> some precision in describing any one of the vocabularies. So now I'll
> put the argument another way:
> 
> We already accept that ontologies are dealt with using OWL, rather than
> SKOS Core. So if you have a separate scheme for ontologies, and
> thesauri, why not also for classification schemes? Maybe Subject
> headings lists too? Terminologies and glossaries should certainly be
> separate, in my view. Now when you need to move between one of these
> vocabulary types and another, what we are doing is mapping. So perhaps
> SKOS Mapping, or an extension thereof, will handle it? Well that's just
> a passing thought, which I have not thought through. My province is the
> vocabularies themselves (with human users), and I leave it to the other
> clever people on this list to work out how machines can communicate
> them.
> 
> Over to you...
> Stella
> 
> *****************************************************
> Stella Dextre Clarke
> Information Consultant
> Luke House, West Hendred, Wantage, Oxon, OX12 8RR, UK
> Tel: 01235-833-298
> Fax: 01235-863-298
> SDClarke@LukeHouse.demon.co.uk
> *****************************************************

-- 
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email: a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440
Received on Wednesday, 15 February 2006 17:13:17 UTC