Re: SKOS and MeSH qualifiers

In message <Pine.CYG.4.58.0507110752530.34504@johndrake> on Mon, 11 Jul 
2005, Robert Watkins <rwatkins@foo-bar.org> wrote
>On Mon, 11 Jul 2005, Leonard Will wrote:
>
>> Terms with subdivisions or MeSH-type "qualifiers" are really
>> pre-coordinated strings made up of two or more thesaurus concepts. The
>> syntax and the allowed combinations for these strings would normally be
>> incorporated in rules for pre-coordination which are separate from the
>> elementary concepts and their labels in the thesaurus itself.
>>
>> [ snipped ]
>>
>> I would suggest that all concepts such as these should be recorded
>> independently as separate entries in the thesaurus and be available for
>> assignment to documents in addition to other "subject" terms.
>>
>> [ snipped ]
>>
>Perhaps I am missing something otherwise self-evident in Leonard
>Will's assessment of this issue, but with MeSH, the "qualifiers" are
>not assigned to documents but to specific terms assigned to a document.
>For example, a document might be indexed with the following MeSH terms,
>with the appropriate qualifiers following in brackets:
>
>  Articulation Disorders [etiology;therapy]
>  Down Syndrome [complications]
>  Leukemia, Lymphocytic, Acute [complications;drug therapy;pathology]
>
>As such, the document could certainly be found by searching for the
>qualifier "complications", but a post-coordinated search for "Articulation
>Disorders" restricted by the MeSH qualifier "complications" would be
>inappropriate.
>
>If this can be done in SKOS using the concepts of facets
>(owl:Restriction?)  then so much the better, but I don't have enough
>experience to see it.

The problem that Robert raises is a recognised one in post-coordinate 
searching, where "false drops" (invalid hits) can be produced by the 
inappropriate linking of concepts that belong to distinct subject 
strings.

This can only be overcome by searching for the complete strings:

Articulation Disorders : etiology
Articulation Disorders : therapy
Down Syndrome : complications
and so on.
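
To make the difference concrete, here is a minimal Python sketch using 
Robert's example document. The data structure and the function names are 
my own assumptions for illustration; they are not part of MeSH or SKOS.

```python
# Hypothetical index record for one document (Robert's example).
# Each entry is one pre-coordinated string: (descriptor, qualifier).
doc_strings = {
    ("Articulation Disorders", "etiology"),
    ("Articulation Disorders", "therapy"),
    ("Down Syndrome", "complications"),
    ("Leukemia, Lymphocytic, Acute", "complications"),
    ("Leukemia, Lymphocytic, Acute", "drug therapy"),
    ("Leukemia, Lymphocytic, Acute", "pathology"),
}


def post_coordinate_match(strings, descriptor, qualifier):
    """Treat descriptors and qualifiers as independent terms (Boolean AND)."""
    descriptors = {d for d, _ in strings}
    qualifiers = {q for _, q in strings}
    return descriptor in descriptors and qualifier in qualifiers


def string_match(strings, descriptor, qualifier):
    """Match only when both concepts occur in the same string."""
    return (descriptor, qualifier) in strings


# The post-coordinate search retrieves the document (a false drop),
# because "complications" was assigned to other descriptors.
false_drop = post_coordinate_match(
    doc_strings, "Articulation Disorders", "complications")

# Searching for the complete string correctly rejects it.
exact = string_match(doc_strings, "Articulation Disorders", "complications")
```

Under these assumptions the first search matches the document and the 
second does not, which is the behaviour Robert is asking for.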

The underlying issue is whether SKOS is to attempt to provide for 
pre-coordinated strings of concepts like these, as found in 
classification schemes and systems of alphabetical subject headings like 
MeSH and LCSH. As I understand it, SKOS is at present constructed to 
cater only for thesauri, where each concept is independent and linked 
only by "paradigmatic" relationships (which apply irrespective of 
context).

The relationships used to create pre-coordinated strings are 
"syntagmatic", i.e. they exist only because the two concepts occur 
together in a document being indexed. There is no _inherent_ 
relationship between drug therapy and leukemia, but a document may well 
deal with these two concepts together.

I said that if SKOS is to handle such strings, there needs to be 
somewhere to store the rules of syntax which control the way they are 
put together. Alistair's suggestion of giving each concept the 
sub-property mesh:allowedQualifier is one way of achieving this, but 
only for strings of two terms where the order is specified.
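
As a rough sketch of what such a rule store amounts to, the following 
Python fragment models an allowed-qualifier property as a simple lookup 
table. The table name and function are hypothetical, and the entries are 
taken from the example strings in this thread purely for illustration; 
they are not the actual MeSH rules.

```python
# Hypothetical rule table: descriptor -> qualifiers allowed to follow it.
# Illustrative entries only, NOT the real MeSH allowable qualifiers.
ALLOWED_QUALIFIERS = {
    "Articulation Disorders": {"etiology", "therapy"},
    "Down Syndrome": {"complications"},
    "Leukemia, Lymphocytic, Acute": {
        "complications", "drug therapy", "pathology"},
}


def is_valid_string(descriptor, qualifier):
    """Check a two-term string with the descriptor cited first."""
    return qualifier in ALLOWED_QUALIFIERS.get(descriptor, set())


ok = is_valid_string("Down Syndrome", "complications")
not_ok = is_valid_string("Down Syndrome", "therapy")
```

The limitation shows in the function signature: it can express only 
"descriptor followed by one qualifier", so longer strings or a different 
citation order would need a richer set of rules.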

A complication is that many terms which exist as MeSH qualifiers also 
exist as MeSH descriptors (e.g. economics, education, drug therapy, ... 
). Some concepts (e.g. adverse effects) exist only as qualifiers.

I think that it is undesirable to have the same concept occurring twice 
in a scheme, once as a descriptor and once as a qualifier. Do we 
therefore have to have an additional property attached to those which 
are only to be used as qualifiers, specifying something like "do not use 
as the first-cited, or only, term in a string"?
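
A minimal Python sketch of that idea follows, assuming a single record 
per concept with a flag for qualifier-only use. The class, the flag name 
and the validation function are hypothetical, not existing SKOS or MeSH 
properties.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    label: str
    # Hypothetical property: "do not use as the first-cited,
    # or only, term in a string".
    qualifier_only: bool = False


# One record per concept, whether or not it can also act as a qualifier.
DRUG_THERAPY = Concept("drug therapy")
ADVERSE_EFFECTS = Concept("adverse effects", qualifier_only=True)


def is_valid_string(terms):
    """A string is valid if it is non-empty and its first-cited
    term is not restricted to qualifier use."""
    return len(terms) > 0 and not terms[0].qualifier_only


alone_ok = is_valid_string([DRUG_THERAPY])
alone_bad = is_valid_string([ADVERSE_EFFECTS])
second_ok = is_valid_string([DRUG_THERAPY, ADVERSE_EFFECTS])
```

This keeps each concept in the scheme once only, and moves the 
descriptor/qualifier distinction into the rules for building strings.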

Sorry if this seems to be introducing unnecessary complications, but I 
am trying to generalise so that SKOS will be able to cope with any kind 
of pre-coordinated scheme rather than just adopting an ad-hoc solution 
to meet the specific needs of a single scheme such as MeSH.

Leonard Will

-- 
Willpower Information       (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants              Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
L.Will@Willpowerinfo.co.uk               Sheena.Will@Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------

Received on Monday, 11 July 2005 14:10:27 UTC