Supporting arrays of concepts : node labels

In message
<B56ABE145BEB0C40A265238FCAA420DF01DC8E58@oa2-server.oa.oclc.org> on
Wed, 5 May 2004, "Houghton,Andrew" <houghtoa@oclc.org> wrote
>
>> From: Miles, AJ (Alistair) [mailto:A.J.Miles@rl.ac.uk]
>> Sent: Wednesday, May 05, 2004 2:21 PM
>> Subject: Supporting arrays of concepts
>>
>>
>> This is a strawman proposal for addition to the SKOS-Core schema:
>>
>> Some thesauri group concepts into ordered arrays, and label
>> the array, e.g.
>>
>>      People
>>                <people by age>
>>                Children (0-12 years)
>>                Teenagers (13-19 years)
>>                Adults (over 20 years)
>>
>> Since this sort of thing is common practise, and I believe  will be a part of
>>the new British standard for thesauri  (Leonard, Stella?), I thought we
>>ought to come up with a  mechanism for representing it as part of the
>>SKOS-Core vocab.

Yes, it is in the draft of the new standard. It would be good to have
software to handle it properly, as most existing packages are weak in
this area.

>> The problem is the best way to represent an ordered list in  RDF.  The
>>consensus so far seems to be for using RDF Lists  (collections).  The
>>other problem is how to connect an array  to the parent concept.  Such a
>>connection cannot replace the  skos:broader statements from the array
>>members, and must be  synchronised with them.

>This seems like what is called Node Labels and used by AAT and Dewey.
>Node Labels can be thought of as concepts that participate in the
>hierarchy structure but cannot be assigned as concepts.  In Dewey, for
>example, it has the notion of centered entries.  If you look at the printed
>edition these have a > (greater than sign) preceeding the class span.  You
>cannot assign them as a class number but they are present for the
>purposes of grouping the hierarchy, as in your example.  Node Labels
>have all the same relationships as concepts do, so many times they are
>represented as concepts.

Unfortunately, the expression "node label" has been used to mean
different things in different places. They should not be thought of as
concepts, because they do not represent concepts, and they do not have
scope notes or  any of the normal thesaurus relationships. When using
software that does not make proper provision for node labels it is
sometime necessary to treat them as concepts and give them BT/NT
relationships in order to display them in the proper place in a
hierarchy, but this is a fudge.

The AAT uses the expression "guide terms" rather than node labels, and
includes in this not only real node labels (as described at 1 below) but
also terms which represent real concepts but which it thinks are
inappropriate for use as indexing terms. This is confusing and
misleading; I think that any term used to describe a concept should be
potentially usable in indexing, though it can have the note "use a more
specific concept if possible".

In the draft British Standard we propose that there should be two kinds
of node label:

1. A node label showing a "characteristic of division". This is the kind
shown in the example above, and each label contains the word "by"
followed by the characteristic by which the elements of the following
array are distinguished. There may be several arrays under any term,
each introduced by a separate node label, e.g.

      people
          <people by age>
              children (0-12 years)
              teenagers (13-19 years)
              adults (over 20 years)

          <people by occupation>
          builders
          bus drivers
          information technologists
          information scientists
          librarians

          <people by sex>
          male people
              men
              boys
          female people
              women
              girls

and so on.

2. In a display of a classification, rather than a thesaurus, node
labels are used to show where a change of facet occurs, especially when
terms from different facets are being combined. They make it clear that
the relationship between the terms preceding and following the node
label is not BT/NT, but that the following classes are a compound of the
subsequent concepts with the preceding concept. In the following
example, the node labels containing the names of facets are given in
parentheses:

(organisms)
mammals [in general]
      carnivores [in general]
          leopards
          lions
          tigers
      herbivores [in general]
          cattle
          sheep
(processes)
      physiological processes [in general]
          digestion [in general]
              (organisms)
              [digestion in] carnivores
                  [digestion in] lions
              [digestion in] herbivores
                  [digestion in] cattle
                  [digestion in] sheep
          respiration [in general]
              (organisms)
              [respiration in] lions

The words in square brackets in this example are often omitted in
classification schedules, being implied by the indentation or
typography.

I take it that at the moment you are just addressing the issue of node
labels of type 1.

>SKOS currently doesn't take Node Label's into account with it's prefLabel
>and altLabel elements.  It is possible that a Node Label could have many
>different altLabel's.  I don't think that you need to add additional structure
>to represent Node Labels.  Perhaps, what is needed is to say that a
>concept must have either a group of prefLabel elements (xml:lang'ed) or a
>group of nodeLabel elements (xml:lang'ed) and can have any number of
>altLabel elements.  Since Node Labels will also have BT, NT, RT
>relationships, you will not need to duplicate that structure by reusing
>skos:Concept.

How this is implemented technically I'll leave to someone else, but I
think you have to be careful and not accept this paragraph literally (at
least if you accept our definition of node labels). As a node label is
not a label for a concept, there is no underlying concept to which
altLabels can be applied.

Node labels do not have BT, NT or RT relationships, except in the fudged
case I described above to make use of software without the required
functionality. The BT/NT relationship in effect "jumps over" the node
label, so that in the first example above the relationship is

people
NT    children
      teenagers
      adults
      builders

etc.,

and _not_

people
NT    <people by age>

etc.

Leonard
-- 
Willpower Information       (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants              Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
L.Will@Willpowerinfo.co.uk               Sheena.Will@Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------

Received on Thursday, 6 May 2004 10:47:17 UTC