SKOS comment: Last Call Working Draft from Panzer,Michael on 2008-10-03 (public-swd-wg@w3.org from October 2008)

From: Panzer,Michael <panzerm@oclc.org>
Date: Fri, 3 Oct 2008 14:10:52 -0400
To: <public-swd-wg@w3.org>
Message-ID: <AA3DCFAA4E87BD40BBAA507B1C36CC3DEA8C61@OAEXCH4SERVER.oa.oclc.org>

Dear SKOS working group,

Following are comments on the Last Call Working Draft of the SKOS
Reference with special emphasis on its ability to model classification
systems. Although examples are mainly drawn from the Dewey Decimal
Classification, we believe these issues to be valid for most large-scale
bibliographic classification systems, as they were echoed by a study of
using SKOS for a large integrated vocabulary, Chinese Classified
Thesaurus (CCT). As some problems might have already been stated in the
past, we apologize in advance for any duplication. More examples can be
provided if needed.


1. Non-assignable concepts
--------------------------

Classification systems usually contain objects that are not assignable
concepts but are nonetheless an integral part of the system (not just a
display/presentation device), e.g., number spans or, in the case of the
DDC, so-called "centered entries":

T2-486-T2-488 Divisions of Sweden
333.7-333.9 Natural resources and energy  

A centered entry represents a subject covered by a span of numbers.
Centered entries relate notationally coordinate classes together as a
single concept. For example, T2-485 represents Sweden; the centered
entry T2-486-T2-488 represents the geographic divisions of Sweden.
Centered entries are an important part of the structural hierarchy,
representing true broader concepts, even though this superordination is
not indicated by notation. In addition, as they often contain
instructions applicable to all subordinate classes, centered entries
cannot be modeled as a skos:Collection, since skos:Collection cannot be
part of the concept hierarchy (as defined in 9.6.4). A new class or
expanded skos:Collection class is required to allow concept collections
like spans or centered entries to be expressed as concepts.
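
To illustrate why a centered entry behaves like a broader concept rather
than a mere grouping, the following Python sketch looks up the span that
structurally contains a given notation. The helper names `in_span` and
`structural_broader` are hypothetical, and comparing notations as plain
strings is a simplifying assumption that does not cover real DDC notation
(table numbers such as T2-486 are not handled).

```python
def in_span(notation, first, last):
    """True if `notation` falls inside the span first..last.

    Assumption: plain decimal notations compared as strings; anything that
    starts with `last` is treated as subordinate to it and thus inside.
    """
    return first <= notation and (notation <= last or notation.startswith(last))


def structural_broader(notation, centered_entries):
    """Return the (first, last) span of the first centered entry
    containing `notation`, or None if no entry contains it."""
    for first, last in centered_entries:
        if in_span(notation, first, last):
            return (first, last)
    return None


# Illustrative data taken from the example above:
# 333.7-333.9 Natural resources and energy
centered_entries = [("333.7", "333.9")]

for n in ["333.7", "333.8", "333.6", "334"]:
    print(n, structural_broader(n, centered_entries))
```

The point of the sketch is only that the span acts as a node in the
hierarchy: classes inside it have the span as a structural parent, which a
skos:Collection cannot express.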

2. Index terms
--------------

An important part of many classification systems is an index; in the
case of the DDC, this is its "Relative Index". Index terms associated with a
given class generally reflect several of the topics falling within the
scope of that class. There is no easy way of modeling this relationship
in SKOS:

Class/Concept:
616 Diseases

Index terms:
   Clinical medicine
   Diseases--humans--medicine
   Illness--medicine
   Internal medicine
   Physical illness--medicine
   Sickness--medicine

Currently, a possible workaround is to construct the complete Relative
Index as a separate skos:ConceptScheme and relate the concepts in these
two independent schemes by using mapping relations:

skosclass:hasIndexTerm rdfs:subPropertyOf skos:closeMatch .

skosclass:isIndexTermOf rdfs:subPropertyOf skos:closeMatch ;
  owl:inverseOf skosclass:hasIndexTerm .

<class/616> a skos:Concept ;
  skosclass:hasIndexTerm <index/Clinical%20medicine> ;
  skos:inScheme <classification> .

<index/Clinical%20medicine> a skos:Concept ;
  skosclass:isIndexTermOf <class/616> ;
  skos:inScheme <index> .

This seems to be a satisfactory best-practice solution in this case, but
it has broader implications as index terms are just one instance of:  

3. Class-Topic relationships
----------------------------

This issue seems to cause problems for using SKOS as a general tool to
model classification systems, since the fundamental entity in a
classification system is not the concept but the class, or, more
precisely, the distinction between classes and their subjects. Numerous
problems arise from the difficulty of expressing in SKOS the interplay
between a class and the subjects that form that class on the basis of at
least one common characteristic.

The inability to model relationships other than concept-concept ones in
SKOS sometimes leads to inconsistencies, as subjects/topics are
frequently in the domain or range of common classification
relationships.

In the DDC, this can manifest itself in classes being connected by both
hierarchical and non-hierarchical relationships if modeled with current
SKOS:

<A> skos:narrower <B> .
<B> skos:related <A> .

This arises because what is expressed here isn't really a relationship
between classes, but between topics and classes:

<A> ddc:narrower <B> .
<Topic_in_B> ddc:related <A> .

This pattern can also lead to circular hierarchical relationships:

<A> ddc:narrower <Topic_in_B> .
<B> ddc:narrower <Topic_in_A> .

At the moment in SKOS, this has to be coded at class level:

<A> skos:narrower <B> .
<B> skos:narrower <A> .

which produces inconsistencies. A possible solution would be to
introduce/define ddc:related (or similar relationships) as a new element
without extending SKOS semantic relationships, even if this would mean
lowering the utility of classification systems in SKOS applications.
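
To make the problem concrete, here is a small Python sketch of what
happens when topic-level statements are collapsed to class level. The
data, the `topic_of` mapping and the helper names are hypothetical; the
check only flags pairs of classes that end up narrower in both
directions, or both hierarchically and associatively related.

```python
def collapse(statements, topic_of):
    """Rewrite every topic to the class that contains it (class-level coding)."""
    return {(topic_of.get(s, s), p, topic_of.get(o, o)) for s, p, o in statements}


def clashes(statements):
    """Return sorted class pairs whose statements conflict.

    A pair conflicts if it is narrower in both directions, or if it is both
    narrower (either direction) and related (either direction).
    """
    narrower = {(s, o) for s, p, o in statements if p == "narrower"}
    related = {frozenset((s, o)) for s, p, o in statements if p == "related"}
    found = set()
    for s, o in narrower:
        if (o, s) in narrower or frozenset((s, o)) in related:
            found.add(tuple(sorted((s, o))))
    return sorted(found)


# Hypothetical topic-to-class membership.
topic_of = {"Topic_in_A": "A", "Topic_in_B": "B"}

# The two patterns described above, stated at topic level.
mixed = {("A", "narrower", "B"), ("Topic_in_B", "related", "A")}
circular = {("A", "narrower", "Topic_in_B"), ("B", "narrower", "Topic_in_A")}

for name, data in [("mixed", mixed), ("circular", circular)]:
    print(name, "topic level:", clashes(data))
    print(name, "class level:", clashes(collapse(data, topic_of)))
```

At topic level neither pattern is flagged; once topics are replaced by
their classes, A and B conflict in both cases.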

4. skos:notation and skos:prefLabel are overlapping
---------------------------------------------------

There are two issues here:

1. Most notation in classification schemes is preferred (i.e., standard)
notation. Should both skos:notation and skos:prefLabel be used in all
these cases?

2. On some occasions an alternative (i.e., optional) notation is given
for a concept. For example, in the CCT:
        [Q89] environmental biology
            Preferred class: X17

Regardless of whether it is preferred or alternative, the notation always
represents a unique concept and therefore has semantic relationships.
Hence, an alternative notation is not a non-preferred thesaurus label,
which has only lexical relationships.

5. Order in Classification Systems
----------------------------------

Order in a classification is important, indeed critical. Order is
evident in the juxtaposition of classes, the sequence of main classes,
and the sequence of co-ordinates in a class. Broader and narrower
relationships alone cannot represent order, so a parallel encoding may
be necessary to ensure that the system a classification scheme tries to
present is reflected when using SKOS.

To some degree, when order is connected to hierarchy, this can be
reflected by extensions to SKOS. The DDC for example has two parallel
hierarchies, one expressed by length of notation, the other by structure
(notes, etc.). This is currently handled by extending skos:narrower:

skosclass:narrowerStructural rdfs:subPropertyOf skos:narrower .

skosclass:broaderStructural rdfs:subPropertyOf skos:broader ;
  owl:inverseOf skosclass:narrowerStructural .
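
One consequence of this extension can be sketched in Python: an
application that only understands skos:narrower and skos:broader can
still see the structural hierarchy once the sub-property and inverse
declarations are applied. The `entail` helper, the relative IRIs, and the
choice of 333.7 as a class under the centered entry 333.7-333.9 are
illustrative assumptions; only these two declarations are implemented,
not full RDFS or OWL reasoning.

```python
# Declarations mirroring the Turtle statements above.
SUB_PROPERTY_OF = {
    "skosclass:narrowerStructural": "skos:narrower",
    "skosclass:broaderStructural": "skos:broader",
}
INVERSE_OF = {
    "skosclass:narrowerStructural": "skosclass:broaderStructural",
    "skosclass:broaderStructural": "skosclass:narrowerStructural",
}


def entail(triples):
    """Add inverse statements, then the super-property statements they imply."""
    result = set(triples)
    result |= {(o, INVERSE_OF[p], s) for s, p, o in triples if p in INVERSE_OF}
    result |= {(s, SUB_PROPERTY_OF[p], o) for s, p, o in result if p in SUB_PROPERTY_OF}
    return result


# Hypothetical asserted statement: the centered entry is structurally broader.
asserted = {("<class/333.7-333.9>", "skosclass:narrowerStructural", "<class/333.7>")}

for triple in sorted(entail(asserted)):
    print(*triple, ".")
```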

6. Mappings
-----------

The problem of restricting SKOS to one-to-one mappings has already been
raised as ISSUE-131. We share the concerns expressed there.

We also see potential problems in deriving the mapping relations
skos:broadMatch and skos:narrowMatch from skos:broader and
skos:narrower. In the ISO standards and in current practice, many
multilingual thesauri have not used broader or narrower to indicate
mapping relations. SKOS should revisit those standards and follow their
ongoing development to make sure it is consistent in representing the
indicators that the standards (and the thesauri following them) have
used for so many years.

In addition, when mapping systems that are structurally heterogeneous
(e.g., classification systems and thesauri), the links established
through mappings have no hierarchical implications at all.

Currently, skos:broader is used for the hierarchical relationship
between classes as well as between concepts. Mapping relations that are
subproperties of skos:broader/skos:narrower cannot adequately support
interoperability between structurally heterogeneous systems.

In addition, many different indicators of degree of mapping have been
used in integrated vocabularies, e.g., major mapping, minor mapping,
alternative mapping, and overlapping.  These may make the mapping
properties even more complicated. The solution here might again be to
extend mapping properties.

Best wishes,
Michael

--------------
Michael Panzer
Global Product Manager, Taxonomy Services
OCLC Online Computer Library Center, Inc.
Received on Friday, 3 October 2008 18:11:35 UTC