RE: [SKOS] Possible issue: Uniqueness of skos:prefLabel [was Re: [SKOS] inconsistency between Guide and Specification

There are in fact two completely separate issues here, which may be
resolved independently.

I address these two issues below under separate headings.

 - Issue 1. Semantics of SKOS lexical label properties

There are several semantic conditions on the properties skos:prefLabel,
skos:altLabel and skos:hiddenLabel, which are implicit in the current
specifications [3,4].

The first of these is that, intuitively, a concept can have only one
"preferred" label (per language per script). In other words, it does not
make sense for two labels of the same concept to both be "preferred".

The second of these is that, intuitively, it does not make sense for a
label to be both "preferred" and "alternative"; or both "preferred" and
"hidden"; or both "alternative" and "hidden".

The third of these is that, intuitively, if something is "alternative",
there must also be something that is "preferred".

These three conditions cannot be expressed using either RDFS or OWL.

They can, however, be expressed formally. In the wiki document at [5] I
follow the style used in the RDF Semantics to define the semantics of
SKOS as a *vocabulary extension* of RDFS. Conditions 1, 2 and 3 given in
[5] formally state the first, second and third conditions presented
informally above.

Note also that detecting the inconsistencies which may arise due to
these semantic conditions may be done using *existing RDF technology*
and without the necessity for a reasoning engine. SPARQL queries may be
used to find graph patterns which must represent a contradiction
according to these constraints.
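
As a rough illustration of the kind of test meant here, the following is
a minimal sketch in plain Python rather than SPARQL. It assumes triples
are held in memory as (concept, property, (lexical form, language tag))
tuples; that representation, the example data and the function name
find_label_conflicts are hypothetical and not taken from [3], [4] or [5].
It flags concepts with more than one preferred label per language
(condition 1) and labels attached to the same concept through more than
one labelling property (condition 2).

```python
from collections import defaultdict

PREF, ALT, HIDDEN = "skos:prefLabel", "skos:altLabel", "skos:hiddenLabel"
LABEL_PROPS = {PREF, ALT, HIDDEN}


def find_label_conflicts(triples):
    """Return (multiple_pref, overlapping) for conditions 1 and 2.

    multiple_pref: (concept, lang) pairs with more than one prefLabel.
    overlapping:   (concept, text, lang) labels used with more than one
                   of prefLabel / altLabel / hiddenLabel.
    """
    pref_by_lang = defaultdict(set)    # (concept, lang) -> preferred forms
    props_by_label = defaultdict(set)  # (concept, text, lang) -> properties
    for concept, prop, obj in triples:
        if prop not in LABEL_PROPS:
            continue  # ignore non-label triples
        text, lang = obj
        lang = lang.lower()  # language tags compare case-insensitively
        props_by_label[(concept, text, lang)].add(prop)
        if prop == PREF:
            pref_by_lang[(concept, lang)].add(text)
    multiple_pref = sorted(k for k, v in pref_by_lang.items() if len(v) > 1)
    overlapping = sorted(k for k, v in props_by_label.items() if len(v) > 1)
    return multiple_pref, overlapping


# Hypothetical example data.
triples = [
    ("ex:a", PREF, ("love", "en")),
    ("ex:a", PREF, ("affection", "en")),  # second English prefLabel
    ("ex:a", PREF, ("amour", "fr")),
    ("ex:b", PREF, ("rock", "en")),
    ("ex:b", ALT, ("rock", "en")),        # same label, preferred and alternative
    ("ex:b", ALT, ("stone", "en")),
]

multiple_pref, overlapping = find_label_conflicts(triples)
print(multiple_pref)
print(overlapping)
```

A SPARQL query would express the same graph patterns directly; the point
of the sketch is only that each check is a simple grouping over label
triples and needs no reasoning engine.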

These three conditions, therefore, are potentially useful, in that they
can be easily implemented as tests which will detect nonsensical data.

When it comes to agreeing on the formal semantics of SKOS, there is a
balance that needs to be struck between too much constraint, and not
enough. Too much constraint does not allow flexibility, which in turn
can stifle new applications. However, some constraint is useful, because
it can be used to improve the quality of data and hence the consistent
behaviour of applications. Supporting quality control is an extremely
important consideration that should not be neglected.

In this case I would like to argue that the first two conditions given
above (conditions 1 and 2 in [5]) are sensible, useful, and easily
implemented, and therefore should be part of the semantics of SKOS. 

Condition 3 is potentially useful as a constraint optionally interpreted
under a closed world assumption (i.e. "hey, you've missed a prefLabel
here!"), but the open-world inference rule that follows from the
condition would have no practical value (i.e. "there's an altLabel here,
so there must also be something that's a prefLabel, so I'll add a new
blank node..."). So, based on the utility under a closed-world
assumption, I tentatively argue for its inclusion in the semantics of
SKOS. Note that this too can be implemented as a SPARQL query, which
could for example generate a "warning" message to inform a data
publisher of a potential omission.
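
A minimal sketch of such a closed-world warning, written in plain Python
rather than SPARQL, might look as follows. The triple representation
(concept, property, (lexical form, language tag)), the function name
missing_pref_warnings and the example data are all hypothetical. The
sketch also assumes that condition 3 is checked per language, which is
one possible reading and not something settled above.

```python
PREF, ALT = "skos:prefLabel", "skos:altLabel"


def missing_pref_warnings(triples):
    """Warn about concepts with an altLabel but no prefLabel.

    Assumption: the check is made per language tag. Nothing is inferred
    or added to the data; the function only reports possible omissions.
    """
    has_pref = set()  # (concept, lang) pairs with a prefLabel
    has_alt = set()   # (concept, lang) pairs with an altLabel
    for concept, prop, obj in triples:
        if prop == PREF:
            has_pref.add((concept, obj[1].lower()))
        elif prop == ALT:
            has_alt.add((concept, obj[1].lower()))
    return [
        f"warning: {concept} has an altLabel but no prefLabel in '{lang}'"
        for concept, lang in sorted(has_alt - has_pref)
    ]


# Hypothetical example data.
triples = [
    ("ex:a", PREF, ("rock", "en")),
    ("ex:a", ALT, ("stone", "en")),
    ("ex:b", ALT, ("pierre", "fr")),  # no French prefLabel for ex:b
]

warnings = missing_pref_warnings(triples)
for message in warnings:
    print(message)
```

Because the result is only a list of messages, a publisher is free to
treat it as advisory, which matches the optional, closed-world use
argued for here.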

 - Issue 2. Labelling & thesaurus compatibility

This issue involves the use of the SKOS lexical labeling properties
skos:prefLabel, skos:altLabel and skos:hiddenLabel, when used in
conjunction with the SKOS concept scheme constructs (skos:inScheme and
skos:ConceptScheme).

As Guus points out:

> > in the Core Guide, section on Multilingual Labelling [1]
> >
> > [[
> >   It is recommended that no two concepts in the same concept scheme
> >   be given the same preferred lexical label in any given language.
> > ]]
> >
> > in the Core Specification, table of prefLabel [2]
> >
> > [[
> >   No two concepts in the same concept scheme may have the same value
> >   for skos:prefLabel in a given language.
> > ]]

In some types of controlled vocabulary, it may be entirely reasonable
for two concepts in the same scheme to have the same preferred lexical
label (in some language/script). This is the case, for example, in some
classification schemes (where two "classes" may have the same "caption")
and corporate taxonomies (where two "nodes" may have the same "label"),
in which case either the notation or the context is used to disambiguate
meaning.

For this reason, I agree with Guus that these two sentences should be dropped
from all future SKOS specifications, and that no formal conditions
should be placed on the use of the SKOS lexical labeling properties in
conjunction with the SKOS concept scheme constructs (currently
skos:inScheme and skos:ConceptScheme).

However, for a SKOS concept scheme to be *usable* as a thesaurus (i.e.
compatible with software following the ISO 2788 standard) some
restrictions must be observed on the use of these properties in
conjunction.

If, within a given concept scheme, a given lexical label is used with
more than one concept *in any way*, this will render the concept scheme
incompatible with thesaurus software. So, for example, if "foo"@en is
used as the preferred lexical label for two different concepts in the same
scheme; or if "foo"@en is used as a preferred label for one concept and
an alternative label for another; or if "foo"@en is used as an
alternative label for one concept and a hidden label for another, or ...
you get the idea.
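
To make the "in any way" rule concrete, here is a minimal sketch in plain
Python (a SPARQL query could express the same pattern). It assumes label
triples of the form (concept, property, (lexical form, language tag)) and
scheme triples of the form (concept, "skos:inScheme", scheme); that
representation, the function name find_shared_labels and the example data
are hypothetical.

```python
from collections import defaultdict

LABEL_PROPS = {"skos:prefLabel", "skos:altLabel", "skos:hiddenLabel"}
IN_SCHEME = "skos:inScheme"


def find_shared_labels(triples):
    """Map (scheme, text, lang) to the concepts sharing that label.

    Only labels used with more than one concept in the same scheme are
    returned, regardless of which labelling property attaches them.
    """
    schemes_of = defaultdict(set)  # concept -> schemes it belongs to
    labels_of = defaultdict(set)   # concept -> {(text, lang)}
    for subject, prop, obj in triples:
        if prop == IN_SCHEME:
            schemes_of[subject].add(obj)
        elif prop in LABEL_PROPS:
            text, lang = obj
            labels_of[subject].add((text, lang.lower()))

    users = defaultdict(set)       # (scheme, text, lang) -> concepts
    for concept, labels in labels_of.items():
        for scheme in schemes_of.get(concept, ()):
            for text, lang in labels:
                users[(scheme, text, lang)].add(concept)
    return {key: sorted(cs) for key, cs in users.items() if len(cs) > 1}


# Hypothetical example data.
triples = [
    ("ex:a", IN_SCHEME, "ex:s"),
    ("ex:b", IN_SCHEME, "ex:s"),
    ("ex:c", IN_SCHEME, "ex:t"),
    ("ex:a", "skos:prefLabel", ("foo", "en")),
    ("ex:b", "skos:altLabel", ("foo", "en")),     # clashes with ex:a in ex:s
    ("ex:c", "skos:hiddenLabel", ("foo", "en")),  # different scheme, no clash
    ("ex:b", "skos:prefLabel", ("bar", "en")),
]

shared = find_shared_labels(triples)
print(shared)
```

A non-empty result would indicate that the scheme is not usable as a
thesaurus in the sense described here, without saying anything about
whether the data is valid SKOS.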

Because of the importance of being able to identify compatibility with
existing thesaurus software and standards, I would like to argue that we
specify, informally, a set of restrictions which may be *optionally*
applied in order to detect thesaurus incompatibility. I.e. these
restrictions *would not* be part of the formal semantics of SKOS. 

These restrictions can be stated informally, with examples of SPARQL
queries that could be used to detect incompatibilities. Expressing these
restrictions formally is complicated and unnecessary. 

... I hope that helps to clarify :)

Cheers,

Alistair.

[3] http://www.w3.org/TR/2005/WD-swbp-skos-core-guide-20051102/
[4] http://www.w3.org/TR/2005/WD-swbp-skos-core-spec-20051102/
[5] http://www.w3.org/2006/07/SWD/wiki/SkosDesign/RdfsSemanticExtension

> -----Original Message-----
> From: public-swd-wg-request@w3.org [mailto:public-swd-wg-request@w3.org]
> On Behalf Of Guus Schreiber
> Sent: 27 February 2007 12:01
> To: public-swd-wg@w3.org
> Cc: SWD WG
> Subject: [SKOS] Possible issue: Uniqueness of skos:prefLabel [was Re:
> [SKOS] inconsistency between Guide and Specification
> 
> 
> 
> 
> Guus Schreiber wrote:
> >
> > While trying to write down a resolution for the relationship between
> > labels I found:
> >
> > in the Core Guide, section on Multilingual Labelling [1]
> >
> > [[
> >   It is recommended that no two concepts in the same concept scheme
> >   be given the same preferred lexical label in any given language.
> > ]]
> >
> > in the Core Specification, table of prefLabel [2]
> >
> > [[
> >   No two concepts in the same concept scheme may have the same value
> >   for skos:prefLabel in a given language.
> > ]]
> 
> I see no need for placing a constraint on the
> uniqueness of skos:prefLabel. While some/many
> vocabularies will actually abide by this, the URI
> of the concept to which the label is related already
> ensures uniqueness of the concept being identified
> (which I assume was the reason for including this
> constraint in the ISO spec). I also suggest that
> there is no need to place cardinality constraints
> on skos:prefLabel.
> 
> The underlying rationale is that we should refrain
> from overcommitting the SKOS specification when
> there is no clear need.
> 
> I want to raise this as an issue and propose the
> above as a resolution.
> 
> >
> > The weaker constraint in the Guide makes sense to me. I will most
> > likely propose an even weaker version in my resolution.
> >
> > Guus
> >
> >
> >
> > [1] http://www.w3.org/TR/2005/WD-swbp-skos-core-guide-20051102/#secmulti
> > [2] http://www.w3.org/TR/swbp-skos-core-spec/#prefLabel
> 
> --
> Vrije Universiteit Amsterdam, Computer Science
> De Boelelaan 1081a, 1081 HV Amsterdam, The Netherlands
> T: +31 20 598 7739/7718; F: +31 84 712 1446
> Home page: http://www.cs.vu.nl/~guus/

Received on Thursday, 1 March 2007 18:17:09 UTC