Re: [PORT] Semantics of SKOS labelling properties

Hi all,

I've raised an item on the proposals list as a placeholder for this issue:

http://www.w3.org/2004/02/skos/core/proposals#labelSemantics-10

Cheers,

Al.

Alistair Miles wrote:
> 
> Hi all,
> 
> The question of the semantics of the properties skos:prefLabel,
> skos:altLabel, skos:prefSymbol and skos:altSymbol has been raised in a
> number of contexts recently. There are some open issues here, and I
> would like to offer some discussion as a basis for raising relevant
> issues on the SKOS Core proposals and issues list. This discussion
> references the SKOS Core Integrity Testing and Quality Assurance draft
> [1], in addition to other resources.
> 
> 
> 1. Cardinality of skos:prefLabel
> 
> The skos:prefLabel property is intended to be used to provide a
> *preferred lexical label* for a resource of any type. Obviously it
> doesn't make sense for more than one label to be 'preferred', so there
> is an implicit constraint on the skos:prefLabel property. This
> constraint is currently expressed in [2] by:
> 
>  (i) 'A concept should have no more than one preferred lexical label per 
> language.'
> 
> Because in fact the domain of skos:prefLabel is unconstrained, this 
> should rather be:
> 
>  (ii) 'For any given natural language, a resource cannot have more than 
> one preferred lexical label.'
> 
> Informally speaking, this is a kind of qualified cardinality
> constraint. (The qualification is introduced because obviously we want
> to allow a resource to have a preferred label in each of multiple
> languages.)
> 
> I consider (ii) to be a fundamental part of the semantics of the
> skos:prefLabel property. It is not possible to express this formally
> using RDF or OWL. It is, however, possible to express (ii) in a
> semi-formal way using SPARQL. The SPARQL pattern that, if matched,
> represents a violation of this constraint is given by:
> 
> {
>   ?x skos:prefLabel ?l; skos:prefLabel ?m.
>   FILTER ( str(?l) != str(?m) && lang(?l) = lang(?m) )
> }
> 
> This pattern is used in test B.1. in [1].
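
As a rough illustration of what the pattern above checks (this is not part of [1]), the same test can be sketched in plain Python. The tuple layout and the function name are assumptions made for this example only; language tags are compared exactly, as in the FILTER.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for skos:prefLabel statements:
# (resource, label text, language tag).
pref_labels = [
    ("ex:A", "orange", "en"),
    ("ex:A", "orange", "fr"),   # same text, different language: allowed
    ("ex:B", "fruit", "en"),
    ("ex:B", "fruits", "en"),   # second English preferred label: violation
]

def pref_label_cardinality_violations(labels):
    """Return {(resource, lang): sorted texts} for every resource that has
    more than one distinct preferred label in the same language."""
    seen = defaultdict(set)
    for resource, text, lang in labels:
        seen[(resource, lang)].add(text)
    return {key: sorted(texts) for key, texts in seen.items() if len(texts) > 1}

violations = pref_label_cardinality_violations(pref_labels)
print(violations)  # {('ex:B', 'en'): ['fruit', 'fruits']}
```

An empty result corresponds to the SPARQL pattern finding no match.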
> 
> 
> 2. Disjointness of skos:prefLabel and skos:altLabel
> 
> The skos:altLabel property is intended to be used to provide an
> 'alternative lexical label' for a resource of any type. Obviously it
> doesn't make sense for the same label to be both 'preferred' and
> 'alternative', so there is an implicit constraint on the combined usage
> of the skos:prefLabel and skos:altLabel properties. This is *not*
> currently expressed in [2]. This could be expressed in prose as:
> 
>  (iii) 'For any given natural language, a resource cannot have an 
> alternative lexical label that is also the preferred lexical label.'
> 
> Informally speaking, this is a kind of qualified disjointness between
> the skos:prefLabel and skos:altLabel properties. I consider (iii) to be
> a fundamental part of the semantics of the skos:prefLabel and
> skos:altLabel properties. It is not possible to express this formally
> using RDF or OWL. It is, however, possible to express (iii) in a
> semi-formal way using SPARQL. The SPARQL pattern that, if matched,
> represents a violation of this constraint is given by:
> 
> {
>   ?x skos:prefLabel ?l.
>   ?x skos:altLabel ?m.
>   FILTER ( str(?l) = str(?m) && lang(?l) = lang(?m) )
> }
> 
> This pattern is used in test B.3. in [1].
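
A minimal Python sketch of the same disjointness check, assuming labels are held as hypothetical (resource, text, language) tuples rather than in an RDF store. A clash is any tuple that appears in both the preferred and the alternative set.

```python
def pref_alt_clashes(pref_labels, alt_labels):
    """Return the (resource, text, lang) entries that are given both as a
    preferred and as an alternative label of the same resource."""
    return sorted(set(pref_labels) & set(alt_labels))

# Hypothetical sample data for illustration only.
pref = [("ex:A", "orange", "en"), ("ex:B", "fruit", "en")]
alt = [
    ("ex:A", "orange", "en"),   # same text and language as the prefLabel: clash
    ("ex:A", "orange", "fr"),   # different language: no clash
    ("ex:B", "fruits", "en"),   # different text: no clash
]

clashes = pref_alt_clashes(pref, alt)
print(clashes)  # [('ex:A', 'orange', 'en')]
```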
> 
> 
> 3. Cardinality of skos:prefSymbol
> 
> The skos:prefSymbol property is intended to be used to provide a
> 'preferred symbolic label' for a resource of any type, where a 'symbol'
> is a 'network retrievable image'. As with the skos:prefLabel property,
> it obviously doesn't make sense for a resource to have more than one
> preferred symbolic label. This constraint is *not* currently expressed
> in [2].
> 
> There are a number of difficulties that arise when trying to express 
> this constraint either in prose or formally.
> 
> The first difficulty involves symbolic languages. It is possible to
> imagine a situation where a resource has been labelled with symbols
> from more than one symbolic language. In this case, the cardinality
> constraint must be qualified by the language of the symbol, in a way
> analogous to the cardinality constraint on the skos:prefLabel property.
> This suggests that an appropriate way to express this constraint in
> prose might be:
> 
>  (iv) 'For any given symbolic language, a resource should not have more
> than one preferred symbolic label.'
> 
> I consider (iv) to be a fundamental part of the semantics of the
> skos:prefSymbol property. SKOS does not, however, endorse any way of
> expressing the symbolic language to which a particular symbol belongs,
> and hence there is no clear way to express this either using OWL or
> using SPARQL.
> 
> A very pragmatic way to expose a possible violation of this constraint
> would be to search for a match to the following SPARQL pattern:
> 
> {
>   ?x skos:prefSymbol ?n; skos:prefSymbol ?o.
>   FILTER ( ?n != ?o )
> }
> 
> This pattern is used in test B.2. in [1]. Note however that this does
> not account for the possibility of symbols from different symbolic
> languages. Note also that the fact that the URIs of the symbols are
> different does not necessarily mean that they denote different objects,
> because RDF does not assume unique names. Therefore, even if we ignore
> languages, a match of this pattern does not necessarily indicate a
> violation of the cardinality constraint. Hence the output of test B.2.
> is only a 'Warning' and not an 'Error'.
> 
> 
> 4. Disjointness of skos:prefSymbol and skos:altSymbol
> 
> The skos:altSymbol property is intended to be used to provide an
> 'alternative symbolic label' for a resource of any type. Obviously it
> doesn't make sense for the same symbolic label to be both 'preferred'
> and 'alternative', so there is an implicit constraint on the combined
> usage of the skos:prefSymbol and skos:altSymbol properties. This
> constraint is analogous to the implicit constraint on the combined
> usage of skos:prefLabel and skos:altLabel. This is *not* currently
> expressed in [2]. This could be expressed in prose as:
> 
>  (v) 'For any given symbolic language, a resource cannot have an 
> alternative symbolic label that is also the preferred symbolic label.'
> 
> I consider (v) to be a fundamental part of the semantics of the 
> skos:prefSymbol and skos:altSymbol properties. However, as mentioned 
> above, SKOS does not endorse any way of expressing the symbolic language 
> to which a particular symbol belongs, and hence there is no clear way to 
> express this either using OWL or using SPARQL.
> 
> A very pragmatic way to expose a possible violation of this constraint
> would be to search for a match to the following SPARQL pattern:
> 
> {
>   ?x skos:prefSymbol ?n.
>   ?x skos:altSymbol ?n.
> }
> 
> This pattern is used in test B.4. in [1]. Note however that this does
> not account for the possibility of symbols from different symbolic
> languages.
> 
> 
> 5. Uniqueness of skos:prefLabel
> 
> In a traditional thesaurus, each 'preferred term' is used to denote a 
> distinct concept. This implies that, in a SKOS representation of a 
> thesaurus, it is a serious problem if two or more concepts in the same 
> concept scheme share the same preferred lexical label in any given 
> natural language. This is currently expressed in [2] as:
> 
>  (vi) 'It is recommended that no two concepts in the same concept scheme 
> be given the same preferred lexical label in any given language.'
> 
> To exemplify this problem, consider the following SKOS data:
> 
> ex:myThesaurus a skos:ConceptScheme.
> 
> ex:A a skos:Concept;
>   skos:prefLabel 'orange'@en;
>   skos:scopeNote 'The colour orange.'@en;
>   skos:inScheme ex:myThesaurus.
> 
> ex:B a skos:Concept;
>   skos:prefLabel 'orange'@en;
>   skos:scopeNote 'A citrus fruit.'@en;
>   skos:inScheme ex:myThesaurus.
> 
> ex:C a skos:Concept;
>   skos:prefLabel 'colour'@en;
>   skos:narrower ex:A;
>   skos:inScheme ex:myThesaurus.
> 
> ex:D a skos:Concept;
>   skos:prefLabel 'fruit'@en;
>   skos:narrower ex:B;
>   skos:inScheme ex:myThesaurus.
> 
> Now I use this SKOS data to generate a traditional thesaurus-like 
> representation of my thesaurus:
> 
> fruit
>   NT orange
> 
> colour
>   NT orange
> 
> orange
>   BT colour
>   SN The colour orange.
> 
> orange
>   BT fruit
>   SN A citrus fruit.
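
To make the generation step concrete, here is a small Python sketch that produces a similar display from the example data. The dictionary layout and the render function are assumptions for illustration; BT lines are derived by inverting skos:narrower, and entries come out in input order rather than in the order shown above.

```python
# Hypothetical in-memory form of the example data (English labels only).
concepts = {
    "ex:A": {"label": "orange", "note": "The colour orange.", "narrower": []},
    "ex:B": {"label": "orange", "note": "A citrus fruit.", "narrower": []},
    "ex:C": {"label": "colour", "note": None, "narrower": ["ex:A"]},
    "ex:D": {"label": "fruit", "note": None, "narrower": ["ex:B"]},
}

def render(concepts):
    """Render a thesaurus-style listing keyed by preferred label only."""
    # Derive broader links by inverting the narrower links.
    broader = {}
    for uri, concept in concepts.items():
        for child in concept["narrower"]:
            broader.setdefault(child, []).append(uri)

    lines = []
    for uri, concept in concepts.items():
        lines.append(concept["label"])
        for child in concept["narrower"]:
            lines.append("  NT " + concepts[child]["label"])
        for parent in broader.get(uri, []):
            lines.append("  BT " + concepts[parent]["label"])
        if concept["note"]:
            lines.append("  SN " + concept["note"])
        lines.append("")  # blank line between entries
    return "\n".join(lines).rstrip()

display = render(concepts)
print(display)
```

Because the listing is keyed on the label text alone, the two 'orange' entries can only be told apart by their BT and SN lines.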
> 
> Note that this situation *will not* necessarily cause a system error, if 
> the system uses the URIs of the concepts as the means of reference, and 
> if user interaction is mediated via 'clicking' and not via direct text 
> input. However, this situation *will* cause a system error if the data 
> is imported into a traditional thesaurus system that uses the preferred 
> lexical label as the means of reference and ignores concept URIs.
> 
> Note also that this situation *will* cause a social problem if the user 
> interface through which the user interacts with the thesaurus does not 
> present enough information to the user. For example, if a web-based 
> user interface simply presents the word:
> 
> orange
> 
> as a hyperlink, without presenting any other information, the user has 
> no way of disambiguating the overloaded meaning. However, if the user 
> interface were to present something like:
> 
> colour > orange
> fruit > orange
> 
> as hyperlinks, there *will not* be a social problem because the user 
> will be able to disambiguate.
> 
> Therefore, (vi) might be better expressed as:
> 
>  (vii) 'For any given natural language, if two concepts in the same 
> concept scheme have the same preferred lexical label, this will cause a 
> serious problem for some software systems, for example a traditional 
> thesaurus management system that is not aware of concept URIs. This will 
> also lead to ambiguous usage if users are not presented with sufficient 
> information to disambiguate between concepts with the same preferred 
> lexical label.'
> 
> I consider (vii) to be an *optional constraint* on the semantics of 
> skos:prefLabel and skos:inScheme. I consider it optional because, under 
> certain uses of SKOS Core, a violation of this constraint *will not* 
> cause any problems.
> 
> It is not possible to express this constraint in OWL. A pragmatic way to 
> expose a violation of this constraint would be to search for a match to 
> the following SPARQL pattern:
> 
> {
>   ?x skos:prefLabel ?l; skos:inScheme ?s.
>   ?y skos:prefLabel ?m; skos:inScheme ?s.
>   FILTER ( ?x != ?y && str(?l) = str(?m) && lang(?l) = lang(?m) )
> }
> 
> This pattern is used in test C.2. in [1]. Note that this test *is not* 
> included in the 'Basic Integrity Test Case' but *is* included in the 
> 'Thesaurus Compatibility Test Case' [1].
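
For illustration, the same uniqueness check can be run over the 'orange' example from this section in plain Python. This is a sketch only: the row layout and function name are assumptions and are not taken from [1].

```python
from collections import defaultdict

# The example data reduced to (concept, prefLabel text, language, scheme).
rows = [
    ("ex:A", "orange", "en", "ex:myThesaurus"),
    ("ex:B", "orange", "en", "ex:myThesaurus"),
    ("ex:C", "colour", "en", "ex:myThesaurus"),
    ("ex:D", "fruit", "en", "ex:myThesaurus"),
]

def shared_pref_labels(rows):
    """Return {(scheme, text, lang): sorted concepts} for every preferred
    label used by more than one concept within the same concept scheme."""
    by_label = defaultdict(set)
    for concept, text, lang, scheme in rows:
        by_label[(scheme, text, lang)].add(concept)
    return {key: sorted(found) for key, found in by_label.items() if len(found) > 1}

shared = shared_pref_labels(rows)
print(shared)  # {('ex:myThesaurus', 'orange', 'en'): ['ex:A', 'ex:B']}
```

Since the constraint is optional, a tool would presumably report a non-empty result only when thesaurus compatibility is required.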
> 
> 
> 6. Uniqueness of skos:prefSymbol
> 
> All of the discussion given in point (5) above applies to the usage of 
> skos:prefSymbol. I.e. under certain circumstances, if two concepts in 
> the same concept scheme have the same preferred symbolic label, there 
> will be a problem, but under other circumstances there won't be a problem.
> 
> Currently [2] gives:
> 
>  (viii) 'It is recommended that no two concepts in the same concept 
> scheme be given the same preferred symbolic label.'
> 
> This might better be expressed as:
> 
>  (ix) 'For any given symbolic language, if two concepts in the same 
> concept scheme have the same preferred symbolic label, this will cause a 
> serious problem for some software systems. This will also lead to 
> ambiguous usage if users are not presented with sufficient information 
> to disambiguate between concepts with the same preferred symbolic label.'
> 
> I consider (ix) to be an *optional constraint* on the semantics of 
> skos:prefSymbol and skos:inScheme. I consider it optional because, under 
> certain uses of SKOS Core, a violation of this constraint *will not* 
> cause any problems.
> 
> A pragmatic way to expose a violation of this constraint would be to 
> search for a match to the following SPARQL pattern:
> 
> {
>   ?x skos:prefSymbol ?l; skos:inScheme ?s.
>   ?y skos:prefSymbol ?l; skos:inScheme ?s.
>   FILTER ( ?x != ?y )
> }
> 
> This pattern is used in test C.4. in [1]. Note that this pattern does 
> not account for the possibility of more than one symbolic language.
> 
> 
> 7. Interaction of skos:prefLabel and skos:altLabel
> 
> A traditional thesaurus does not allow a term to be both preferred and 
> non-preferred. This implies that, in a SKOS representation of a 
> thesaurus, it is a serious problem if the same literal is given as the 
> preferred label of one concept and as an alternative label of another 
> concept in the same concept scheme. This is not currently expressed in 
> [2]. This could be expressed as:
> 
>  (x) 'For any given natural language, if the preferred lexical label of 
> some concept is the same as an alternative lexical label of another 
> concept in the same concept scheme, this will cause a serious problem 
> for some software systems, for example a traditional thesaurus 
> management system that is not aware of concept URIs. This will also lead 
> to ambiguous usage if users are not presented with sufficient 
> information to disambiguate between multiple uses of the same lexical 
> label.'
> 
> I consider (x) to be an *optional constraint*, for the reasons given in 
> point (5) above.
> 
> A pragmatic way to expose a violation of this constraint would be to 
> search for a match to the following SPARQL pattern:
> 
> {
>   ?x skos:prefLabel ?l; skos:inScheme ?s.
>   ?y skos:altLabel ?m; skos:inScheme ?s.
>   FILTER ( ?x != ?y && str(?l) = str(?m) && lang(?l) = lang(?m) )
> }
> 
> This pattern is used in test C.1. in [1].
> 
> 
> 8. Interaction of skos:prefSymbol and skos:altSymbol
> 
> By analogy with lexical labels, if the same symbolic label is used as 
> the preferred symbolic label of one concept and an alternative symbolic 
> label of another concept in the same concept scheme, this will cause 
> problems in certain circumstances. This is not currently expressed in 
> [2]. This could be expressed as:
> 
>  (xi) 'For any given symbolic language, if the preferred symbolic label 
> of some concept is the same as an alternative symbolic label of another 
> concept in the same concept scheme, this will cause a serious problem 
> for some software systems. This will also lead to ambiguous usage if 
> users are not presented with sufficient information to disambiguate 
> between multiple uses of the same symbolic label.'
> 
> A pragmatic way to expose a violation of this constraint would be to 
> search for a match to the following SPARQL pattern:
> 
> {
>   ?x skos:prefSymbol ?l; skos:inScheme ?s.
>   ?y skos:altSymbol ?l; skos:inScheme ?s.
>   FILTER ( ?x != ?y )
> }
> 
> This pattern is used in test C.3. in [1].
> 
> ---
> 
> The End :)
> 
> Al.
> 
> [1] http://isegserv.itd.rl.ac.uk/cvs-public/~checkout~/skos/drafts/integrity.html?rev=1.7
> 
> [2] http://www.w3.org/TR/2005/WD-swbp-skos-core-guide-20051102/
> 

-- 
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email: a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440

Received on Thursday, 2 March 2006 15:19:30 UTC