- From: Alistair Miles <a.j.miles@rl.ac.uk>
- Date: Thu, 02 Mar 2006 14:28:50 +0000
- To: public-esw-thes@w3.org, SW Best Practices <public-swbp-wg@w3.org>
Hi all,

The question of the semantics of the properties skos:prefLabel, skos:altLabel, skos:prefSymbol and skos:altSymbol has been raised in a number of contexts recently. There are some open issues here, and I would like to offer some discussion as a basis for raising relevant issues on the SKOS Core proposals and issues list.

This discussion references the SKOS Core Integrity Testing and Quality Assurance draft [1], in addition to other resources.

1. Cardinality of skos:prefLabel

The skos:prefLabel property is intended to be used to provide a *preferred lexical label* for a resource of any type.

Obviously it doesn't make sense for more than one label to be 'preferred', so there is an implicit constraint on the skos:prefLabel property. This constraint is currently expressed in [2] by:

  (i) 'A concept should have no more than one preferred lexical label per language.'

Because in fact the domain of skos:prefLabel is unconstrained, this should rather be:

  (ii) 'For any given natural language, a resource cannot have more than one preferred lexical label.'

Informally speaking, this is a kind of qualified cardinality constraint. (The qualification is introduced because obviously we want to allow a resource to have a preferred label in each of multiple languages.)

I consider (ii) to be a fundamental part of the semantics of the skos:prefLabel property.

It is not possible to express this formally using RDF or OWL. It is, however, possible to express (ii) in a semi-formal way using SPARQL. The SPARQL pattern that, if matched, represents a violation of this constraint is given by:

  {
    ?x skos:prefLabel ?l;
       skos:prefLabel ?m.
    FILTER ( str(?l) != str(?m) && lang(?l) = lang(?m) )
  }

This pattern is used in test B.1. in [1].

2. Disjointness of skos:prefLabel and skos:altLabel

The skos:altLabel property is intended to be used to provide an 'alternative lexical label' for a resource of any type.
Obviously it doesn't make sense for the same label to be both 'preferred' and 'alternative', so there is an implicit constraint on the combined usage of the skos:prefLabel and skos:altLabel properties. This is *not* currently expressed in [2]. This could be expressed in prose as:

  (iii) 'For any given natural language, a resource cannot have an alternative lexical label that is also the preferred lexical label.'

Informally speaking, this is a kind of qualified disjointness between the skos:prefLabel and skos:altLabel properties.

I consider (iii) to be a fundamental part of the semantics of the skos:prefLabel and skos:altLabel properties.

It is not possible to express this formally using RDF or OWL. It is, however, possible to express (iii) in a semi-formal way using SPARQL. The SPARQL pattern that, if matched, represents a violation of this constraint is given by:

  {
    ?x skos:prefLabel ?l.
    ?x skos:altLabel ?m.
    FILTER ( str(?l) = str(?m) && lang(?l) = lang(?m) )
  }

This pattern is used in test B.3. in [1].

3. Cardinality of skos:prefSymbol

The skos:prefSymbol property is intended to be used to provide a 'preferred symbolic label' for a resource of any type, where a 'symbol' is a 'network retrievable image'.

As with the skos:prefLabel property, it obviously doesn't make sense for a resource to have more than one preferred symbolic label. This constraint is *not* currently expressed in [2].

There are a number of difficulties that arise when trying to express this constraint either in prose or formally.

The first difficulty involves symbolic languages. It is possible to imagine a situation where a resource has been labelled with symbols from more than one symbolic language. In this case, the cardinality constraint must be qualified by the language of the symbol, in a way analogous to the cardinality constraint on the skos:prefLabel property.
This suggests that an appropriate way to express this constraint in prose might be:

  (iv) 'For any given symbolic language, a resource should not have more than one preferred symbolic label.'

I consider (iv) to be a fundamental part of the semantics of the skos:prefSymbol property.

SKOS does not, however, endorse any way of expressing the symbolic language to which a particular symbol belongs, and hence there is no clear way to express this either using OWL or using SPARQL.

A very pragmatic way to expose a possible violation of this constraint would be to search for a match to the following SPARQL pattern:

  {
    ?x skos:prefSymbol ?n;
       skos:prefSymbol ?o.
    FILTER ( ?n != ?o )
  }

This pattern is used in test B.2. in [1].

Note however that this does not account for the possibility of symbols from different symbolic languages.

Note also that the fact that the URIs of the symbols are different does not necessarily mean that they denote different objects, because RDF does not assume unique names. Therefore, even if we ignore languages, a match of this pattern does not necessarily indicate a violation of the cardinality constraint. Hence the output of test B.2. is only a 'Warning' and not an 'Error'.

4. Disjointness of skos:prefSymbol and skos:altSymbol

The skos:altSymbol property is intended to be used to provide an 'alternative symbolic label' for a resource of any type.

Obviously it doesn't make sense for the same symbolic label to be both 'preferred' and 'alternative', so there is an implicit constraint on the combined usage of the skos:prefSymbol and skos:altSymbol properties. This constraint is analogous to the implicit constraint on the combined usage of skos:prefLabel and skos:altLabel. This is *not* currently expressed in [2]. This could be expressed in prose as:

  (v) 'For any given symbolic language, a resource cannot have an alternative symbolic label that is also the preferred symbolic label.'

I consider (v) to be a fundamental part of the semantics of the skos:prefSymbol and skos:altSymbol properties.

However, as mentioned above, SKOS does not endorse any way of expressing the symbolic language to which a particular symbol belongs, and hence there is no clear way to express this either using OWL or using SPARQL.

A very pragmatic way to expose a possible violation of this constraint would be to search for a match to the following SPARQL pattern:

  {
    ?x skos:prefSymbol ?n.
    ?x skos:altSymbol ?n.
  }

This pattern is used in test B.4. in [1].

Note however that this does not account for the possibility of symbols from different symbolic languages.

5. Uniqueness of skos:prefLabel

In a traditional thesaurus, each 'preferred term' is used to denote a distinct concept. This implies that, in a SKOS representation of a thesaurus, it is a serious problem if two or more concepts in the same concept scheme share the same preferred lexical label in any given natural language. This is currently expressed in [2] as:

  (vi) 'It is recommended that no two concepts in the same concept scheme be given the same preferred lexical label in any given language.'

To exemplify this problem, consider the following SKOS data:

  ex:myThesaurus a skos:ConceptScheme.

  ex:A a skos:Concept;
    skos:prefLabel 'orange'@en;
    skos:scopeNote 'The colour orange.'@en;
    skos:inScheme ex:myThesaurus.

  ex:B a skos:Concept;
    skos:prefLabel 'orange'@en;
    skos:scopeNote 'A citrus fruit.'@en;
    skos:inScheme ex:myThesaurus.

  ex:C a skos:Concept;
    skos:prefLabel 'colour'@en;
    skos:narrower ex:A;
    skos:inScheme ex:myThesaurus.

  ex:D a skos:Concept;
    skos:prefLabel 'fruit'@en;
    skos:narrower ex:B;
    skos:inScheme ex:myThesaurus.

Now I use this SKOS data to generate a traditional thesaurus-like representation of my thesaurus:

  fruit
    NT orange

  colour
    NT orange

  orange
    BT colour
    SN The colour orange.

  orange
    BT fruit
    SN A citrus fruit.

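As a rough illustration only (this is not part of SKOS Core or of the tests in [1]), the following Python sketch shows how a label-only rendering like the one above might be generated, and how the clashing headings could be spotted. The dictionary layout and the function names render_entry and ambiguous_headings are assumptions made for this sketch; all four concepts are taken to be in ex:myThesaurus with English labels.

```python
from collections import Counter

# Hand-transcribed copy of the example data above. The dict layout and
# field names are assumptions for this sketch, not a SKOS API.
CONCEPTS = {
    "ex:A": {"label": "orange", "note": "The colour orange.", "narrower": []},
    "ex:B": {"label": "orange", "note": "A citrus fruit.", "narrower": []},
    "ex:C": {"label": "colour", "note": None, "narrower": ["ex:A"]},
    "ex:D": {"label": "fruit", "note": None, "narrower": ["ex:B"]},
}


def render_entry(cid, concepts):
    """Return display lines for one concept, naming other concepts by label only."""
    data = concepts[cid]
    lines = [data["label"]]
    # BT is derived by inverting the skos:narrower links.
    for other in concepts.values():
        if cid in other["narrower"]:
            lines.append(f"  BT {other['label']}")
    for narrower_id in data["narrower"]:
        lines.append(f"  NT {concepts[narrower_id]['label']}")
    if data["note"]:
        lines.append(f"  SN {data['note']}")
    return lines


def ambiguous_headings(concepts):
    """Return preferred labels that head more than one entry."""
    counts = Counter(data["label"] for data in concepts.values())
    return sorted(label for label, n in counts.items() if n > 1)


for cid in CONCEPTS:
    print("\n".join(render_entry(cid, CONCEPTS)))
print("Ambiguous headings:", ambiguous_headings(CONCEPTS))
```

Because the rendering drops the concept URIs, the two 'orange' entries can only be told apart by their BT and SN lines, and a reference such as 'NT orange' cannot be resolved at all.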
Note that this situation *will not* necessarily cause a system error, if the system uses the URIs of the concepts as the means of reference, and if user interaction is mediated via 'clicking' and not via direct text input.

However, this situation *will* cause a system error if the data is imported into a traditional thesaurus system that uses the preferred lexical label as the means of reference and ignores concept URIs.

Note also that this situation *will* cause a social problem if the user interface through which the user interacts with the thesaurus does not present enough information to the user. E.g. if a web based user interface simply presents the word:

  orange

as a hyperlink, without presenting any other information, the user has no way of disambiguating the overloaded meaning. However, if the user interface were to present something like:

  colour > orange
  fruit > orange

as hyperlinks, there *will not* be a social problem because the user will be able to disambiguate.

Therefore, (vi) might be better expressed as:

  (vii) 'For any given natural language, if two concepts in the same concept scheme have the same preferred lexical label, this will cause a serious problem for some software systems, for example a traditional thesaurus management system that is not aware of concept URIs. This will also lead to ambiguous usage if users are not presented with sufficient information to disambiguate between concepts with the same preferred lexical label.'

I consider (vii) to be an *optional constraint* on the semantics of skos:prefLabel and skos:inScheme. I consider it optional because, under certain uses of SKOS Core, a violation of this constraint *will not* cause any problems.

It is not possible to express this constraint in OWL. A pragmatic way to expose a violation of this constraint would be to search for a match to the following SPARQL pattern:

  {
    ?x skos:prefLabel ?l;
       skos:inScheme ?s.
    ?y skos:prefLabel ?m;
       skos:inScheme ?s.
    FILTER ( ?x != ?y && str(?l) = str(?m) && lang(?l) = lang(?m) )
  }

This pattern is used in test C.2. in [1].

Note that this test *is not* included in the 'Basic Integrity Test Case' but *is* included in the 'Thesaurus Compatibility Test Case' [1].

6. Uniqueness of skos:prefSymbol

All of the discussion given in point (5) above applies to the usage of skos:prefSymbol. I.e. under certain circumstances, if two concepts in the same concept scheme have the same preferred symbolic label, there will be a problem, but under other circumstances there won't be a problem.

Currently [2] gives:

  (viii) 'It is recommended that no two concepts in the same concept scheme be given the same preferred symbolic label.'

This might better be expressed as:

  (ix) 'For any given symbolic language, if two concepts in the same concept scheme have the same preferred symbolic label, this will cause a serious problem for some software systems. This will also lead to ambiguous usage if users are not presented with sufficient information to disambiguate between concepts with the same preferred symbolic label.'

I consider (ix) to be an *optional constraint* on the semantics of skos:prefSymbol and skos:inScheme. I consider it optional because, under certain uses of SKOS Core, a violation of this constraint *will not* cause any problems.

A pragmatic way to expose a violation of this constraint would be to search for a match to the following SPARQL pattern:

  {
    ?x skos:prefSymbol ?l;
       skos:inScheme ?s.
    ?y skos:prefSymbol ?l;
       skos:inScheme ?s.
    FILTER ( ?x != ?y )
  }

This pattern is used in test C.4. in [1].

Note that this pattern does not account for the possibility of more than one symbolic language.

7. Interaction of skos:prefLabel and skos:altLabel

A traditional thesaurus does not allow a term to be both preferred and non-preferred.
This implies that, in a SKOS representation of a thesaurus, it is a serious problem if the same literal is given as the preferred label of one concept and as an alternative label of another concept in the same concept scheme.

This is not currently expressed in [2]. This could be expressed as:

  (x) 'For any given natural language, if the preferred lexical label of some concept is the same as an alternative lexical label of another concept in the same concept scheme, this will cause a serious problem for some software systems, for example a traditional thesaurus management system that is not aware of concept URIs. This will also lead to ambiguous usage if users are not presented with sufficient information to disambiguate between multiple uses of the same lexical label.'

I consider (x) to be an *optional constraint*, for the reasons given in point (5) above.

A pragmatic way to expose a violation of this constraint would be to search for a match to the following SPARQL pattern:

  {
    ?x skos:prefLabel ?l;
       skos:inScheme ?s.
    ?y skos:altLabel ?m;
       skos:inScheme ?s.
    FILTER ( ?x != ?y && str(?l) = str(?m) && lang(?l) = lang(?m) )
  }

This pattern is used in test C.1. in [1].

8. Interaction of skos:prefSymbol and skos:altSymbol

By analogy with lexical labels, if the same symbolic label is used as the preferred symbolic label of one concept and an alternative symbolic label of another concept in the same concept scheme, this will cause problems in certain circumstances.

This is not currently expressed in [2]. This could be expressed as:

  (xi) 'For any given symbolic language, if the preferred symbolic label of some concept is the same as an alternative symbolic label of another concept in the same concept scheme, this will cause a serious problem for some software systems. This will also lead to ambiguous usage if users are not presented with sufficient information to disambiguate between multiple uses of the same symbolic label.'

A pragmatic way to expose a violation of this constraint would be to search for a match to the following SPARQL pattern:

  {
    ?x skos:prefSymbol ?l;
       skos:inScheme ?s.
    ?y skos:altSymbol ?l;
       skos:inScheme ?s.
    FILTER ( ?x != ?y )
  }

This pattern is used in test C.3. in [1].

---

The End :)

Al.

[1] http://isegserv.itd.rl.ac.uk/cvs-public/~checkout~/skos/drafts/integrity.html?rev=1.7
[2] http://www.w3.org/TR/2005/WD-swbp-skos-core-guide-20051102/

--
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email: a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440

Received on Thursday, 2 March 2006 14:29:04 UTC