- From: Alistair Miles <a.j.miles@rl.ac.uk>
- Date: Thu, 02 Mar 2006 15:19:08 +0000
- To: public-esw-thes@w3.org, SW Best Practices <public-swbp-wg@w3.org>
Hi all,

I've raised an item on the proposals list as a placeholder for this issue:

http://www.w3.org/2004/02/skos/core/proposals#labelSemantics-10

Cheers,

Al.

Alistair Miles wrote:
>
> Hi all,
>
> The question of the semantics of the properties skos:prefLabel,
> skos:altLabel, skos:prefSymbol and skos:altSymbol has been raised in a
> number of contexts recently. There are some open issues here, and I
> would like to offer some discussion as a basis for raising relevant
> issues on the SKOS Core proposals and issues list. This discussion
> references the SKOS Core Integrity Testing and Quality Assurance draft
> [1], in addition to other resources.
>
>
> 1. Cardinality of skos:prefLabel
>
> The skos:prefLabel property is intended to be used to provide a
> *preferred lexical label* for a resource of any type. Obviously it
> doesn't make sense for more than one label to be 'preferred', so there
> is an implicit constraint on the skos:prefLabel property. This
> constraint is currently expressed in [2] by:
>
> (i) 'A concept should have no more than one preferred lexical label per
> language.'
>
> Because in fact the domain of skos:prefLabel is unconstrained, this
> should rather be:
>
> (ii) 'For any given natural language, a resource cannot have more than
> one preferred lexical label.'
>
> Informally speaking, this is a kind of qualified cardinality constraint.
> (The qualification is introduced because obviously we want to allow a
> resource to have a preferred label in each of multiple languages.)
>
> I consider (ii) to be a fundamental part of the semantics of the
> skos:prefLabel property. It is not possible to express this formally
> using RDF or OWL. It is, however, possible to express (ii) in a
> semi-formal way using SPARQL. The SPARQL pattern that, if matched,
> represents a violation of this constraint is given by:
>
> {
>   ?x skos:prefLabel ?l; skos:prefLabel ?m.
>   FILTER ( str(?l) != str(?m) && lang(?l) = lang(?m) )
> }
>
> This pattern is used in test B.1. in [1].
>
>
> 2. Disjointness of skos:prefLabel and skos:altLabel
>
> The skos:altLabel property is intended to be used to provide an
> 'alternative lexical label' for a resource of any type. Obviously it
> doesn't make sense for the same label to be both 'preferred' and
> 'alternative', so there is an implicit constraint on the combined usage
> of the skos:prefLabel and skos:altLabel properties. This is *not*
> currently expressed in [2]. This could be expressed in prose as:
>
> (iii) 'For any given natural language, a resource cannot have an
> alternative lexical label that is also the preferred lexical label.'
>
> Informally speaking, this is a kind of qualified disjointness between
> the skos:prefLabel and skos:altLabel properties. I consider (iii) to be
> a fundamental part of the semantics of the skos:prefLabel and
> skos:altLabel properties. It is not possible to express this formally
> using RDF or OWL. It is, however, possible to express (iii) in a
> semi-formal way using SPARQL. The SPARQL pattern that, if matched,
> represents a violation of this constraint is given by:
>
> {
>   ?x skos:prefLabel ?l.
>   ?x skos:altLabel ?m.
>   FILTER ( str(?l) = str(?m) && lang(?l) = lang(?m) )
> }
>
> This pattern is used in test B.3. in [1].
>
>
> 3. Cardinality of skos:prefSymbol
>
> The skos:prefSymbol property is intended to be used to provide a
> 'preferred symbolic label' for a resource of any type, where a 'symbol'
> is a 'network retrievable image'. As with the skos:prefLabel property,
> it obviously doesn't make sense for a resource to have more than one
> preferred symbolic label. This constraint is *not* currently expressed
> in [2].
>
> There are a number of difficulties that arise when trying to express
> this constraint either in prose or formally.
>
> The first difficulty involves symbolic languages.
> It is possible to imagine a situation where a resource has been
> labelled with symbols from more than one symbolic language. In this
> case, the cardinality constraint must be qualified by the language of
> the symbol, in an analogous way to the cardinality constraint on the
> skos:prefLabel property. This suggests that an appropriate way to
> express this constraint in prose might be:
>
> (iv) 'For any given symbolic language, a resource should not have more
> than one preferred symbolic label.'
>
> I consider (iv) to be a fundamental part of the semantics of the
> skos:prefSymbol property. SKOS does not, however, endorse any way of
> expressing the symbolic language to which a particular symbol belongs,
> and hence there is no clear way to express this either using OWL or
> using SPARQL.
>
> A very pragmatic way to expose a possible violation of this constraint
> would be to search for a match to the following SPARQL pattern:
>
> {
>   ?x skos:prefSymbol ?n; skos:prefSymbol ?o.
>   FILTER ( ?n != ?o )
> }
>
> This pattern is used in test B.2. in [1]. Note however that this does
> not account for the possibility of symbols from different symbolic
> languages. Note also that two symbols having different URIs does not
> necessarily mean that they denote different objects, because RDF does
> not assume unique names. Therefore, even if we ignore languages, a
> match of this pattern does not necessarily indicate a violation of the
> cardinality constraint. Hence the output of test B.2. is only a
> 'Warning' and not an 'Error'.
>
>
> 4. Disjointness of skos:prefSymbol and skos:altSymbol
>
> The skos:altSymbol property is intended to be used to provide an
> 'alternative symbolic label' for a resource of any type. Obviously it
> doesn't make sense for the same symbolic label to be both 'preferred'
> and 'alternative', so there is an implicit constraint on the combined
> usage of the skos:prefSymbol and skos:altSymbol properties.
> This constraint is analogous to the implicit constraint on the combined
> usage of skos:prefLabel and skos:altLabel. This is *not* currently
> expressed in [2]. This could be expressed in prose as:
>
> (v) 'For any given symbolic language, a resource cannot have an
> alternative symbolic label that is also the preferred symbolic label.'
>
> I consider (v) to be a fundamental part of the semantics of the
> skos:prefSymbol and skos:altSymbol properties. However, as mentioned
> above, SKOS does not endorse any way of expressing the symbolic language
> to which a particular symbol belongs, and hence there is no clear way to
> express this either using OWL or using SPARQL.
>
> A very pragmatic way to expose a possible violation of this constraint
> would be to search for a match to the following SPARQL pattern:
>
> {
>   ?x skos:prefSymbol ?n.
>   ?x skos:altSymbol ?n.
> }
>
> This pattern is used in test B.4. in [1]. Note however that this does
> not account for the possibility of symbols from different symbolic
> languages.
>
>
> 5. Uniqueness of skos:prefLabel
>
> In a traditional thesaurus, each 'preferred term' is used to denote a
> distinct concept. This implies that, in a SKOS representation of a
> thesaurus, it is a serious problem if two or more concepts in the same
> concept scheme share the same preferred lexical label in any given
> natural language. This is currently expressed in [2] as:
>
> (vi) 'It is recommended that no two concepts in the same concept scheme
> be given the same preferred lexical label in any given language.'
>
> To exemplify this problem, consider the following SKOS data:
>
> ex:myThesaurus a skos:ConceptScheme.
>
> ex:A a skos:Concept;
>   skos:prefLabel 'orange'@en;
>   skos:scopeNote 'The colour orange.'@en;
>   skos:inScheme ex:myThesaurus.
>
> ex:B a skos:Concept;
>   skos:prefLabel 'orange'@en;
>   skos:scopeNote 'A citrus fruit.'@en;
>   skos:inScheme ex:myThesaurus.
>
> ex:C a skos:Concept;
>   skos:prefLabel 'colour'@en;
>   skos:narrower ex:A;
>   skos:inScheme ex:myThesaurus.
>
> ex:D a skos:Concept;
>   skos:prefLabel 'fruit'@en;
>   skos:narrower ex:B;
>   skos:inScheme ex:myThesaurus.
>
> Now I use this SKOS data to generate a traditional thesaurus-like
> representation of my thesaurus:
>
> fruit
>   NT orange
>
> colour
>   NT orange
>
> orange
>   BT colour
>   SN The colour orange.
>
> orange
>   BT fruit
>   SN A citrus fruit.
>
> Note that this situation *will not* necessarily cause a system error, if
> the system uses the URIs of the concepts as the means of reference, and
> if user interaction is mediated via 'clicking' and not via direct text
> input. However, this situation *will* cause a system error if the data
> is imported into a traditional thesaurus system that uses the preferred
> lexical label as the means of reference and ignores concept URIs.
>
> Note also that this situation *will* cause a social problem if the user
> interface through which the user interacts with the thesaurus does not
> present enough information to the user. E.g. if a web based user
> interface simply presents the word:
>
> orange
>
> as a hyperlink, without presenting any other information, the user has
> no way of disambiguating the overloaded meaning. However, if the user
> interface were to present something like:
>
> colour > orange
> fruit > orange
>
> as hyperlinks, there *will not* be a social problem because the user
> will be able to disambiguate.
>
> Therefore, (vi) might be better expressed as:
>
> (vii) 'For any given natural language, if two concepts in the same
> concept scheme have the same preferred lexical label, this will cause a
> serious problem for some software systems, for example a traditional
> thesaurus management system that is not aware of concept URIs. This will
> also lead to ambiguous usage if users are not presented with sufficient
> information to disambiguate between concepts with the same preferred
> lexical label.'
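
The clash that (vii) describes can also be detected outside of SPARQL by
grouping preferred labels per concept scheme and language. The following is
only a minimal Python sketch, not part of the original message or of the
integrity draft: the tuple layout, the PREF_LABELS list and the function name
shared_pref_labels are all assumptions made for illustration, and the data
simply restates the ex:A to ex:D example from point (5).

```python
from collections import defaultdict

# Hypothetical in-memory form of the example data from point (5):
# (concept, concept scheme, preferred label, language tag).
PREF_LABELS = [
    ("ex:A", "ex:myThesaurus", "orange", "en"),
    ("ex:B", "ex:myThesaurus", "orange", "en"),
    ("ex:C", "ex:myThesaurus", "colour", "en"),
    ("ex:D", "ex:myThesaurus", "fruit", "en"),
]


def shared_pref_labels(rows):
    """Map (scheme, label, lang) to the concepts sharing it, if more than one."""
    by_key = defaultdict(set)
    for concept, scheme, label, lang in rows:
        by_key[(scheme, label, lang)].add(concept)
    # Keep only the labels that two or more distinct concepts use as preferred.
    return {key: sorted(concepts)
            for key, concepts in by_key.items()
            if len(concepts) > 1}


clashes = shared_pref_labels(PREF_LABELS)
for (scheme, label, lang), concepts in clashes.items():
    print(f"'{label}'@{lang} in {scheme} is preferred for: {', '.join(concepts)}")
```

Because the language tag is part of the grouping key, the same string used as
a preferred label in two different languages is not reported, which matches
the 'for any given natural language' qualification in (vii).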
>
> I consider (vii) to be an *optional constraint* on the semantics of
> skos:prefLabel and skos:inScheme. I consider it optional because, under
> certain uses of SKOS Core, a violation of this constraint *will not*
> cause any problems.
>
> It is not possible to express this constraint in OWL. A pragmatic way to
> expose a violation of this constraint would be to search for a match to
> the following SPARQL pattern:
>
> {
>   ?x skos:prefLabel ?l; skos:inScheme ?s.
>   ?y skos:prefLabel ?m; skos:inScheme ?s.
>   FILTER ( ?x != ?y && str(?l) = str(?m) && lang(?l) = lang(?m) )
> }
>
> This pattern is used in test C.2. in [1]. Note that this test *is not*
> included in the 'Basic Integrity Test Case' but *is* included in the
> 'Thesaurus Compatibility Test Case' [1].
>
>
> 6. Uniqueness of skos:prefSymbol
>
> All of the discussion given in point (5) above applies to the usage of
> skos:prefSymbol. I.e. under certain circumstances, if two concepts in
> the same concept scheme have the same preferred symbolic label, there
> will be a problem, but under other circumstances there won't be a
> problem.
>
> Currently [2] gives:
>
> (viii) 'It is recommended that no two concepts in the same concept
> scheme be given the same preferred symbolic label.'
>
> This might better be expressed as:
>
> (ix) 'For any given symbolic language, if two concepts in the same
> concept scheme have the same preferred symbolic label, this will cause a
> serious problem for some software systems. This will also lead to
> ambiguous usage if users are not presented with sufficient information
> to disambiguate between concepts with the same preferred symbolic label.'
>
> I consider (ix) to be an *optional constraint* on the semantics of
> skos:prefSymbol and skos:inScheme. I consider it optional because, under
> certain uses of SKOS Core, a violation of this constraint *will not*
> cause any problems.
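
For illustration only, a check for (ix) could be sketched in Python over plain
(subject, property, object) triples. Everything here is an assumption rather
than something taken from the message or from [1]: the TRIPLES data, the
example.org image URIs and the function name shared_pref_symbols are
hypothetical. As point (3) notes, SKOS does not endorse a way of stating the
symbolic language of a symbol, so the sketch cannot apply that qualification.

```python
from itertools import combinations

# Hypothetical data: the symbol URIs are invented for this sketch.
TRIPLES = [
    ("ex:A", "skos:prefSymbol", "http://example.org/img/orange.png"),
    ("ex:B", "skos:prefSymbol", "http://example.org/img/orange.png"),
    ("ex:C", "skos:prefSymbol", "http://example.org/img/palette.png"),
    ("ex:A", "skos:inScheme", "ex:myThesaurus"),
    ("ex:B", "skos:inScheme", "ex:myThesaurus"),
    ("ex:C", "skos:inScheme", "ex:myThesaurus"),
]


def shared_pref_symbols(triples):
    """List (x, y, symbol, scheme) where distinct x and y share symbol and scheme."""
    schemes = {}  # concept -> set of concept schemes it is in
    symbols = {}  # concept -> set of its preferred symbols
    for subject, prop, obj in triples:
        if prop == "skos:inScheme":
            schemes.setdefault(subject, set()).add(obj)
        elif prop == "skos:prefSymbol":
            symbols.setdefault(subject, set()).add(obj)

    hits = []
    # Each unordered pair of distinct concepts is examined once.
    for x, y in combinations(sorted(symbols), 2):
        common_schemes = schemes.get(x, set()) & schemes.get(y, set())
        for symbol in sorted(symbols[x] & symbols[y]):
            for scheme in sorted(common_schemes):
                hits.append((x, y, symbol, scheme))
    return hits


warnings = shared_pref_symbols(TRIPLES)
for x, y, symbol, scheme in warnings:
    print(f"{x} and {y} share preferred symbol {symbol} in {scheme}")
```

Each pair is reported once rather than in both orders, which is a design
choice of this sketch and not a requirement of the constraint.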
>
> A pragmatic way to expose a violation of this constraint would be to
> search for a match to the following SPARQL pattern:
>
> {
>   ?x skos:prefSymbol ?l; skos:inScheme ?s.
>   ?y skos:prefSymbol ?l; skos:inScheme ?s.
>   FILTER ( ?x != ?y )
> }
>
> This pattern is used in test C.4. in [1]. Note that this pattern does
> not account for the possibility of more than one symbolic language.
>
>
> 7. Interaction of skos:prefLabel and skos:altLabel
>
> A traditional thesaurus does not allow a term to be both preferred and
> non-preferred. This implies that, in a SKOS representation of a
> thesaurus, it is a serious problem if the same literal is given as the
> preferred label of one concept and as an alternative label of another
> concept in the same concept scheme. This is not currently expressed in
> [2]. This could be expressed as:
>
> (x) 'For any given natural language, if the preferred lexical label of
> some concept is the same as an alternative lexical label of another
> concept in the same concept scheme, this will cause a serious problem
> for some software systems, for example a traditional thesaurus
> management system that is not aware of concept URIs. This will also lead
> to ambiguous usage if users are not presented with sufficient
> information to disambiguate between multiple uses of the same lexical
> label.'
>
> I consider (x) to be an *optional constraint*, for the reasons given in
> point (5) above.
>
> A pragmatic way to expose a violation of this constraint would be to
> search for a match to the following SPARQL pattern:
>
> {
>   ?x skos:prefLabel ?l; skos:inScheme ?s.
>   ?y skos:altLabel ?m; skos:inScheme ?s.
>   FILTER ( ?x != ?y && str(?l) = str(?m) && lang(?l) = lang(?m) )
> }
>
> This pattern is used in test C.1. in [1].
>
>
> 8.
> Interaction of skos:prefSymbol and skos:altSymbol
>
> By analogy with lexical labels, if the same symbolic label is used as
> the preferred symbolic label of one concept and an alternative symbolic
> label of another concept in the same concept scheme, this will cause
> problems in certain circumstances. This is not currently expressed in
> [2]. This could be expressed as:
>
> (xi) 'For any given symbolic language, if the preferred symbolic label
> of some concept is the same as an alternative symbolic label of another
> concept in the same concept scheme, this will cause a serious problem
> for some software systems. This will also lead to ambiguous usage if
> users are not presented with sufficient information to disambiguate
> between multiple uses of the same symbolic label.'
>
> A pragmatic way to expose a violation of this constraint would be to
> search for a match to the following SPARQL pattern:
>
> {
>   ?x skos:prefSymbol ?l; skos:inScheme ?s.
>   ?y skos:altSymbol ?l; skos:inScheme ?s.
>   FILTER ( ?x != ?y )
> }
>
> This pattern is used in test C.3. in [1].
>
> ---
>
> The End :)
>
> Al.
>
> [1]
> http://isegserv.itd.rl.ac.uk/cvs-public/~checkout~/skos/drafts/integrity.html?rev=1.7
>
> [2] http://www.w3.org/TR/2005/WD-swbp-skos-core-guide-20051102/
>

--
Alistair Miles
Research Associate
CCLRC - Rutherford Appleton Laboratory
Building R1 Room 1.60
Fermi Avenue
Chilton
Didcot
Oxfordshire OX11 0QX
United Kingdom
Email: a.j.miles@rl.ac.uk
Tel: +44 (0)1235 445440
Received on Thursday, 2 March 2006 15:19:30 UTC