- From: Leonard Will <L.Will@willpowerinfo.co.uk>
- Date: Mon, 1 Mar 2004 17:29:03 +0000
- To: public-esw-thes@w3.org
In message <AAEKLFPLCPPCFCOACDKIGEKMCJAA.aida@acorweb.net> on Mon, 1 Mar 2004, Aida Slavic <aida@acorweb.net> wrote
>
>Leonard,
>
>My concern with any definition we would accept is related to the functionality
>this may imply

Aida

Yes, I agree. We should probably practice what we preach, and define these concepts in terms of their functionality. If in doing so we find that we are talking about more than one distinct concept, then we need to find distinct names for them.

>I have in mind the following:
>
>1) facets in expressing semantic (logical hierarchy and poly-hierarchy)
>here is where the issue of facet/array and inheritance comes
>
>- a little digression may be relevant here:
>Svenonius suggested that for m2m handling of vocabulary there should be
>provision for indicating the difference between hierarchy types: logical
>hierarchy (one concept-one hierarchy) and perspective hierarchy (one
>concept more than one hierarchy) first hierarchy type is good for broadening
>and narrowing search in IR when vocabulary covers single subject area (e.g.
>thesauri) second hierarchy type is paramount for disambiguation (when
>vocabulary covers universal knowledge area e.g. classifications)

I'm not sure whether any new distinction is being made here or whether she is just making the distinction between mono- and poly-hierarchy. I.e. whether or not a term can have more than one broader term.

>the question is: if we accept to declare that something is a facet in
>SKOS/OWL does this mean that only logical hierarchies are allowed... and
>that the same concept will not occur in other hierarchies within the same
>KOS irrespective whether the concepts are naturally context free and
>irrespective the coverage of KOS (thinking here of polysems (culture,
>organization, democracy) and other vague concepts such as water, marble,
>cell etc. and the way they may be treated in special and general KOS)

It seems to me that the only case where mono-hierarchies are required is something like a formal biological taxonomy, where membership of a parent concept is an essential part of the definition of a concept. "Whales" are "mammals" _by definition_ and they cannot therefore have any other parent concept such as "fishes" or "insects".

In any other kind of hierarchy a term can in principle have more than one broader term, so that "whales" can be a narrower term of "mammals" as well as being a narrower term of "aquatic creatures", where it may have "fish" and "plankton" as sibling terms. This polyhierarchical structure allows broadening of searches for "all mammals" or "all aquatic creatures", so I'm puzzled by the suggestion you quote above that a monohierarchy is desirable for this purpose.

The only restriction is that the parent concepts must belong to the same fundamental category (which I call a facet). "Whales" is in the facet of "organisms" or "living things" and cannot have a parent concept in the facet of "disciplines", or "actions", or "places", for example.

Some thesauri are restricted to being mono-hierarchical because of limitations in the software used to construct them, but that is not something that we should accept as a general principle.

I don't think that the issue of polysemes is relevant here, because we are talking about the relationship between _concepts_ rather than words. If a word can represent more than one concept within a controlled vocabulary then it is not a good descriptor and needs to be qualified to show which concept it represents. If it represents only one concept within the vocabulary, though it can represent other concepts elsewhere, then its scope note needs to show clearly that its meaning is restricted.
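
The polyhierarchy rule described above (several broader terms allowed, but all within one facet) can be sketched roughly as follows. This is only an illustration, not part of any SKOS or thesaurus software: the `Concept` class, the `add_broader`, `ancestors` and `broaden_search` names, and the "zoology" discipline are hypothetical, chosen to mirror the whales example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    label: str
    facet: str  # fundamental category, e.g. "organisms"
    broader: list = field(default_factory=list)

    def add_broader(self, parent: "Concept") -> None:
        # Assumed rule: a broader term must lie in the same facet.
        if parent.facet != self.facet:
            raise ValueError(
                f"{self.label!r} ({self.facet}) cannot have a parent "
                f"in facet {parent.facet!r}"
            )
        self.broader.append(parent)


def ancestors(concept: Concept) -> set:
    """Labels of every broader concept reachable through any hierarchy."""
    found = set()
    stack = list(concept.broader)
    while stack:
        current = stack.pop()
        if current.label not in found:
            found.add(current.label)
            stack.extend(current.broader)
    return found


def broaden_search(target: Concept, vocabulary: list) -> list:
    """All concepts that have `target` somewhere above them."""
    return sorted(c.label for c in vocabulary if target.label in ancestors(c))


mammals = Concept("mammals", "organisms")
aquatic = Concept("aquatic creatures", "organisms")
whales = Concept("whales", "organisms")
fish = Concept("fish", "organisms")
plankton = Concept("plankton", "organisms")
zoology = Concept("zoology", "disciplines")  # hypothetical discipline term

whales.add_broader(mammals)   # first broader term
whales.add_broader(aquatic)   # second broader term, same facet: allowed
fish.add_broader(aquatic)
plankton.add_broader(aquatic)

try:
    whales.add_broader(zoology)  # parent in another facet
    cross_facet_allowed = True
except ValueError:
    cross_facet_allowed = False

vocabulary = [mammals, aquatic, whales, fish, plankton]
all_mammals = broaden_search(mammals, vocabulary)
all_aquatic = broaden_search(aquatic, vocabulary)
```

Under these assumptions "whales" is retrieved by a broadened search on either parent, while the attempt to give it a parent in the "disciplines" facet is rejected.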
>2) facets in expressing syntax/structure
>
>There is no agreement on the semantic of fundamental facets so pinning
>down the semantic can hardly be the ONLY reason for stating the facet.
>Thesauri usually declare facets for vocabulary building/control/management
>while classification systems, apart from this, exploit facets also for
>precision in indexing (i.e. building complex expressions). Hence, the first
>ones have only facets and the second one have both facets and roles
>attached to them

Yes, I think that this is the core of the problem. The rules for combining descriptors to create a compound string to represent a combination of concepts are often called rules for the "citation order of facets", but as I said in my last message this meaning of "facet" is different from the "fundamental category" meaning. We are talking about two different concepts, and I think we should give them different names. When we build a string using a rule such as

>> >Thing/kind/part/property/material/process/operation/patient/product/by-product/agent/space/time

I would say that "we are combining concepts (or the terms which represent concepts) according to their roles", with no mention of facets. In the strings

    boys kissing girls
and
    girls kissing boys

"boys" and "girls" both belong to the same facet of "people". The citation order is determined by roles and not by facets.

>Outside traditional KOS and in the spectrum of different so called 'faceted'
>vocabularies created to support browsing on portals the reasons for
>encoding facets is the same. These vocabularies do not attach any
>'fundamental' meaning to the facets and yet they exploit them to achieve
>certain functionality in managing terminology and creating
>browsing/searching interface

Yes, this is another meaning again. When an interface allows you to search for wine first by origin, then by colour, then by sweetness, it is allowing you to apply successive characteristics of division in order to reduce the number of entries in the arrays at the lowest level. This is a fundamental feature of "faceted classification", but neither the "characteristics of division" (origin, colour, sweetness) nor the resulting arrays are "facets" in the sense of "fundamental categories", and I think it misleading to call them that.

>In order to have roles one has to have data structure to which to
>attach these roles (and later on the rules for processing the roles).

We don't need a structure other than well-defined concepts to attach roles to. The fact that boys and girls are in the same facet in the example above doesn't help in determining their roles.

>But the very fact that classification facets have their roles

I don't believe that they do, unless you are defining "classification facets" to _mean_ roles. Doing that seems to introduce unnecessary confusion.

> is *exactly* the reason why I would want to encode them for machine
>processing: I need to handle and automate syntax. For processing
>pre-coordinate vocabularies it is very important to know that one concept belong
>to a certain facet as this context will determine its place in a string, its role

Its role, not the facet to which it belongs, will determine its place in a string.

> and its meaning in this particular facet as opposed to its meaning when it
>occurs in some other facet...

Meaning of concepts should be defined by scope notes. A single concept should not occur in more than one facet, though it may occur in more than one hierarchy within a single facet (see above).
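
The point that roles, not facets, fix the position of a term in a pre-coordinated string can be illustrated with a small sketch. Everything named here is an assumption made for illustration: the three-role order agent/action/patient is chosen only to reproduce the two example strings and is not the citation order quoted earlier, and `FACET`, `ROLE_ORDER` and `build_string` are hypothetical names.

```python
# Hypothetical facet assignment: each concept sits in exactly one facet.
FACET = {"boys": "people", "girls": "people", "kissing": "actions"}

# Hypothetical role order used only to reproduce the example strings.
ROLE_ORDER = ["agent", "action", "patient"]


def build_string(roles: dict) -> str:
    """Order terms by the role each plays in this particular statement."""
    ordered = sorted(roles.items(), key=lambda item: ROLE_ORDER.index(item[1]))
    return " ".join(term for term, _ in ordered)


def facets_used(roles: dict) -> list:
    """Facets of the terms in a statement, ignoring their roles."""
    return sorted(FACET[term] for term in roles)


first = {"boys": "agent", "kissing": "action", "girls": "patient"}
second = {"girls": "agent", "kissing": "action", "boys": "patient"}

string_one = build_string(first)
string_two = build_string(second)

# Both statements draw on identical facets, so facet membership alone
# could not have produced two different strings.
same_facets = facets_used(first) == facets_used(second)
```

With these assumptions the two role assignments yield different strings although the facets involved are the same in both cases.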
>My understanding is that thesauri may as well 'pretend' that facets are
>fundamental categories of mutually exclusive terms and fix each term to
>occur only in one facet Thesauri have less need for disambiguation -
>because they are zooming down on the narrow subject area where one
>concept has only one broader concept and often only one role. Such is the
>case of materials in AAT... where stone or glass or leather is not discussed
>outside their role in the Art and Architecture.
>
>a) AAT, for instance, does not have to accommodate 'marble deposits in
>geology' where the same concept may not be treated as 'material' .

I don't see the problem here. The concept of "marble" refers to "a granular crystalline limestone", and this is always true (_pace_ any pedantic geologists). It may be put into an array under the node label <rock by composition> and into another array under the broader term "materials for sculpture" (if that is its only use within the scope of the thesaurus). It may also be combined into an indexing string with the discipline of "geology" and the form "deposits". None of these affects the nature of the concept or its membership of a "materials" facet.

>b) thesauri do not need to use facets to exercise the roles as they are used
>for single term indexing (post-coordinate indexing) They don't combine
>terms together in a complex expression. [having said that: if one chose to
>produce composite terms with thesaurus, one would need to attach role to
>the facets]
>
>Any analytico-synthetic classifications and other pre-coordinated indexing
>languages have to exploit facet analysis for more than one purpose. This
>does not mean that facet in classification (Processes or Materials, Place) in
>the context of a given discipline are not classes in which essential properties
>are exhibited by all its members.
>
>(We can, for the purpose of this discussion, think of classification such as
>Bliss 2 to be a collection of thesauri for instance)
>

Yes, I am becoming more and more convinced that thesauri and classification schemes are just alternative ways of arranging and presenting lists and groups of concepts. I therefore am very keen to help arrive at a single set of unambiguous terms which we can use to discuss these things, rather than having to qualify statements by saying that we are talking "in a thesaurus context" or "in a classification context".

This is an interesting discussion - I wonder whether other people have views on whether what we are saying makes sense. Are we making any progress towards a consensus of opinion?

Leonard
--
Willpower Information       (Partners: Dr Leonard D Will, Sheena E Will)
Information Management Consultants              Tel: +44 (0)20 8372 0092
27 Calshot Way, Enfield, Middlesex EN2 7BQ, UK. Fax: +44 (0)870 051 7276
L.Will@Willpowerinfo.co.uk               Sheena.Will@Willpowerinfo.co.uk
---------------- <URL:http://www.willpowerinfo.co.uk/> -----------------
Received on Monday, 1 March 2004 12:29:59 UTC