- From: Bernard Vatant <bernard.vatant@mondeca.com>
- Date: Mon, 11 Oct 2004 11:04:02 +0200
- To: "Stella Dextre Clarke" <sdclarke@lukehouse.demon.co.uk>, <public-esw-thes@w3.org>
Stella

> The difference between a deprecated concept and a deprecated term may
> not be as clear as you might wish.

Sure, no more than the difference between a term and a concept, deprecated or not :)

> (And even the word "deprecated" is a
> bit strange to me in the context of thesauri. We usually just say
> non-preferred.)

Indeed? At Mondeca we are currently working for a major actor in legal publishing (Wolters Kluwer Belgium), which makes very intensive use of Thesauri, including e.g. for the automatic generation of publication indexes. One of the strong requirements of the folks in charge of Thesaurus management was indeed a proper handling of what they call "deprecated terms". A deprecated term is a term that used to be preferred, was used as such, and at some point in the history of the vocabulary was replaced by another preferred term. After "deprecation" (so to speak), the once-preferred and now deprecated term is kept as a synonym of the preferred term which replaces it. Whatever relationships (BT-NT, USE, ...) the deprecated term had are relocated to the replacing term, and the indexing of documents is redirected. So of course a deprecated term is non-preferred, but it used to be preferred, and the system keeps track of that if necessary.

> It is unusual to drop a concept altogether.

Of course the concept does not really change when the term is deprecated: since the term is replaced, it is simply the preferred term for the concept that changes. Concepts never die :)

> Normally one provides a lead-in entry pointing to the broader concept that covers the
> scope of the preferred term that is now to be "deprecated".
> It is conceivable that if it was decided that a large subject area with
> perhaps hundreds of concepts was now out-of-scope, then all the
> corresponding terms might be dropped without trace (although this is
> not usually recommended). The thesaurus might well be renamed or
> rebranded to mark the transition.

This is another story ...
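
To make the deprecated-term handling described above a bit more concrete, here is a minimal sketch in Python. It is only an illustration under my own assumptions, not the way any actual system is implemented: the `Thesaurus` class, its attribute names and the example terms ("Motor cars", "Automobiles", ...) are all hypothetical.

```python
class Thesaurus:
    """Toy term-based thesaurus (hypothetical structure, for illustration only)."""

    def __init__(self):
        self.broader = {}     # preferred term -> set of broader preferred terms (BT)
        self.use = {}         # non-preferred term -> preferred term (USE)
        self.deprecated = {}  # deprecated term -> term that replaced it
        self.index = {}       # document id -> set of indexing terms

    def deprecate(self, old, new):
        """Replace preferred term `old` by `new`, keeping `old` as a synonym."""
        # Relocate the BT links held by the old term to the new one.
        self.broader[new] = self.broader.get(new, set()) | self.broader.pop(old, set())
        # Relocate the NT side: terms that had `old` as broader term.
        for parents in self.broader.values():
            if old in parents:
                parents.discard(old)
                parents.add(new)
        # Existing USE entries that pointed at `old` now point at `new`.
        for term, target in list(self.use.items()):
            if target == old:
                self.use[term] = new
        self.use.pop(new, None)     # `new` may have been non-preferred before
        self.use[old] = new         # the old term survives as a synonym
        self.deprecated[old] = new  # and the system remembers it was preferred
        # Redirect the indexing of documents.
        for terms in self.index.values():
            if old in terms:
                terms.discard(old)
                terms.add(new)


t = Thesaurus()
t.broader = {"Motor cars": {"Vehicles"}, "Sports cars": {"Motor cars"}}
t.use = {"Autos": "Motor cars"}
t.index = {"doc1": {"Motor cars"}, "doc2": {"Sports cars"}}
t.deprecate("Motor cars", "Automobiles")
```

After the call, "Motor cars" is a plain USE entry pointing to "Automobiles", the `deprecated` map still records that it was once preferred, and nothing has been deleted.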

> Much more likely would be to decide that that subject area should be
> indexed at a much shallower level of specificity.

I think Thesaurus structure can (and should) be kept independent from the indexing practices and applications that use the Thesaurus. See at the end the general remark about declarative vs procedural properties. Several different indexing applications can use the same Thesaurus at different levels of granularity, use or ignore specific branches, etc. This is the notion of index profile (also a requirement of the customer mentioned above). The index profile can be managed independently of the structure of the Thesaurus itself. You can say, e.g., in the profile that you only use the first three levels of the Thesaurus hierarchy, so whatever is indexed at a finer level of granularity will be re-indexed by the relevant parent.

> So, for example, in a
> thesaurus for agricultural products, it might be decided that tropical
> products should no longer be covered in detail. Where previously you had
> Bananas, Pineapples, Brazil nuts etc as preferred terms (with a
> hierarchy of BTs such as Tropical fruits all the way up to Tropical
> products), you might leave just one term "Tropical products" to cover
> all of these. In the thesaurus you would organise entries such as
> "Bananas USE Tropical products" - perhaps hundreds of such entries. Now
> where is the "deprecated concept"? All we have is one very broad concept
> taking in tropical products at all levels of detail, and lots of
> non-preferred terms.

This is quite different from deprecation: it's changing the granularity of the Thesaurus. And in such a case, you could just change the indexing profile, saying now that "Tropical Products" is a "leaf term" for the indexing profile (meaning that everything below should be indexed on that term).

> So the idea of a "deprecated concept" just feels a bit alien.

Yes, there again, concepts never die.
This is an important rule I've learned in topic map management: never delete a topic. Change its status, attributes, names, relationships, date of validity, but never delete it. Once you have spoken about something at some point, this thing exists forever, at least as a subject of conversation :))

> I don't warm, either, to the idea of a concept getting "replaced" by
> another one, unless they are so close that you would treat the two as
> quasi-synonymous. You are hardly going to replace Bananas with Washing
> machines?

There again, only terms are replaced, not concepts.

Bottom line: we need to distinguish here the *declarative* properties of concepts, valid whatever the context of application, from the *procedural* properties, applicable only in specific contexts of use. For example, it seems to me that the BT-NT relationship between "Tropical Fruits" and "Bananas" should be declarative, and kept whatever the context, whereas the USE-UF relationship stated in order to use the Thesaurus at a broader level of granularity is procedural: you know pretty well that "Tropical Fruits" and "Bananas" are distinct concepts, but in a certain context of application this distinction is useless for whatever reason. This is different from, say, an ancient astronomical thesaurus where "Evening Star" and "Morning Star" were thought of as distinct concepts, and you decide/discover at some point that in fact they both have to be replaced by "Planet Venus". In this latter case, there is actually a (declarative) change in the conceptual scheme. It might be that the USE-UF relationship in Thesauri is sometimes used in a declarative sense and sometimes in a procedural sense, leading to some ambiguities. The notion of "index profile" mentioned above makes it possible to capture the procedural properties for various contexts, while not changing the declarative properties in the (common) Thesaurus.
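
As a rough illustration of how an index profile could stay separate from the Thesaurus itself, here is a small Python sketch. Everything in it is an assumption made for the example: the `BROADER` table is a simplified single-parent hierarchy, and `index_term`, `max_depth` and `leaf_terms` are hypothetical names, not part of any existing tool.

```python
# Declarative part: the BT hierarchy (term -> broader term, None at the top).
# Simplified, hypothetical hierarchy with a single parent per term.
BROADER = {
    "Tropical products": None,
    "Tropical fruits": "Tropical products",
    "Bananas": "Tropical fruits",
    "Pineapples": "Tropical fruits",
}


def path_from_root(term, broader):
    """Return the chain of terms from the top of the hierarchy down to `term`."""
    path = [term]
    while broader[path[-1]] is not None:
        path.append(broader[path[-1]])
    return path[::-1]


def index_term(term, broader, max_depth=None, leaf_terms=frozenset()):
    """Procedural part: map `term` to the term a given profile indexes on.

    `max_depth` keeps only the first levels of the hierarchy (1 = top level);
    `leaf_terms` lists terms below which everything is indexed on that term.
    The hierarchy itself is never modified.
    """
    for depth, candidate in enumerate(path_from_root(term, broader), start=1):
        if candidate in leaf_terms or depth == max_depth:
            return candidate
    return term


fine = index_term("Bananas", BROADER)
shallow = index_term("Bananas", BROADER, max_depth=2)
coarse = index_term("Bananas", BROADER, leaf_terms={"Tropical products"})
```

With these assumptions, the same Thesaurus yields "Bananas", "Tropical fruits" or "Tropical products" depending on the profile, and the BT-NT relationships remain untouched in every case.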

Regards

Bernard

Bernard Vatant
Senior Consultant
Knowledge Engineering
Mondeca - www.mondeca.com
bernard.vatant@mondeca.com

Received on Monday, 11 October 2004 09:05:37 UTC