
RE: candidate and deprecated concepts

From: Bernard Vatant <bernard.vatant@mondeca.com>
Date: Mon, 11 Oct 2004 11:04:02 +0200
To: "Stella Dextre Clarke" <sdclarke@lukehouse.demon.co.uk>, <public-esw-thes@w3.org>
Message-ID: <GOEIKOOAMJONEFCANOKCCECKFAAA.bernard.vatant@mondeca.com>


> The difference between a deprecated concept and a deprecated term may
> not be as clear as you might wish.

Sure, no more than the difference between a term and a concept, deprecated or not :)

> (And even the word "deprecated" is a
> bit strange to me in the context of thesauri. We usually just say
> non-preferred.)

Indeed? At Mondeca we are currently working for a major player in legal publishing (Wolters
Kluwer Belgium), which makes very intensive use of Thesauri, including e.g. for the
automatic generation of publication indexes. One of the strong requirements of the people
in charge of Thesaurus management was indeed proper handling of what they call
"deprecated terms". A deprecated term is a term that used to be preferred, and was used as
such, and at some point in the history of the vocabulary was replaced by another
preferred term. After "deprecation" (so to speak) the once-preferred, now deprecated
term is kept as a synonym of the preferred term that replaces it. Whatever relationships
(BT-NT, USE, ...) the deprecated term had are relocated to the replacing term, and
the indexing of documents is redirected.
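
For illustration only, the deprecation step described above could be sketched in Python
as follows. This is a toy model under stated assumptions, not the system actually used
for the customer: the class `Thesaurus`, its attributes, the method `deprecate` and the
example terms ("Aeroplanes", "Aircraft", ...) are all hypothetical.

```python
class Thesaurus:
    """Toy term-based thesaurus; every name here is hypothetical."""

    def __init__(self):
        self.broader = {}     # term -> set of broader terms (BT)
        self.use = {}         # non-preferred term -> preferred term (USE)
        self.deprecated = {}  # once-preferred term -> term that replaced it
        self.index = {}       # document id -> set of indexing terms

    def deprecate(self, old, new):
        """Replace preferred term `old` by `new`, keeping `old` as a synonym."""
        # Relocate the BT links held by the old term to the new one.
        moved = self.broader.pop(old, set())
        self.broader.setdefault(new, set()).update(moved)
        # Relocate the NT side: terms whose broader term was `old`.
        for parents in self.broader.values():
            if old in parents:
                parents.remove(old)
                parents.add(new)
        # Existing USE entries that pointed at `old` now point at `new`.
        for term, target in list(self.use.items()):
            if target == old:
                self.use[term] = new
        # Keep the old term as a synonym and remember it was once preferred.
        self.use[old] = new
        self.deprecated[old] = new
        # Redirect document indexing.
        for terms in self.index.values():
            if old in terms:
                terms.remove(old)
                terms.add(new)


t = Thesaurus()
t.broader = {"Aeroplanes": {"Vehicles"}, "Gliders": {"Aeroplanes"}}
t.use = {"Airplanes": "Aeroplanes"}
t.index = {"doc1": {"Aeroplanes", "Safety"}}
t.deprecate("Aeroplanes", "Aircraft")
```

The `deprecated` mapping is what distinguishes a deprecated term from an ordinary
non-preferred one: both appear in `use`, but only the former is recorded as having once
been preferred.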

So of course a deprecated term is non-preferred, but it used to be preferred, and the
system keeps track of that if necessary.

> It is unusual to drop a concept altogether.

Of course the concept does not really change when the term is deprecated: since the term
is replaced, it's simply the preferred term for the concept that changes. Concepts never die :)

> Normally one provides a lead-in entry pointing to the broader concept that covers the
> scope of the preferred term that is now to be "deprecated".
> It is conceivable that if it was decided that a large subject area with
> perhaps hundreds of concepts was now out-of-scope, then all the
> corresponding terms might be dropped without trace ( although this is
> not usually recommended). The thesaurus might well be renamed or
> rebranded to mark the transition.

This is another story ...

> Much more likely would be to decide that that subject area should be
> indexed at a much shallower level of specificity.

I think Thesaurus structure can (and should) be kept independent of the indexing
practices/applications that use the Thesaurus. See the general remark at the end about
declarative vs procedural properties. Several different indexing applications can use the
same Thesaurus at different levels of granularity, use or not use specific branches, etc.
This is the notion of index profile (also a requirement of the customer mentioned above).
The index profile can be managed independently of the structure of the Thesaurus itself.
You can say e.g. in the profile that you only use the first three levels of the Thesaurus
hierarchy, so whatever is indexed at a finer level of granularity will be re-indexed by
the relevant parent.
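
A minimal sketch of such a depth-limited profile, in Python, might look like this. It
assumes a single-parent, acyclic hierarchy; the function names `path_from_top` and
`index_term`, the `max_depth` parameter and the top term "Agricultural products" are
assumptions made for the example.

```python
# BT links of the shared thesaurus: term -> its broader term.
# "Agricultural products" as the top term is an assumption for this example.
BROADER = {
    "Bananas": "Tropical fruits",
    "Tropical fruits": "Tropical products",
    "Tropical products": "Agricultural products",
}


def path_from_top(term, broader):
    """Return the chain of terms from the top of the hierarchy down to `term`."""
    path = [term]
    while path[-1] in broader:
        path.append(broader[path[-1]])
    path.reverse()
    return path


def index_term(term, broader, max_depth):
    """Map `term` to itself, or to its ancestor at level `max_depth` if it lies deeper."""
    path = path_from_top(term, broader)
    return path[min(len(path), max_depth) - 1]


three_levels = index_term("Bananas", BROADER, max_depth=3)
two_levels = index_term("Bananas", BROADER, max_depth=2)
shallow_term = index_term("Tropical products", BROADER, max_depth=3)
```

The thesaurus data in `BROADER` is never modified; only the `max_depth` value of the
profile decides which term a document ends up indexed on.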

> So, for example, in a
> thesaurus for agricultural products, it might be decided that tropical
> products should no longer be covered in detail. Where previously you had
> Bananas, Pineapples, Brazil nuts etc as preferred terms ( with a
> hierarchy of BTs such as Tropical fruits all the way up to Tropical
> products), you might leave just one term "Tropical products" to cover
> all of these. In the thesaurus you would organise entries such as
> "Bananas USE Tropical products" - perhaps hundreds of such entries. Now
> where is the "deprecated concept"? All we have is one very broad concept
> taking in tropical products at all levels of detail, and lots of
> non-preferred terms.

This is quite different from deprecation; it's changing the granularity of the Thesaurus.
And in such a case, you could just change the indexing profile, declaring that "Tropical
Products" is now a "leaf term" for the indexing profile (meaning that everything below it
should be indexed on that term).

> So the idea of a "deprecated concept" just feels a bit alien.

Yes, there again, concepts never die. This is an important rule I've learned in topic
map management: never delete a topic. Change its status, attributes, names,
relationships, date of validity, but never delete it. Once you have spoken about something
at some point, this thing exists forever, at least as a subject of conversation :))

> I don't warm, either, to the idea of a concept getting "replaced" by
> another one, unless they are so close that you would treat the two as
> quasi-synonymous. You are hardly going to replace Bananas with Washing
> machines?

There again, only terms are replaced, not concepts.

Bottom line: we need to distinguish here between the *declarative* properties of concepts,
valid whatever the context of application, and the *procedural* properties, applicable only
in specific contexts of use. For example, it seems to me that the BT-NT relationship between
"Tropical Fruits" and "Bananas" should be declarative, and kept whatever the
context, whereas the USE-UF relationship stated in order to use the Thesaurus at a broader
level of granularity is procedural: you know pretty well that "Tropical Fruits" and
"Bananas" are distinct concepts, but in a certain context of application this distinction
is useless for whatever reason. This is different from the case where, say, you had an
ancient astronomical thesaurus where "Evening Star" and "Morning Star" were thought of as
distinct concepts, and you decide/discover at some point that in fact they both have to be replaced by "Planet
Venus". In this latter case, there is actually a (declarative) change in the conceptual
It might be that the USE-UF relationship in Thesaurus is sometimes used in a declarative
sense, and sometimes in a procedural sense, leading to some ambiguities.
The notion of "index profile" mentioned above makes it possible to capture the procedural
properties for various contexts, while not changing the declarative properties in the
(common) Thesaurus.
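
To make the separation concrete, here is a rough Python sketch in which the BT links are
a read-only shared structure and each profile is just a set of "leaf terms". The names
`BT_LINKS`, `index_with_profile` and the three profiles are hypothetical, and a
single-parent hierarchy is assumed.

```python
from types import MappingProxyType

# Declarative part: BT links shared by every application, exposed read-only.
BT_LINKS = MappingProxyType({
    "Bananas": "Tropical fruits",
    "Pineapples": "Tropical fruits",
    "Tropical fruits": "Tropical products",
})


def index_with_profile(term, leaf_terms):
    """Index on the highest ancestor declared a leaf term, else on `term` itself."""
    result = term
    current = term
    while current in BT_LINKS:
        current = BT_LINKS[current]
        if current in leaf_terms:
            result = current
    return result


# Procedural part: one set of leaf terms per context of use.
detailed_profile = frozenset()
fruit_profile = frozenset({"Tropical fruits"})
coarse_profile = frozenset({"Tropical products"})

detailed = index_with_profile("Bananas", detailed_profile)
by_fruit = index_with_profile("Bananas", fruit_profile)
coarse = index_with_profile("Bananas", coarse_profile)
```

Here no "Bananas USE Tropical products" entry is ever written into the thesaurus: the
same term is indexed as "Bananas", "Tropical fruits" or "Tropical products" depending
only on the profile passed in.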



Bernard Vatant
Senior Consultant
Knowledge Engineering
Mondeca - www.mondeca.com
Received on Monday, 11 October 2004 09:05:37 UTC
