
RE: candidate and deprecated concepts

From: Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk>
Date: Mon, 11 Oct 2004 14:16:39 +0100
Message-ID: <350DC7048372D31197F200902773DF4C05E50C98@exchange11.rl.ac.uk>
To: 'Bernard Vatant' <bernard.vatant@mondeca.com>, Stella Dextre Clarke <sdclarke@lukehouse.demon.co.uk>, public-esw-thes@w3.org

Bernard wrote:
> Yes, there again, concepts never die. This is an important rule I've
> found out in topic map management: never delete a topic. Change its
> status, attributes, names, relationships, date of validity, but never
> delete. Once you have spoken about something at some point, this thing
> exists forever, at least as a subject of conversation :))
> 

Yes, I think this is absolutely crucial as a recommendation.

N.B. this is why I tried to write explicitly about concept-oriented
indexing: if the subject-based index is built to take *concept
identifiers*, and not terms, as indexing values, then reshuffling which term
is the preferred label for a concept is a minor issue.  It doesn't affect
document metadata, and it can be adequately represented in RDF as a
structured history note attached to a *concept*  (i.e. I see no need to
talk about 'deprecated terms').
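
A minimal sketch of the point above, in plain Python rather than RDF. The
identifiers (`ex:c001`, `doc-17`), the labels, and the field names are all
hypothetical and chosen only for illustration; they are not SKOS Core
vocabulary. The index stores concept identifiers, so changing the preferred
label touches only the concept record and leaves document metadata alone.

```python
# Hypothetical concept store: identifier -> labels and history notes.
concepts = {
    "ex:c001": {
        "prefLabel": "Aeroplanes",
        "altLabels": ["Airplanes"],
        "historyNotes": [],
    },
}

# Document metadata holds concept identifiers, not terms.
index = {"doc-17": ["ex:c001"]}


def change_pref_label(concepts, concept_id, new_label, note):
    """Swap the preferred label and record a history note on the concept."""
    concept = concepts[concept_id]
    old_label = concept["prefLabel"]
    if new_label == old_label:
        return  # nothing to change
    if new_label in concept["altLabels"]:
        concept["altLabels"].remove(new_label)
    concept["altLabels"].append(old_label)  # old label stays as a synonym
    concept["prefLabel"] = new_label
    concept["historyNotes"].append(note)


def display_labels(index, concepts, doc_id):
    """Resolve a document's indexing values to current preferred labels."""
    return [concepts[cid]["prefLabel"] for cid in index[doc_id]]
```

Calling `change_pref_label(concepts, "ex:c001", "Airplanes", "...")` changes
what `display_labels` shows for `doc-17`, while `index` itself is untouched.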

However, once we do have explicitly concept-oriented subject indexes,
further change issues arise.  

The *requirement for stability* dictates that, once a concept has been
published, it should certainly never be deleted, and it should be altered as
little as possible, ideally not at all.  If this practice is observed, one
can to a large extent guarantee the appropriateness of indexing values over
time.  

However, the *requirement for fitness* dictates that a concept scheme must
be allowed to evolve to reflect changing patterns of usage, perspectives and
systems of thought.

I would like to be able to offer a recommendation as part of the SKOS Core
guide that balances the tension between these two requirements.  So far, the
best I can come up with is something like:

  'whenever there is pressure to significantly alter the meaning of a
concept or set of concepts, preference should be given to defining a new
concept or set of concepts, with new unique identifiers, and expressing
mapping relationships between the old ('deprecated') concepts and the new
concepts.'

... with examples following illustrating how this information can be
represented as RDF.

Here, 'deprecated' means 'do not use this concept'.  An 'isReplacedBy'
statement means 'use this concept instead'.  A deprecated concept is allowed
to persist, to support the backwards compatibility of older subject indexes.
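
As a rough sketch of these semantics (not the RDF representation itself),
the statements can be modelled as subject/predicate/object tuples in Python.
The `ex:` names, including `ex:status`, `ex:deprecated` and
`ex:isReplacedBy`, are hypothetical placeholders, not agreed SKOS Core
properties. The data borrows the Evening Star / Morning Star example from
the quoted message below.

```python
# Hypothetical property and value names, for illustration only.
STATUS = "ex:status"
DEPRECATED = "ex:deprecated"
IS_REPLACED_BY = "ex:isReplacedBy"

# Deprecated concepts are kept, never deleted, so old indexes still resolve.
triples = {
    ("ex:eveningStar", STATUS, DEPRECATED),
    ("ex:eveningStar", IS_REPLACED_BY, "ex:planetVenus"),
    ("ex:morningStar", STATUS, DEPRECATED),
    ("ex:morningStar", IS_REPLACED_BY, "ex:planetVenus"),
}


def is_deprecated(triples, concept):
    """'deprecated' means 'do not use this concept'."""
    return (concept, STATUS, DEPRECATED) in triples


def current_concepts(triples, concept):
    """Follow isReplacedBy links until non-deprecated concepts are reached."""
    result, seen, stack = set(), set(), [concept]
    while stack:
        c = stack.pop()
        if c in seen:
            continue  # guard against replacement cycles
        seen.add(c)
        if not is_deprecated(triples, c):
            result.add(c)
            continue
        stack.extend(o for s, p, o in triples
                     if s == c and p == IS_REPLACED_BY)
    return result
```

An older index entry pointing at `ex:eveningStar` can then be interpreted via
`current_concepts(triples, "ex:eveningStar")` without rewriting the entry.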


So I have the idea that it would be valuable to talk about 'deprecated'
*concepts* and I hope this email has at least gone part of the way to
explaining why.  I also have the idea that it would be valuable to talk
about 'candidate' concepts.

Incidentally, a design question that arises (if there is consensus that
making these kinds of statement is valuable) is how to represent the fact
that a concept might be candidate/deprecated in one scheme, but not in
another.
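
One possible shape for this, offered only as a sketch of the design question
and not as a proposal from this thread: treat status as a statement about a
(concept, scheme) pair rather than about the concept alone. The scheme and
concept names below are hypothetical.

```python
from typing import Dict, Optional, Tuple

# (concept, scheme) -> status. Absence of an entry means no special status.
SchemeStatus = Dict[Tuple[str, str], str]

status_by_scheme: SchemeStatus = {
    ("ex:eveningStar", "ex:schemeA"): "deprecated",
    ("ex:eveningStar", "ex:schemeB"): "candidate",
}


def status_in(statuses: SchemeStatus, concept: str, scheme: str) -> Optional[str]:
    """Return the status of a concept within one scheme, or None."""
    return statuses.get((concept, scheme))


def is_deprecated_in(statuses: SchemeStatus, concept: str, scheme: str) -> bool:
    """True only if the concept is marked deprecated in this scheme."""
    return status_in(statuses, concept, scheme) == "deprecated"
```

The same concept identifier can then carry a different status in each scheme
that includes it.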

Al.



> -----Original Message-----
> From: public-esw-thes-request@w3.org
> [mailto:public-esw-thes-request@w3.org]On Behalf Of Bernard Vatant
> Sent: 11 October 2004 10:06
> To: Stella Dextre Clarke; public-esw-thes@w3.org
> Subject: RE: candidate and deprecated concepts
> 
> 
> 
> 
> Stella
> 
> > The difference between a deprecated concept and a deprecated term may
> > not be as clear as you might wish.
> 
> Sure, no more than the difference between a term or a concept,
> deprecated or not :)
> 
> > (And even the word "deprecated" is a
> > bit strange to me in the context of thesauri. We usually just say
> > non-preferred.)
> 
> Indeed? At Mondeca we currently work for a major actor in legal
> publication (Wolters Kluwer Belgium) that makes very intensive use of
> Thesauri, including e.g. their use for automatic generation of
> publication indexes. One of the strong requirements of the folks in
> charge of Thesaurus management was indeed proper handling of what they
> call "deprecated terms". A deprecated term is a term that used to be
> preferred, and was used as such, and at some point in the history of
> the vocabulary was replaced by another preferred term. After
> "deprecation" (so to speak), the once-preferred and now deprecated term
> is kept as a synonym of the preferred term which replaces it. Whatever
> relationships (BT-NT, USE, ...) the deprecated term had are relocated
> to the replacing term, and the indexing of documents is redirected.
> 
> So of course a deprecated term is non-preferred, but it used to be
> preferred, and the system keeps track of that if necessary.
> 
> > It is unusual to drop a concept altogether.
> 
> Of course the concept does not really change when the term is
> deprecated: since the term is replaced, it is simply the preferred term
> for the concept that changes. Concepts never die :)
> 
> > Normally one provides a lead-in entry pointing to the broader concept
> > that covers the scope of the preferred term that is now to be
> > "deprecated".
> > It is conceivable that if it was decided that a large subject area
> > with perhaps hundreds of concepts was now out-of-scope, then all the
> > corresponding terms might be dropped without trace (although this is
> > not usually recommended). The thesaurus might well be renamed or
> > rebranded to mark the transition.
> 
> This is another story ...
> 
> > Much more likely would be to decide that that subject area should be
> > indexed at a much shallower level of specificity.
> 
> I think Thesaurus structure can (and should) be kept independent of the
> indexing practices/applications that use the Thesaurus. See at the end
> the general remark about declarative vs procedural properties. Several
> different indexing applications can use the same Thesaurus at different
> levels of granularity, use or not use specific branches, etc. This is
> the notion of an index profile (also a requirement of the customer
> quoted above). The index profile can be managed independently of the
> structure of the Thesaurus itself. You can say, e.g., in the profile
> that you only use the first three levels of the Thesaurus hierarchy, so
> whatever is indexed at a finer level of granularity will be re-indexed
> by the relevant parent.
> 
> > So, for example, in a thesaurus for agricultural products, it might
> > be decided that tropical products should no longer be covered in
> > detail. Where previously you had Bananas, Pineapples, Brazil nuts etc
> > as preferred terms (with a hierarchy of BTs such as Tropical fruits
> > all the way up to Tropical products), you might leave just one term
> > "Tropical products" to cover all of these. In the thesaurus you would
> > organise entries such as "Bananas USE Tropical products" - perhaps
> > hundreds of such entries. Now where is the "deprecated concept"? All
> > we have is one very broad concept taking in tropical products at all
> > levels of detail, and lots of non-preferred terms.
> 
> This is quite different from deprecation: it's changing the granularity
> of the Thesaurus. In such a case, you could just change the indexing
> profile, saying now that "Tropical Products" is a "leaf term" for the
> indexing profile (meaning that everything below it should be indexed on
> that term).
> 
> > So the idea of a "deprecated concept" just feels a bit alien.
> 
> > I don't warm, either, to the idea of a concept getting "replaced" by
> > another one, unless they are so close that you would treat the two as
> > quasi-synonymous. You are hardly going to replace Bananas with Washing
> > machines?
> 
> There again, only terms are replaced, not concepts.
> 
> Bottom line: we need to distinguish the *declarative* properties of
> concepts, valid whatever the context of application, from the
> *procedural* properties, applicable only in specific contexts of use.
> For example, it seems to me that the BT-NT relationship between
> "Tropical Fruits" and "Bananas" should be declarative, and kept
> whatever the context, whereas the USE-UF relationship stated in order
> to use the Thesaurus at a broader level of granularity is procedural:
> you know pretty well that "Tropical Fruits" and "Bananas" are distinct
> concepts, but in a certain context of application this distinction is
> useless for whatever reason. This is different from, say, an ancient
> astronomical thesaurus where "Evening Star" and "Morning Star" were
> thought of as distinct concepts, and you decide/discover at some point
> that in fact they both have to be replaced by "Planet Venus". In this
> latter case, there is actually a (declarative) change in the conceptual
> scheme.
> It might be that the USE-UF relationship in a Thesaurus is sometimes
> used in a declarative sense, and sometimes in a procedural sense,
> leading to some ambiguities.
> The notion of "index profile" quoted above makes it possible to capture
> the procedural properties for various contexts, while not changing the
> declarative properties in the (common) Thesaurus.
> 
> Regards
> 
> Bernard
> 
> Bernard Vatant
> Senior Consultant
> Knowledge Engineering
> Mondeca - www.mondeca.com
> bernard.vatant@mondeca.com
> 
> 
> 
> 
Received on Monday, 11 October 2004 13:17:20 UTC
