
Re: notes at concepts vs notes at terms

From: Sue Ellen Wright <sellenwright@gmail.com>
Date: Wed, 19 Oct 2005 13:38:16 -0400
Message-ID: <e35499310510191038x2ed35909l5b2d25fb1f1ad9c9@mail.gmail.com>
To: "Miles, AJ (Alistair)" <A.J.Miles@rl.ac.uk>, Gail Hodge <Gailhodge@aol.com>
Cc: Mark van Assem <mark@cs.vu.nl>, public-esw-thes@w3.org

Hi, All,
I hope I'm catching everybody--I'm sort of carrying on the same conversation
in a couple different threads. The difficulty with defining "term" arises
from the fact that a term in a thesaurus and a term in a terminological
collection are not the same thing. In terminology management, a *term* is "a
verbal designation of a general concept in a specific subject field." In
practice, there can be a number of (sometimes many) terms associated with a
given concept. In terminology management, a preferred term is one of these
designations that has been selected as the most common or correct for use in
a given environment. There may be multiple preferred terms for the same
concept, for instance in medicine, where different terms are preferred for
different registers (scientists, medical health care professionals, educated
middle class clients vs. illiterate dialect speakers, etc.). The important
thing is all the terms are indeed true or nearly true synonyms used in real
discourse, written or spoken.
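
A minimal Python sketch of such a terminological entry, assuming one record per concept that holds all of its terms plus one preferred term per register. The class name ConceptEntry, the register labels, and the sample terms are hypothetical illustrations, not taken from any standard or tool.

```python
from dataclasses import dataclass, field


@dataclass
class ConceptEntry:
    """One concept with every term (designation) used for it."""
    definition: str
    terms: set = field(default_factory=set)
    # register name -> the term preferred in that register (assumed model)
    preferred: dict = field(default_factory=dict)

    def prefer(self, register, term):
        # A preferred term has to be one of the entry's own terms.
        if term not in self.terms:
            raise ValueError(f"{term!r} is not a term of this entry")
        self.preferred[register] = term


# Illustrative data only.
entry = ConceptEntry(
    definition="death of heart muscle caused by loss of blood supply",
    terms={"myocardial infarction", "heart attack"},
)
entry.prefer("clinicians", "myocardial infarction")
entry.prefer("patients", "heart attack")
```

Both preferred terms live in the same entry because both designate the same concept; only the register differs.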
 Remember that a thesaurus (or other controlled vocabulary) is designed to
provide us with the -- let's say preferred string, to avoid using the word
"term" over again -- that we're going to attach to an object or the
representation of an object in a collection or data collection. A
non-preferred term in this sense is any other word or string that people
may associate with this preferred string and that will be mapped to the
preferred string for information retrieval purposes. So, for instance, if I
want to search for *deoxyribonucleic acid*, I am probably going to find it
under the preferred term *DNA*. This particular example works just fine for both
thesaurus and terminology management because the two terms are both
representations of a single concept. But many thesauri are designed to
streamline the search structures, so sometimes they are structured so that
the preferred term actually represents a broader concept, say *use "rock"* for
*granite, feldspar, shale, etc.* This wouldn't be very useful in a
geological database, but in a general language system without too much
differentiated information, it might work very well. So here the preferred
term is *rock*, and the non-preferred terms all represent its children.
*Stone* might also be a non-preferred term in the same system, but in terms
of concept modeling it resides on a different level, together with *rock* as
a synonym. In a terminological entry, *stone* and *rock* might appear together
as equal terms, and we might prefer one over the other, but the specific
materials would each reside in a different entry. They are all terms, but
the relationship between them is very different. This is why a
terminological concept system can look very different from a thesaurus.
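
The contrast can be sketched in a few lines of Python, using only the examples above. This is an illustration under assumptions, not how any particular thesaurus or termbase is implemented: the names USE, ENTRIES, preferred_string and same_concept are hypothetical.

```python
# Thesaurus view: every entry string is mapped to one preferred string.
# The mapping says nothing about WHY a string is non-preferred
# (true synonym, narrower concept, and so on).
USE = {
    "deoxyribonucleic acid": "DNA",
    "stone": "rock",      # synonym of the preferred string
    "granite": "rock",    # narrower concept folded into "rock"
    "feldspar": "rock",
    "shale": "rock",
}


def preferred_string(query):
    """Return the string to index or search under; unknown strings pass through."""
    return USE.get(query.strip().lower(), query.strip())


# Terminology view: one entry per concept; only terms that designate
# the same concept share an entry.
ENTRIES = [
    {"terms": {"DNA", "deoxyribonucleic acid"}},
    {"terms": {"rock", "stone"}},
    {"terms": {"granite"}},
    {"terms": {"feldspar"}},
    {"terms": {"shale"}},
]


def same_concept(a, b):
    """True if both terms appear together in a single concept entry."""
    return any(a in e["terms"] and b in e["terms"] for e in ENTRIES)
```

Here preferred_string("granite") and preferred_string("stone") both give "rock", yet same_concept("stone", "rock") holds while same_concept("granite", "rock") does not. The flat mapping loses exactly the distinction the concept entries keep.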
 All this underscores the problem with citing WordNet as the exemplar here.
This is not to say that WordNet isn't great, good and interesting, but it
represents a marriage of several kinds of ordering, so it's a little
difficult to describe clear differentiations based on WordNet structures.
 Does that help -- or only muddle the issues?
 Bye for now
Sue Ellen
  On 10/19/05, Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk> wrote:
> Hi Mark,
> > From one point of view ("maintenance", "future extensions" or
> > whatever you might call it) the class approach has the advantage that
> > you can always attach properties to terms, e.g. properties that might
> > turn out to be really useful somewhere in the future (i.e. stuff we
> > cannot anticipate now).
> >
> > Another reason is that Terms get a URI so that they can be referred
> > to. In the WordNet TF, this is a motivation to assign URIs to
> > WordSenses, instead of using blank nodes. You can then use WordSenses
> > e.g. to annotate texts. Similar uses might be envisioned for
> > SKOS terms.
> The thing is, I don't think that a class of 'non-preferred terms' in the
> thesaurus sense would correspond to the class of wordnet WordSenses. The
> wordnet metamodel (is [1] the latest version?) has three main classes:
> 'Word' 'WordSense' and 'Synset'. I think the class wn:Word (which is a
> super-class of wn:Collocation) is closest to the notion of a 'non-preferred
> term', but even that I don't think matches, because a non-preferred term is
> always embedded in a thesaurus, and hence represents a relationship between
> several entities, whereas a Word is kind of an entity in its own right ...
> See how fuzzy things get when we try to work out what a 'term' is?
> There are other alternatives to defining a class of non-preferred terms,
> such as e.g.
> eg:foo a skos:Concept;
>     skos:prefLabel 'Foo';
>     skos:altLabel 'Bar';
>     skos:note [
>         rdf:value 'Blah blah.';
>         skos:onLabel 'Foo';
>     ];
>     .
> Cheers for now,
> Al.
> [1] http://www.cs.vu.nl/~mark/wn/17-10-05/wn.rdfs

Sue Ellen Wright
Institute for Applied Linguistics
Kent State University
Kent OH 44242 USA
Received on Wednesday, 19 October 2005 17:38:31 GMT
