RE: notes at concepts vs notes at terms

From: Stella Dextre Clarke <sdclarke@lukehouse.demon.co.uk>
Date: Wed, 19 Oct 2005 19:14:50 +0100
To: "'Sue Ellen Wright'" <sellenwright@gmail.com>, "'Miles, AJ (Alistair)'" <A.J.Miles@rl.ac.uk>, "'Gail Hodge'" <Gailhodge@aol.com>
Cc: "'Mark van Assem'" <mark@cs.vu.nl>, <public-esw-thes@w3.org>
Message-ID: <005201c5d4d8$ff066200$0300a8c0@DELL>
Absolutely! Sue Ellen has hit the nail on the head again. WordNet is
very nice, a lexical database that is useful for loads of linguistic and
literary purposes. But it is not a controlled vocabulary, and not
designed primarily for information retrieval. It is (correctly in my
view) built to a different model. Here I am back to my old hobby-horse:
it is dangerous to expect the same model to work for several different
applications. (That said, I do accept that it may be possible and useful
to have one "core" model with different "add-ons" for different
applications.)
 
When I get time, I'll try to reply to Alistair's request for examples of
note types that may apply to terms rather than to concepts.
 
Cheers
Stella
 
 

*****************************************************
Stella Dextre Clarke
Information Consultant
Luke House, West Hendred, Wantage, Oxon, OX12 8RR, UK
Tel: 01235-833-298
Fax: 01235-863-298
SDClarke@LukeHouse.demon.co.uk
*****************************************************



-----Original Message-----
From: public-esw-thes-request@w3.org
[mailto:public-esw-thes-request@w3.org] On Behalf Of Sue Ellen Wright
Sent: 19 October 2005 18:38
To: Miles, AJ (Alistair); Gail Hodge
Cc: Mark van Assem; public-esw-thes@w3.org
Subject: Re: notes at concepts vs notes at terms


Hi, All,
I hope I'm catching everybody--I'm sort of carrying on the same
conversation in a couple different threads. The difficulty with defining
"term" arises from the fact that a term in a thesaurus and a term in a
terminological collection are not the same thing. In terminology
management, a term is "a verbal designation of a general concept in a
specific subject field." In practice, there can be a number of
(sometimes many) terms associated with a given concept. In terminology
management, a preferred term is one of these designations that has been
selected as the most common or correct for use in a given environment.
There may be multiple preferred terms for the same concept, for instance
in medicine, where different terms are preferred for different registers
(scientists, medical health care professionals, educated middle class
clients vs. illiterate dialect speakers, etc.). The important thing is
all the terms are indeed true or nearly true synonyms used in real
discourse, written or spoken. 
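
A minimal sketch of the terminological entry described above, assuming a simple Python data model: one concept, several designations, and one preferred term per register. The names `TermEntry` and `preferred_by_register`, the register names, and the sample medical terms are illustrative assumptions, not taken from the thread or from any terminology standard.

```python
from dataclasses import dataclass, field


@dataclass
class TermEntry:
    """One terminological entry: a single concept and its designations."""
    definition: str
    terms: set[str] = field(default_factory=set)
    # Hypothetical: register name -> the term preferred in that register.
    preferred_by_register: dict[str, str] = field(default_factory=dict)

    def prefer(self, register: str, term: str) -> None:
        # A preferred term is still just one of the concept's terms.
        self.terms.add(term)
        self.preferred_by_register[register] = term


entry = TermEntry(definition="Interruption of blood supply to heart muscle")
entry.prefer("clinical", "myocardial infarction")
entry.prefer("lay", "heart attack")
print(sorted(entry.terms))
print(entry.preferred_by_register["lay"])
```

Every term in the entry designates the same concept, so more than one of them can be preferred at once, each in its own register.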
 
Remember that a thesaurus (or other controlled vocabulary) is designed
to provide us with the -- let's say preferred string, to avoid using the
word "term" over again -- that we're going to attach to an object or the
representation of an object in a collection or data collection. A
non-preferred term in this sense is any other word or string that people
may associate with this preferred string and that will be mapped to the
preferred string for information retrieval purposes. So, for instance,
if I want to search for deoxyribonucleic acid I am probably going to
find it under the preferred term DNA. This particular example works just
fine for both thesaurus and terminology management because the two terms
are both representations of a single concept. But many thesauri are
designed to streamline the search structures, so sometimes they are
structured so that the preferred term actually represents a broader
concept, say use "rock" for granite, feldspar, shale, etc. This
wouldn't be very useful in a geological database, but in a general
language system without too much differentiated information, it might
work very well. So here the preferred term is rock, and the
non-preferred terms all represent its children. Stone might also be a
non-preferred term in the same system, but in terms of concept modeling
it resides on a different level, together with rock as a synonym. In a
terminological entry, stone and rock might appear together as equal
terms, and we might prefer one over the other, but the specific
materials would each reside in a different entry. They are all terms,
but the relationship between them is very different. This is why a
terminological concept system can look very different from a thesaurus. 
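
The thesaurus-style mapping described above can be sketched as a plain lookup table, assuming Python and a hypothetical `USE` table and `preferred` helper (neither comes from any real thesaurus software). It covers both cases from the paragraph: a true synonym (DNA) and narrower concepts posted up to a broader preferred string ("rock").

```python
# Hypothetical thesaurus "USE" table: non-preferred string -> preferred string.
USE = {
    "deoxyribonucleic acid": "DNA",  # true synonym of the preferred term
    "stone": "rock",                 # synonym at the same concept level
    "granite": "rock",               # narrower concept posted up to "rock"
    "feldspar": "rock",
    "shale": "rock",
}


def preferred(query: str) -> str:
    """Return the preferred string to search under for a query string."""
    key = query.strip()
    # Match case-insensitively; strings not in the table are returned as-is.
    for entry, target in USE.items():
        if entry.lower() == key.lower():
            return target
    return key


print(preferred("Granite"))
print(preferred("deoxyribonucleic acid"))
```

The table itself does not record whether an entry is a synonym or a child of the preferred string. A terminological concept system keeps exactly that distinction, which is why granite and stone would end up in different places there.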
 
All this underscores the problem with citing WordNet as the exemplar
here. This is not to say that WordNet isn't great, good and interesting,
but it represents a marriage of several kinds of ordering, so it's a
little difficult to describe clear differentiations based on WordNet
structures. 
 
Does that help -- or only muddle the issues?
 
Bye for now
Sue Ellen
 
 
On 10/19/05, Miles, AJ (Alistair) <A.J.Miles@rl.ac.uk> wrote: 


Hi Mark,

>  From one point of view ("maintenance", "future extensions" or 
> whatever you might call it) the class approach has the advantage that
> you can always attach properties to terms, e.g. properties that might
> turn out to be really useful somewhere in the future (i.e. stuff we
> cannot anticipate now).
>
> Another reason is that Terms get a URI so that they can be referred
> to. In the WordNet TF, this is a motivation to assign URIs to
> WordSenses, instead of using blank nodes. You can then use WordSenses 
> e.g. to annotate texts. Similar uses might be envisioned for
> SKOS terms.

The thing is, I don't think that a class of 'non-preferred terms' in the
thesaurus sense would correspond to the class of WordNet WordSenses.
The WordNet metamodel (is [1] the latest version?) has three main
classes: 'Word' 'WordSense' and 'Synset'.  I think the class wn:Word
(which is a super-class of wn:Collocation) is closest to the notion of a
'non-preferred term', but even that I don't think matches, because a
non-preferred term is always embedded in a thesaurus, and hence
represents a relationship between several entities, whereas a Word is
kind of an entity in its own right ... 
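
A rough Python sketch of the three classes named above may help show why a Word stands on its own while a non-preferred term does not. The class names follow the message. The field names (`lexical_form`, `gloss`, `senses`) and the way a WordSense links one Word to one Synset are assumptions made for illustration, not a transcription of the schema at [1].

```python
from dataclasses import dataclass, field


@dataclass
class Word:
    """A lexical form, an entity in its own right."""
    lexical_form: str


@dataclass
class Collocation(Word):
    """A multi-word form, modeled as a subclass of Word."""


@dataclass
class Synset:
    """A set of word senses that share one meaning."""
    gloss: str
    # Excluded from repr/compare to avoid recursing through back-references.
    senses: list["WordSense"] = field(
        default_factory=list, repr=False, compare=False
    )


@dataclass
class WordSense:
    """One Word used in the meaning of one Synset."""
    word: Word
    synset: Synset

    def __post_init__(self) -> None:
        # Register this sense with its synset.
        self.synset.senses.append(self)


rock = Synset(gloss="a lump of hard consolidated mineral matter")
WordSense(Word("rock"), rock)
WordSense(Word("stone"), rock)
print([s.word.lexical_form for s in rock.senses])
```

In this sketch a `Word` can be created without any synset or vocabulary around it, whereas a non-preferred term only exists as a pointer from one string to another inside a particular thesaurus.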

See how fuzzy things get when we try to work out what a 'term' is?

There are other alternatives to defining a class of non-preferred terms,
such as e.g.

eg:foo a skos:Concept;
   skos:prefLabel 'Foo';
   skos:altLabel 'Bar';
   skos:note [
      rdf:value 'Blah blah.';
      skos:onLabel 'Foo';
   ];
.
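
The pattern in the snippet above, a note attached to the concept but pointing at one of its labels, can be sketched in Python as follows. This is only an illustration: `Note`, `Concept`, `on_label` and `notes_on` are hypothetical names, and `skos:onLabel` appears in the message only as an example of an alternative.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Note:
    value: str
    # Hypothetical counterpart of the proposed skos:onLabel: the label the
    # note is about, or None when the note is about the concept as a whole.
    on_label: Optional[str] = None


@dataclass
class Concept:
    pref_label: str
    alt_labels: list[str] = field(default_factory=list)
    notes: list[Note] = field(default_factory=list)

    def notes_on(self, label: Optional[str]) -> list[str]:
        """Return note texts attached to `label` (None = concept-level)."""
        return [n.value for n in self.notes if n.on_label == label]


foo = Concept(
    "Foo",
    alt_labels=["Bar"],
    notes=[Note("Blah blah.", on_label="Foo")],
)
print(foo.notes_on("Foo"))
print(foo.notes_on("Bar"))
```

Labels stay plain strings here, so no separate class of terms is needed. The trade-off, as raised earlier in the thread, is that a label has no identifier of its own that other data could refer to.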

Cheers for now,

Al.

[1] http://www.cs.vu.nl/~mark/wn/17-10-05/wn.rdfs






-- 
Sue Ellen Wright
Institute for Applied Linguistics
Kent State University
Kent OH 44242 USA
sellenwright@gmail.com
swright@kent.edu
sewright@neo.rr.com 
Received on Wednesday, 19 October 2005 18:15:16 GMT
