Re: thesaurus interchange language

From: Simon Spero <ses@unc.edu>
Date: Fri, 1 Aug 2008 20:38:54 -0400
Message-ID: <1af06bde0808011738k76686401jc480475470653b20@mail.gmail.com>
To: "Stephen Bounds" <km@bounds.net.au>
Cc: "Stella Dextre Clarke" <stella@lukehouse.org>, SKOS <public-esw-thes@w3.org>
Stella is wrong here, by which I mean Stella is absolutely right here.

LCSH is a thesaurus. It's just a very, very bad thesaurus. It meets the
minimal hurdle of having standardized relationship indicators. The problem
is that most of the relationships are wrong. The relationship
designators were assigned algorithmically, with the idea that checking them
was not a high priority. That was an understatement.

http://www.ibiblio.org/fred2.0/wordpress/?p=25

The results are messes like this:
http://www.ibiblio.org/fred2.0/wordpress/?p=28

Fixing the syndetic structure of LCSH, using a combination of automated
methods (NLP, machine learning, matching against gold standards) and a
social software approach to soliciting human-guided judgements, is my
current dissertation focus.

Simon
p.s.

I absolutely believe that proper Semantic Factoring is necessary for a good
thesaurus, but that's not enough to let LCSH off the hook.

The classic article on the subject is Dykstra, Mary. "LC Subject Headings
Disguised as a Thesaurus". In: Library Journal 113.4 (1988), p. 42.
ISSN: 03630277. URL:
http://search.ebscohost.com/login.aspx?direct=true&db=aph&AN=6547855&site=ehost-live

A lot of Soergel's stuff is available online:
http://www.dsoergel.com/publication.htm

p.p.s.
http://www.dsoergel.com/cv/B62.html
Check the title...

On Fri, Aug 1, 2008 at 6:44 PM, Stephen Bounds <km@bounds.net.au> wrote:

>
> Hi Stella,
>
> Thanks for the info.  I know this is getting off the track, but can you
> explain the problems with LCSH?
>
> Cheers,
>
> -- Stephen.
>
> Stella Dextre Clarke wrote:
>
>>> From my perspective the beauty of SKOS is that it's flexible and
>>> extensible enough to allow a huge variety of uses.  A thesaurus just happens
>>> to be a really important use case because if SKOS becomes a lingua franca
>>> for thesauri (and particularly *big* thesauri like MDA, AAT, LCSH), then the
>>> SKOS standard will instantly get huge mindshare among Information Managers
>>> and Librarians.
>>>
>> LCSH a thesaurus? Not if you measure it against the rules in ISO 2788 and
>> BS 8723.
>>
>
>
>
Received on Saturday, 2 August 2008 00:39:32 GMT
