- From: Stella Dextre Clarke <stella@lukehouse.org>
- Date: Wed, 21 Oct 2009 20:31:17 +0100
- To: Thomas Bandholtz <thomas.bandholtz@innoq.com>
- CC: Antoine Isaac <aisaac@few.vu.nl>, SKOS <public-esw-thes@w3.org>
Thomas Bandholtz wrote:
> Secondly, we need this stuff to support automated indexing of full text
> documents. Machine need to be enabled to detect the Concepts behind this
> weird mess of character strings that makes a document (more on this in
> the ecoterm presentation).

Another interesting point. I sometimes hear people complain that ISO 2788-compliant thesauri do not help enough with retrieval from full text of documents that have not been humanly indexed. This is hardly surprising, since they were designed to support retrieval of documents indexed with that same vocabulary. The same is true of BS 8723-2 and the forthcoming ISO 25964-1.

When people want to use a thesaurus for full text retrieval, I sometimes suggest they could improve the results by stripping the qualifiers off the non-preferred terms. But more could be done to enhance the results of that process, by including inflectional forms, term weighting, Boolean expressions, additional less reliable clue-words, etc, and of course dropping the idea of admitting the clue-words as non-preferred synonyms with reciprocal relationships.

I sometimes wonder if a future revised version of BS 8723 or ISO 25964 should include some recommendations to this effect. What do you think?

Stella

*****************************************************
Stella Dextre Clarke
Information Consultant
Luke House, West Hendred, Wantage, OX12 8RR, UK
Tel: 01235-833-298 Fax: 01235-863-298
stella@lukehouse.org
*****************************************************
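
The qualifier-stripping suggestion in the message above can be sketched roughly as follows. This is only an illustration, not a procedure taken from ISO 2788, BS 8723 or ISO 25964. It assumes qualifiers are written as a trailing parenthetical, as in "Mercury (planet)", and the function names `strip_qualifier` and `clue_words` are hypothetical. Inflectional forms, term weighting and Boolean expressions are left out.

```python
import re

# Assumption: a qualifier is a trailing parenthetical, e.g. " (planet)".
_QUALIFIER = re.compile(r"\s*\([^()]*\)\s*$")


def strip_qualifier(term: str) -> str:
    """Return the term without its trailing parenthetical qualifier, if any."""
    return _QUALIFIER.sub("", term).strip()


def clue_words(terms):
    """Map each stripped, lower-cased string to the original terms it came from.

    A key that maps to more than one term has become ambiguous once the
    qualifier is gone, so a full-text matcher would need extra evidence
    to pick a concept for it.
    """
    index = {}
    for term in terms:
        key = strip_qualifier(term).lower()
        index.setdefault(key, set()).add(term)
    return index
```

Calling `clue_words(["Mercury (planet)", "Mercury (metal)", "acid rain"])` groups both "Mercury" terms under the single key `mercury`, which shows why stripped strings are better treated as clue-words for matching than as synonyms with reciprocal relationships.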
Received on Wednesday, 21 October 2009 19:31:47 UTC