- From: Harvey Bingham <hbingham@acm.org>
- Date: Tue, 05 Oct 2004 09:36:41 -0400
- To: "WAI-EO" <w3c-wai-eo@w3.org>
Automatic Glossary Extraction: Beyond Terminology Identification
Youngja Park, Roy J Byrd and Branimir K Boguraev
IBM Thomas J. Watson Research Center
{pyoungja, roybyrd, bkb}@us.ibm.com
http://www.alphaworks.ibm.com/g/g.nsf/img/semanticsdocs/$file/glossaryext.pdf

Has some interesting techniques.

Abstract

"This paper describes a method for automatically extracting domain-specific glossaries from large document collections. We show that, compared with current text analysis methods for extracting technical terminology from text, our extracted glossaries more successfully support applications requiring knowledge of domain concepts. After presenting our methods, we illustrate our output of GlossEx, our glossary extraction tool, and present an informal evaluation of its performance."

Regards/Harvey
Received on Tuesday, 5 October 2004 22:53:54 UTC