
FW: From ACM TechNews 2003-12-08

From: Gregg Vanderheiden <gv@trace.wisc.edu>
Date: Tue, 9 Dec 2003 09:13:57 -0600
To: <w3c-wai-gl@w3.org>
Message-ID: <006d01c3be67$1114ca70$c517a8c0@USD320002X>

*  "Software Paraphrases Sentences"
Technology Research News (12/10/03); Patch, Kimberly 

Cornell University researchers are developing computer programs that can
automatically paraphrase sentences, a capability that could prove useful in
machine translation, technologies to help disabled people, and computer
processing of natural language. The technique borrows from computational
biology and was applied to online journalism. The researchers gathered
stories from Reuters and Agence France-Presse on a single ongoing news
topic, in this case Middle East violence, and used that body of work as
source material for paraphrasing. MIT researcher Regina Barzilay, who
recently worked
on the Cornell project, says sentence patterns were found, as well as key
facts and arguments; these basic elements are similar to the evolutionary
traces common between genes, and are deduced using the same techniques used
in computational biology. The software also exposed journalistic bias: it
sometimes paraphrased "suicide bomber" as "Palestinian suicide bomber" and
assumed that the people killed were Israelis. Barzilay says the
difficult part of developing the program lay in determining which variances
between reports are due to different subjects entirely and which are due to
paraphrasing. The paraphrase software is part of the Columbia News Blaster
project, which aims to automatically summarize news reported online without
human aid, and the next step is to paraphrase entire documents instead of
just sentences. Eventually, the software will be able to paraphrase language
in ways easily understandable to humans, and likewise understand what humans
write or say. The Cornell researchers' project was underwritten by the Sloan
Foundation and the National Science Foundation.
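
The abstract says only that the technique borrows from computational
biology; it does not give the algorithm. As a rough illustration, and
assuming that word-level sequence alignment is the kind of method meant,
the sketch below aligns two sentences the way two gene sequences might be
aligned. The function names, the scoring values, and both example
sentences are invented for this illustration and are not taken from the
Cornell software.

```python
def align_words(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align two word lists (Needleman-Wunsch style).

    Returns a list of (word_from_a, word_from_b) pairs; None marks a gap.
    The scoring values are illustrative assumptions, not tuned parameters.
    """
    n, m = len(a), len(b)

    def sub(i, j):
        # Score for pairing a[i-1] with b[j-1].
        return match if a[i - 1] == b[j - 1] else mismatch

    # score[i][j] = best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # pair the two words
                score[i - 1][j] + gap,            # word in a has no partner
                score[i][j - 1] + gap,            # word in b has no partner
            )

    # Trace back from the bottom-right corner to recover one best alignment.
    pairs = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            pairs.append((a[i - 1], b[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            pairs.append((a[i - 1], None))
            i -= 1
        else:
            pairs.append((None, b[j - 1]))
            j -= 1
    pairs.reverse()
    return pairs


def shared_backbone(pairs):
    """Words that both sentences agree on, in order."""
    return [x for x, y in pairs if x is not None and x == y]


def variable_slots(pairs):
    """Aligned positions where the sentences differ (candidate paraphrases)."""
    return [(x, y) for x, y in pairs if x != y]


# Invented example sentences, for illustration only.
s1 = "the storm closed three airports on monday".split()
s2 = "the storm shut three major airports on monday".split()
pairs = align_words(s1, s2)
print(shared_backbone(pairs))
print(variable_slots(pairs))
```

The shared words play the role of the conserved pattern, and the
positions where the two sentences differ are where interchangeable
wording might be found. A real system would need to align many sentences
at once and decide which differences are genuine paraphrases, which is
the hard part Barzilay describes.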

Received on Tuesday, 9 December 2003 10:20:17 UTC
