Re: BioRDF Brainstorming

From: Matthias Samwald <samwald@gmx.at>
Date: Tue, 12 Feb 2008 22:53:18 +0100
Message-ID: <015DBF3C0760495797E37758B969498A@tessellate>
To: "Matt Williams" <matthew.williams@cancer.org.uk>, "Peter Ansell" <ansell.peter@gmail.com>
Cc: <public-semweb-lifesci@w3.org>, <holger.stenzhorn@deri.org>, <p.roe@qut.edu.au>, <j.hogan@qut.edu.au>

> I'd agree that to capture all the publication might be hard, but to only 
> capture this bit (I suspect the conclusion) wouldn't you need to find the 
> conclusion, and ignore the rest? Using the abstract only might help, but 
> not enough....in any case, there are other bits (e.g. which type of 
> bananas they used) that you might well want to capture.

Automatically capturing only a very selective part of a biomedical article 
(the main results/conclusions) might be even harder than capturing 
everything in an unselective manner. This is why I am more interested in 
elegant hybrid systems, where NLP aids human annotators during the 
annotation process, instead of doing everything from start to finish.

Humans are much better at judging relevance, especially when they are 
annotating their own text/data during submission to a journal or database. 
With a good annotation system that makes use of NLP, existing ontologies and 
auto-suggest features, I would estimate that the main facts of most 
biomedical articles can be annotated in a very rich manner in less than 5-6 
minutes -- not only mentioning entities that appear in the text, but 
actually formulating new facts and relations. If we compare that with the 
time authors need to spend on various other aspects of article submission, 
such as formatting the layout, checking citations etc., it seems realistic 
that authors would be able and willing to create such annotations... 
especially when we can demonstrate that it increases the usefulness and the 
impact of the article/database entry significantly.

Matthias Samwald
Received on Tuesday, 12 February 2008 21:53:49 UTC
