Re: Size estimates of current LS space (and Introductions)

From: Robert Stevens <robert.stevens@manchester.ac.uk>
Date: Fri, 04 Aug 2006 14:12:09 +0100
To: Joanne Luciano <jluciano@predmed.com>,Nigam Shah <nigam@stanford.edu>
Cc: Joanne Luciano <jluciano@predmed.com>, "'Jeremy Zucker'" <zucker@research.dfci.harvard.edu>, "'Skinner, Karen \(\(NIH/NIDA\)\) [E]'" <kskinner@nida.nih.gov>, "'Eric Neumann'" <eneumann@teranode.com>, "'public-semweb-lifesci hcls'" <public-semweb-lifesci@w3.org>

I think there are two issues behind the suspicion of distributed curation:

1. Wanting curation to be good before publication, rather than "it will 
become good eventually with enough people looking at it".
2. The perceived need to control what appears as the community view. 
Feedback from the community is fine, but it has to be filtered by the people 
in control.

I've seen many cases of mistrust in, for instance, mappings from one DB to 
another. Many people would much rather do it themselves than trust someone 
else's mappings. In contrast, however, many of the public resources have a 
degree of trust: I know what I'm getting when I use resource X. So, giving 
up trust is a hard thing to do. SWISS- is trusted, but I think the trust 
would go if people thought there was a lack of rigour.

With community curation, adding in the provenance so that I could, for 
instance, filter out idiots is all possible.
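
To make the provenance point concrete, here is a minimal sketch in Python 
of filtering community annotations by who asserted them. The Annotation 
record, the curator names and the filter_by_curator helper are all 
hypothetical, invented for illustration; no existing curation tool is 
implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Annotation:
    subject: str    # the entity being annotated (hypothetical names)
    statement: str  # the curated claim
    curator: str    # provenance: who asserted the claim


def filter_by_curator(annotations, trusted):
    """Keep only the annotations asserted by a curator in the trusted set."""
    return [a for a in annotations if a.curator in trusted]


annotations = [
    Annotation("geneA", "interacts with geneB", "alice"),
    Annotation("geneA", "interacts with geneC", "mallory"),
    Annotation("geneB", "located in nucleus", "bob"),
]

# Each user chooses their own trusted set; nothing is deleted at the source.
kept = filter_by_curator(annotations, trusted={"alice", "bob"})
```

The filtering happens on the consumer's side, so each user keeps control 
over what they trust without the resource itself having to act as 
gatekeeper.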

At 11:30 04/08/2006, Joanne Luciano wrote:

>>I would like to get some feedback on the feasibility of distributed
>Me too!
>>PIs who have years of experience in managing curation
>>projects are not that enthusiastic about its role. It seems the CS
>>community is all for it but the actual *users* haven't really bought
>Are there any lurkers out there who can comment on why?
>If not, do others think this would be something that would be worth
>seeking out
>to understand why?  Have the folks at CBioC/ASU any idea why it's not
>taking off?  Does the word need to be spread or is there something wrong
>with the tool? The concept? Trust of the data?
>>For example, there is a great tool developed by Chitta Baral's
>>group at ASU called CBioC
>>When you search the PubMed database and display a particular abstract,
>>CBioC will automatically display the interactions found in the CBioC
>>database related to the abstract you are viewing. If the abstract has
>>not been processed by CBioC before, the automatic extraction system
>>will run "on the fly".
>>CBioC runs as a web browser extension, not as a stand-alone
>>application. When you visit the Entrez (PubMed) web site, CBioC
>>automatically opens within a "web band" at the bottom of the main
>>browser window in either IE or Firefox.
>>To me it appears to be a great tool, something that can actually
>>exploit the wisdom of the crowds without much effort required (other
>>than saying yes/no to an automatically extracted interaction).
>>What do others on the list think about such projects?
>I agree - and think it would be good to learn more about why it's not
>taking off.
>>>Third, with semantic web technologies such as description
>>>logics and rules, it should be possible to infer when two data
>>>sets are really talking about the same biological object, even
>>>if they use different identifiers to describe the thing.
>>>To that end, I have been working with Alan Ruttenberg and
>>>others at York University, UCSD and SRI to develop an
>>>OWL/Description-logic based method to automate the integration
>>>of two E. coli databases.
>And Manchester :-)
>>I think with SW technologies it should be possible to go beyond
>>integration. At Stanford, we did a test project for integrating
>>ecocyc, reactome and kegg using BioPAX to create the Pathway Knowledge
>>Base, PKB at http://pkb.stanford.edu [it's currently down coz we are
>>moving machines].
>Can you say more about the problems you ran into and how you resolved
>The kind of integration Jeremy is talking about is integration based on
>having descriptions of reactions, for example, such that a reasoner can
>infer from the description that two reactions are the same - even if
>they have different identifiers, or if in the database the left and
>right side
>are reversed.
>Moreover, the reasoners can be used to find inconsistencies within
>(and across)
>the databases.  Work explored by Jeremy and Alan Ruttenberg.
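
The idea quoted above, matching reactions by their description rather 
than by identifier, can be sketched in plain Python. This is not a 
description-logic reasoner and not the method Jeremy and Alan used; it is 
a simple stand-in that compares direction-independent canonical forms. It 
assumes participants in both databases have already been mapped to shared 
names, and it ignores stoichiometry. The identifiers "a:rxn1", "b:rxn42" 
and "b:rxn43" are made up.

```python
def canonical_form(left, right):
    """Direction-independent key: an unordered pair of participant sets."""
    return frozenset([frozenset(left), frozenset(right)])


def same_reaction(r1, r2):
    """True if two (left, right) reactions have the same participants,
    regardless of which side is written first."""
    return canonical_form(*r1) == canonical_form(*r2)


# Two hypothetical databases holding the same reaction under different
# identifiers, with the sides reversed in the second one.
db_a = {
    "a:rxn1": ({"ATP", "glucose"}, {"ADP", "glucose-6-phosphate"}),
}
db_b = {
    "b:rxn42": ({"ADP", "glucose-6-phosphate"}, {"ATP", "glucose"}),
    "b:rxn43": ({"ATP", "fructose"}, {"ADP", "fructose-6-phosphate"}),
}

# Pair up every reaction in db_a with its equivalents in db_b.
matches = [
    (id_a, id_b)
    for id_a, rxn_a in db_a.items()
    for id_b, rxn_b in db_b.items()
    if same_reaction(rxn_a, rxn_b)
]
```

A real reasoner works from richer class descriptions and can also report 
inconsistencies; the sketch only shows why identifiers and side order 
need not matter once the reaction itself is described.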

Dr. Robert Stevens
Senior Lecturer
School of Computer Science
University of Manchester
Oxford Road
M13 9PL
Received on Friday, 4 August 2006 13:12:40 UTC

This archive was generated by hypermail 2.3.1 : Wednesday, 7 January 2015 14:52:27 UTC