Re: Size estimates of current LS space (and Introductions) from Joanne Luciano on 2006-08-04 (public-semweb-lifesci@w3.org from August 2006)

From: Joanne Luciano <jluciano@predmed.com>
Date: Fri, 4 Aug 2006 06:30:51 -0400
To: Nigam Shah <nigam@stanford.edu>
Cc: Joanne Luciano <jluciano@predmed.com>, "'Jeremy Zucker'" <zucker@research.dfci.harvard.edu>, "'Skinner, Karen (NIH/NIDA) [E]'" <kskinner@nida.nih.gov>, "'Eric Neumann'" <eneumann@teranode.com>, "'public-semweb-lifesci hcls'" <public-semweb-lifesci@w3.org>
Message-Id: <75DE9A5A-3835-46DB-AA4E-25855982635A@predmed.com>

> I would like to get some feedback on the feasibility of distributed
> curation.

Me too!

> PIs who have years of experience in managing curation
> projects are not that enthusiastic about its role. It seems the CS
> community is all for it but the actual *users* haven't really bought
> in.

Are there any lurkers out there who can comment on why?
If not, do others think it would be worth finding out why?  Have the
folks at CBioC/ASU any idea why it's not taking off?  Does the word
need to be spread, or is there something wrong with the tool? The
concept? Trust in the data?

> For example, there is a great tool developed by Chitta Baral's
> group at ASU called CBioC
>
> http://cbioc.eas.asu.edu/
>
> When you search the PubMed database and display a particular abstract,
> CBioC will automatically display the interactions found in the CBioC
> database related to the abstract you are viewing. If the abstract has
> not been processed by CBioC before, the automatic extraction system
> will run "on the fly".
>
> CBioC runs as a web browser extension, not as a stand-alone
> application. When you visit the Entrez (PubMed) web site, CBioC
> automatically opens within a "web band" at the bottom of the main
> browser window in either IE or Firefox.
>
> To me it appears to be a great tool, something that can actually
> exploit the wisdom of the crowds without much effort required (other
> than saying yes/no to an automatically extracted interaction).
>
> What do others on the list think about such projects?

I agree - and think it would be good to learn more about why it's not
taking off.

>> Third, with semantic web technologies such as description
>> logics and rules, it should be possible to infer when two data
>> sets are really talking about the same biological object, even
>> if they use different identifiers to describe the thing.
>> To that end, I have been working with Alan Ruttenberg and
>> others at York University, UCSD and SRI to develop an
>> OWL/Description-logic based method to automate the integration
>> of two E. coli databases.

And Manchester :-)

> I think with SW technologies it should be possible to go beyond
> integration. At Stanford, we did a test project for integrating
> EcoCyc, Reactome and KEGG using BioPAX to create the Pathway Knowledge
> Base, PKB at http://pkb.stanford.edu [it's currently down because we
> are moving machines].

Can you say more about the problems you ran into and how you resolved
them?

The kind of integration Jeremy is talking about is based on having
descriptions of reactions, for example, such that a reasoner can infer
from the descriptions that two reactions are the same, even if they have
different identifiers or if the left and right sides are reversed in
the database.
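
To make that concrete, here is a rough sketch in plain Python. It is not
the OWL/description-logic method itself; it only illustrates the idea of
deciding sameness from the description rather than from the identifier.
The identifiers and the helper names (make_reaction, signature,
same_reaction) are made up for illustration, and the sketch assumes the
compound names have already been mapped to a shared vocabulary and
ignores stoichiometry.

```python
from typing import FrozenSet, Iterable, NamedTuple


class Reaction(NamedTuple):
    identifier: str          # database-specific ID (hypothetical values below)
    left: FrozenSet[str]     # compounds on the left side
    right: FrozenSet[str]    # compounds on the right side


def make_reaction(identifier: str, left: Iterable[str], right: Iterable[str]) -> Reaction:
    """Build a reaction whose sides are unordered sets of compound names."""
    return Reaction(identifier, frozenset(left), frozenset(right))


def signature(reaction: Reaction) -> FrozenSet[FrozenSet[str]]:
    """Describe a reaction by its two sides, ignoring which side is 'left'.

    The identifier is deliberately left out, so two records with
    different IDs but the same participants get the same signature.
    """
    return frozenset({reaction.left, reaction.right})


def same_reaction(a: Reaction, b: Reaction) -> bool:
    """True if the two records describe the same reaction."""
    return signature(a) == signature(b)


# Two records with different (hypothetical) identifiers, written in
# opposite directions, plus one genuinely different reaction.
rxn_a = make_reaction("DB1:RXN-1", ["glucose", "ATP"], ["glucose-6-phosphate", "ADP"])
rxn_b = make_reaction("DB2:R0001", ["ADP", "glucose-6-phosphate"], ["ATP", "glucose"])
rxn_c = make_reaction("DB2:R0002", ["glucose", "ATP"], ["fructose-6-phosphate", "ADP"])

print(same_reaction(rxn_a, rxn_b))  # True
print(same_reaction(rxn_a, rxn_c))  # False
```

A real reasoner works from class definitions in the ontology and can
draw this kind of conclusion without a hand-written comparison function;
the set comparison above just stands in for that inference.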

Moreover, the reasoners can be used to find inconsistencies within (and
across) the databases.  This is work that Jeremy and Alan Ruttenberg
have explored.

Joanne

Received on Friday, 4 August 2006 10:50:56 UTC