- From: Nigam Shah <nigam@stanford.edu>
- Date: Thu, 3 Aug 2006 17:16:07 -0700
- To: "'Jeremy Zucker'" <zucker@research.dfci.harvard.edu>
- Cc: "'Skinner, Karen (NIH/NIDA) [E]'" <kskinner@nida.nih.gov>, "'Eric Neumann'" <eneumann@teranode.com>, "'public-semweb-lifesci hcls'" <public-semweb-lifesci@w3.org>
Hi Jeremy,

Please see inline comments below.

> The semantic web interests me for several reasons. For one, I
> believe it will be a solid substrate for distributed curation,
> which is a necessary part of the ongoing effort to improve the
> quality of the biological data we use.
> Like wikipedia, we need a way to exploit the wisdom of crowds
> to discover, cross-validate, and annotate the biological data
> that we are currently using.

I would like to get some feedback on the feasibility of distributed curation. PIs who have years of experience in managing curation projects are not that enthusiastic about its role. It seems the CS community is all for it, but the actual *users* haven't really bought in.

For example, there is a great tool developed by Chitta Baral's group at ASU called CBioC: http://cbioc.eas.asu.edu/

When you search the PubMed database and display a particular abstract, CBioC automatically displays the interactions found in the CBioC database that relate to the abstract you are viewing. If the abstract has not been processed by CBioC before, the automatic extraction system runs "on the fly". CBioC runs as a web browser extension, not as a stand-alone application. When you visit the Entrez (PubMed) web site, CBioC automatically opens within a "web band" at the bottom of the main browser window in either IE or Firefox.

To me it appears to be a great tool, something that can actually exploit the wisdom of the crowds without much effort required (other than saying yes/no to an automatically extracted interaction). What do others on the list think about such projects?

> Third, with semantic web technologies such as description
> logics and rules, it should be possible to infer when two data
> sets are really talking about the same biological object, even
> if they use different identifiers to describe the thing.
> To that end, I have been working with Alan Ruttenberg and
> others at York University, UCSD and SRI to develop an
> OWL/Description-logic based method to automate the integration
> of two E. coli databases.

I think with SW technologies it should be possible to go beyond integration. At Stanford, we did a test project integrating EcoCyc, Reactome and KEGG using BioPAX to create the Pathway Knowledge Base (PKB) at http://pkb.stanford.edu [it's currently down because we are moving machines]. Some time back, we also "proofread" Reactome (v.10 to v.14) to find four types of errors. More details at:

http://www.biomedcentral.com/1471-2105/7/196 (A case study in pathway knowledgebase verification)
and
http://www.hybrow.org/Reactome

Now, putting these two projects together, it is possible to see how SW technologies can be leveraged for both automated integration AND proofreading, plus maybe fancier analyses. It would be great if people on the HCLSIG could provide comments/suggestions for such [possible] efforts.

Regards,
Nigam.
Received on Friday, 4 August 2006 00:16:45 UTC