Re: Size estimates of current LS space (and Introductions)

These are issues that the SWAN/Alzforum project team has been  
thinking about for a number of years and that have informed our  
approach to designing SWAN. We believe that in order for community  
curation to succeed, the user community (in our case,  
neurodegenerative disease researchers) has to trust the resource and  
be highly motivated to contribute to it. The Alzforum has achieved  
these goals by establishing high standards of scientific reporting  
and editing by its editorial team, and by developing relationships of  
trust with hundreds of scientists. We have recruited opinion leaders  
of diverse "ideological" stripes to our scientific advisory board,  
and their commentaries posted on the site have helped establish  
Alzforum as an influential forum. This has motivated a wider  
community to also provide commentaries. Resources developed by  
Alzforum, such as the AlzGene database, are viewed as definitive and  
are used to support discovery and publications in the field. Ten  
years after its founding, the Alzforum is cited as part of the  
scientific literature. Scientists are motivated to contribute content  
to the Alzforum because it has proved to be a highly visible venue in  
which to exert influence in the field and stake a claim to new ideas.

A very important ingredient in Alzforum's success is that all  
community postings are vetted by an expert and neutral editorial  
team. We scrub the emotional edges off scientists' critiques and  
screen out inappropriate content, all with an eye to making sure that  
we deliver value to our readers. It's very easy to lose the readers'  
trust if they come across even a few postings that are a waste of  
their time. Could the scientific community perform this vetting and  
editing function itself, à la Wikipedia? We have found that in  
general, this does not work. Many scientists refuse to correct or  
criticize colleagues directly in a public forum.

Our take-home message is that even the greatest curation tools will  
not be adopted unless they deliver immediate value to the scientist  
and are well integrated into the knowledge ecosystem of scientific  
discourse.

June

On Aug 4, 2006, at 6:30 AM, Joanne Luciano wrote:

>
>> I would like to get some feedback on the feasibility of distributed
>> curation.
>
> Me too!
>
>> PIs who have years of experience in managing curation
>> projects are not that enthusiastic about its role. It seems the CS
>> community is all for it, but the actual *users* haven't really bought
>> in.
>
> Are there any lurkers out there who can comment on why?
> If not, do others think it would be worth seeking those users out
> to understand why?  Do the folks at CBioC/ASU have any idea why it's
> not taking off?  Does the word need to be spread, or is there
> something wrong with the tool? The concept? Trust of the data?
>
>> For example, there is a great tool developed by Chitta Baral's
>> group at ASU called CBioC
>>
>> http://cbioc.eas.asu.edu/
>>
>> When you search the PubMed database and display a particular  
>> abstract,
>> CBioC will automatically display the interactions found in the CBioC
>> database related to the abstract you are viewing. If the abstract has
>> not been processed by CBioC before, the automatic extraction system
>> will run "on the fly".
>>
>> CBioC runs as a web browser extension, not as a stand-alone
>> application. When you visit the Entrez (PubMed) web site, CBioC
>> automatically opens within a "web band" at the bottom of the main
>> browser window in either IE or Firefox.
>>
>> To me it appears to be a great tool, something that can actually
>> exploit the wisdom of crowds without much effort required (other
>> than saying yes/no to an automatically extracted interaction).
>>
>> What do others on the list think about such projects?
>
> I agree - and think it would be good to learn more about why it's not
> taking off.
>
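The yes/no vetting step described above is easy to make concrete. Here
is a minimal Python sketch of how such low-effort community curation
might be modeled; the class, field names, PMID, and acceptance
threshold are all hypothetical stand-ins, not CBioC's actual design:

    # Hypothetical model of crowd-vetting for machine-extracted
    # interactions: each extraction accumulates yes/no votes and is
    # promoted once net agreement clears a threshold.
    from dataclasses import dataclass

    @dataclass
    class ExtractedInteraction:
        pmid: str        # source abstract (made-up value below)
        subject: str     # e.g., a protein name
        predicate: str   # e.g., "phosphorylates"
        obj: str
        yes: int = 0
        no: int = 0

        def vote(self, agree: bool) -> None:
            if agree:
                self.yes += 1
            else:
                self.no += 1

        def accepted(self, threshold: int = 3) -> bool:
            return self.yes - self.no >= threshold

    ix = ExtractedInteraction("0000000", "CDK5", "phosphorylates", "tau")
    for agree in (True, True, True):
        ix.vote(agree)
    print(ix.accepted())  # True: three curators agreed, none objected

The point of a threshold is that a single stray click neither accepts
nor kills an extraction; agreement has to accumulate.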
>>> Third, with semantic web technologies such as description
>>> logics and rules, it should be possible to infer when two data
>>> sets are really talking about the same biological object, even
>>> if they use different identifiers to describe the thing.
>>> To that end, I have been working with Alan Ruttenberg and
>>> others at York University, UCSD and SRI to develop an
>>> OWL/Description-logic based method to automate the integration
>>> of two E. coli databases.
>
> And Manchester :-)
>
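For those who have not tried this: a minimal sketch of the kind of
identity inference Jeremy describes, using the Python rdflib and owlrl
packages. All URIs, the xref property, and the shared cross-reference
node are hypothetical stand-ins, not the actual EcoCyc/KEGG schemas.
Declaring a property inverse-functional lets an OWL-RL reasoner
conclude that two records sharing a value for it denote the same
individual:

    # Two records with different identifiers share a cross-reference;
    # the reasoner infers owl:sameAs between them.
    from rdflib import Graph, Namespace
    from rdflib.namespace import OWL, RDF
    import owlrl

    EX = Namespace("http://example.org/")  # hypothetical namespace
    g = Graph()

    # If two things share an xref value, they are the same individual.
    g.add((EX.xref, RDF.type, OWL.InverseFunctionalProperty))
    g.add((EX.ecocyc_RXN123, EX.xref, EX.crossref_77))
    g.add((EX.kegg_R00299, EX.xref, EX.crossref_77))

    # Compute the OWL-RL deductive closure (the prp-ifp rule fires).
    owlrl.DeductiveClosure(owlrl.OWLRL_Semantics).expand(g)

    print((EX.ecocyc_RXN123, OWL.sameAs, EX.kegg_R00299) in g)  # True

The same declarative machinery extends to the richer reaction
descriptions Jeremy mentions; the identifiers never have to agree.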
>> I think with SW technologies it should be possible to go beyond
>> integration. At Stanford, we did a test project for integrating
>> EcoCyc, Reactome, and KEGG using BioPAX to create the Pathway
>> Knowledge Base (PKB) at http://pkb.stanford.edu [it's currently down
>> because we are moving machines].
>
> Can you say more about the problems you ran into and how you  
> resolved them?
> The kind of integration Jeremy is talking about is integration based
> on having descriptions of reactions, for example, such that a
> reasoner can infer from the description that two reactions are the
> same - even if they have different identifiers, or if in the database
> the left and right sides are reversed.
>
> Moreover, the reasoners can be used to find inconsistencies within
> (and across) the databases, an approach explored by Jeremy and Alan
> Ruttenberg.
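To make the reversed-sides point concrete, here is a tiny sketch
(again with made-up data) of the invariant involved: treat each
reaction record as an unordered pair of participant sets, so that
A + B -> C and a reversed entry C -> A + B get the same key. A DL
reasoner achieves this declaratively from the reaction descriptions
rather than through an explicit canonicalization step:

    # Hypothetical records: the same interconversion stored in
    # opposite orientations by two databases.
    ecocyc_rxn = ({"glucose", "ATP"}, {"glucose-6-phosphate", "ADP"})
    kegg_rxn   = ({"glucose-6-phosphate", "ADP"}, {"glucose", "ATP"})

    def canonical(reaction):
        """Key a reaction by the unordered pair of its participant
        sets, so left/right reversal does not change the key."""
        left, right = reaction
        return frozenset({frozenset(left), frozenset(right)})

    print(canonical(ecocyc_rxn) == canonical(kegg_rxn))  # True

And once two records are merged, inconsistency checking gets cheap:
any contradictory (disjoint) assertions the source databases made
about them now sit on a single individual, where a reasoner can flag
them.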
>
> Joanne
>

Received on Thursday, 10 August 2006 20:39:55 UTC