Re: Seeking Help with finding an assertion

As a follow-up example, a study was done to estimate the error rate of 
Gene Ontology (GO) annotations:

http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=1892569#id2674403

The study estimated GO term annotation error rates for the GoSeqLite 
database at 13% to 18% for curated non-ISS annotations, 49% for ISS 
annotations, and 28% to 30% for all curated annotations (ISS stands for 
"inferred from sequence similarity"). Despite these findings, the 
authors concluded that GO is a comparatively high-quality source of 
information. Integrating databases that carry significant error rates, 
however, can negatively impact the quality of science.
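
To make the compounding effect concrete, here is a rough sketch in 
Python (my own illustration, not taken from either paper). It assumes 
that errors in individual concepts are independent and that a query 
result is adequate only if every concept involved is correct; the 
function name is made up for the example. It uses the 5% per-concept 
error rate and the 15-concept join from the Robbins article quoted below:

```python
def chance_all_correct(error_rate: float, n_concepts: int) -> float:
    """Probability that none of n_concepts definitions is in error.

    Assumes errors are independent across concepts (an illustrative
    simplification, not a claim made by the cited articles).
    """
    return (1.0 - error_rate) ** n_concepts


# 5% error per concept, query joining across 15 concepts
p = chance_all_correct(0.05, 15)
print(f"{p:.3f}")  # 0.463, i.e. less than a 50% chance
```

Under the same assumptions, the chance drops below 50% once a join 
involves 14 or more concepts.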

-Kei

Kei Cheung wrote:

>
> Hi Karen,
>
> Your questions remind me of the following classic article written by 
> Robert Robbins on "Challenges in the Human Genome Project".
>
> http://www.esp.org/umdnj.pdf
>
> Although it doesn't directly answer the questions, the 
> "Nomenclature Problems" section (p. 20-21) discusses the 
> significant problem of inconsistent knowledge representation. It says 
> that it is a mistake to believe that terminology fluidity is not an 
> issue in biological database design. It also says that many biologists 
> don't realize that, in a database built with 5% error in the 
> definition of individual concepts, a query that joins across 15 
> concepts has a less than 50% chance of returning an adequate answer. 
> The section also points out the importance of formal representation of 
> scientific knowledge in addressing the inconsistency and nomenclature 
> problems. Semantic Web and standard ontologies provide a solution to 
> these database problems. We should not simply convert an existing 
> database syntactically into a semantic web format; we also need to 
> do careful semantic conversion to eliminate as many errors, 
> ambiguities, and inconsistencies as possible in order to reduce the 
> costs of knowledge retrieval and discovery.
>
> -Kei
>
> Skinner, Karen (NIH/NIDA) [E] wrote:
>
>> Recently I read somewhere (on this list, a blog, a news story, 
>> where...?) an assertion that struck me as an interesting passing fact 
>> at the time.   As I recall, it indicated that more websites are 
>> accessed via a search engine than by typing a URL into a browser web 
>> address bar.
>>
>> Alas, I did not save the reference, and now I am looking for the 
>> proverbial needle in a haystack. Namely, what is the exact assertion, 
>> who asserted it, and where did they make it?  If anyone in the world 
>> has this information or knows how to get it, or has related data, 
>> I imagine they would belong to this list. I would be most grateful 
>> for any useful pointer.
>>
>> Along this same vein, if anyone has any statistics, data, anecdotes 
>> or information related to the cost of
>> (1) "friction" arising from inefficient or inappropriate efforts at 
>> information retrieval
>> and
>> (2) the cost of "negative knowledge" about an existing resource or data,
>>
>> these, too, would be helpful.
>>
>> (For example, with respect to #2 above, we are all familiar with 
>> comparison shopping for goods and services. We seek data/information 
>> about prices and quality, but at what point does the expenditure of 
>> that effort exceed the value of the information learned?)
>>
>> I am not looking for examples at the level of a philosophy or 
>> economics Ph.D. thesis, but rather a few examples in the sciences that 
>> can be used at the level of an "elevator speech."
>>
>>
>> Karen Skinner
>> Deputy Director for Science and Technology Development
>> Division of Basic Neuroscience and Behavior Research
>> National Institute on Drug Abuse/NIH
>>
>
>
>

Received on Thursday, 5 July 2007 03:27:49 UTC