Re: Differentially Expressed Gene Lists

Hi Misha and All,

I believe there is a complementary relationship between paper-based gene 
lists and the gene lists produced by Atlas. Different methods may 
produce different lists of genes (there should be some overlap though) 
for the same experiment. It's good to consider the possibility of 
integrating GeneSigDB and Atlas. I also believe SDA is complementary to 
the paper-based approach and database approach. It is a 
practice/framework recommended for authors/editors to enter things like 
gene lists in a machine-readable format. FEBS Letters (published by 
Elsevier) has experimented with SDA. That's why I wonder if a similar 
experiment can be done for gene lists.
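
A minimal sketch of what "machine-readable" could mean for a gene list, 
assuming a made-up vocabulary: the example.org namespace and the 
describedIn/hasGene property names are hypothetical placeholders, not 
part of SDA, Atlas or GeneSigDB. It emits N-Triples for the four 
validated genes named in the PubMed abstract quoted further down this 
thread.

```python
# Hypothetical vocabulary: the example.org namespace and the property names
# describedIn / hasGene are placeholders, not an existing ontology.
EX = "http://example.org/genelist#"
PUBMED = "http://www.ncbi.nlm.nih.gov/pubmed/"


def gene_list_triples(list_id, pubmed_id, gene_names):
    """Return N-Triples lines linking a gene list to its paper and its genes."""
    subject = f"<{EX}{list_id}>"
    lines = [f"{subject} <{EX}describedIn> <{PUBMED}{pubmed_id}> ."]
    for name in gene_names:
        # Escape backslashes and double quotes so the literal stays valid.
        escaped = name.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'{subject} <{EX}hasGene> "{escaped}" .')
    return lines


# The four validated genes mentioned in the abstract of PubMed 16242812.
validated_genes = [
    "apolipoprotein J",
    "interleukin-1 receptor-associated kinase 1",
    "tissue inhibitor of metalloproteinase 3",
    "casein kinase 2, beta",
]

triples = gene_list_triples("list1", "16242812", validated_genes)
print("\n".join(triples))
```

Gene names are kept as plain literals here only for brevity; a real 
representation would more likely point at database identifiers.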

Cheers

-Kei

Misha Kapushesky wrote:
> Hi,
>
> Just to add to what Michael said, I will be in Boston around Christmas
> and one of the goals of this visit is to talk to the GeneSigDB guys
> about integrating it somehow with our Atlas, as gene list/signature
> curation is something a lot of our users demand frequently.
>
> Cheers,
>
> Misha
>
> On Mon, Nov 23, 2009 at 9:08 PM, Kei Cheung <kei.cheung@yale.edu> wrote:
>   
>> Since there seems to be an interest in exploring how to represent
>> differentially expressed gene lists, I'm cc'ing this to the HCLS list (Scott
>> also suggested doing this). In addition to Michael's suggestion (see below),
>> the following is a reference to the Structured Digital Abstract (SDA) for
>> those who are interested.
>>
>> http://www.nature.com/nature/journal/v447/n7141/full/447142a.html
>>
>> Cheers,
>>
>> -Kei
>>
>> mdmiller wrote:
>>     
>>> hi all,
>>>
>>> SDA does sound useful but it still has the hurdle that the writers of the
>>> abstract need to be aware of such structures and care to use them, or someone
>>> else needs to add such structure afterwards.
>>>
>>> as i mentioned at the F2F, there is GeneSigDB [1] hosted at dana farber,
>>> which is curating gene lists from papers.  this is very useful in that it
>>> pulls out into a structured database the genes of interest from papers but
>>> has the limitation that it doesn't provide the signatures of interest.
>>>
>>> there is also the matter of the algorithm that, given a new gene
>>> expression profile, provides a score as to inclusion or exclusion from the
>>> biomarker. i'm not a bioinformaticist but my understanding is that such
>>> algorithms may depend on more than just the signatures; they might depend on
>>> the values of some higher level algorithm, i.e. there may not be a
>>> one-size-fits-all algorithm to determine membership.
>>>
>>> i also won't be able to make the call.
>>>
>>> cheers,
>>> michael
>>>
>>> [1]: http://compbio.dfci.harvard.edu/genesigdb/
>>>
>>> Michael Miller
>>> mdmiller53@comcast.net
>>>
>>> ----- Original Message ----- From: "Helen Parkinson" <parkinson@ebi.ac.uk>
>>> To: "Kei Cheung" <kei.cheung@yale.edu>
>>> Cc: <marshall@science.uva.nl>; "mdmiller" <mdmiller53@comcast.net>; "Misha
>>> Kapushesky" <ostolop@ebi.ac.uk>; "Eric Prud'hommeaux" <eric@w3.org>;
>>> "Matthias Samwald" <samwald@gmx.at>
>>> Sent: Monday, November 23, 2009 5:58 AM
>>> Subject: Re: Differentially Expressed Gene Lists
>>>
>>>
>>>       
>>>> I should have read the spec for the SDA. Sounds useful.
>>>>
>>>> Kei Cheung wrote:
>>>>         
>>>>> Helen Parkinson wrote:
>>>>>
>>>>>           
>>>>>>> I've been wondering how structured digital abstract (SDA) can be
>>>>>>> applied to gene lists. There has been a pilot application of SDA by FEBS in
>>>>>>> terms of interactions with links to MINT. Perhaps, a similar thing can be
>>>>>>> done for gene lists with links to databases like ArrayExpress and GEO. Mark
>>>>>>> Gerstein and I are working on a paper describing some extension of SDA
>>>>>>> (Matthias is also a co-author).
>>>>>>>
>>>>>>>               
>>>>>> Gene lists don't really appear in abstracts either; often, one or two
>>>>>> genes of interest are mentioned. I'd be happy to take a look at a subset of
>>>>>> papers though to confirm/deny this, if that would help.
>>>>>>             
>>>>> Many people mistakenly assume that the structured digital abstract applies
>>>>> only to abstracts (I know the name is unfortunately misleading). It can also be
>>>>> applied to representing key findings described in the paper. The abstract I'm
>>>>> looking at is the following:
>>>>>
>>>>> http://www.ncbi.nlm.nih.gov/pubmed/16242812
>>>>>
>>>>> It mentions explicitly that one of the key findings is a differentially
>>>>> expressed gene list of 225 genes across different conditions. Four of the
>>>>> validated genes are also mentioned in the abstract:
>>>>>
>>>>> apolipoprotein J, interleukin-1 receptor-associated kinase 1, tissue
>>>>> inhibitor of metalloproteinase 3, and casein kinase 2, beta.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>           
>>>>>>>> The job of creating an OWL/RDF representation of microarray
>>>>>>>> experiment results should *start* with differentially expressed genes and
>>>>>>>> experimental conditions. I would hope that it doesn't start with text mining
>>>>>>>> ill-defined text sources (such as figures). On the bright side, if the
>>>>>>>> experiment of interest is included in Gene Expression Atlas, we are all set,
>>>>>>>> right? We just need a way to export the RDF.
>>>>>>>>                 
>>>>>>>
>>>>>>> I'm planning to give a short presentation including an example during
>>>>>>> the BioRDF call tomorrow.
>>>>>>>               
>>>>>> I'll miss this unfortunately, as will Misha. Yes, if the experiment is
>>>>>> in Atlas we can now export rdf and we have a prototype working.
>>>>>>             
>>>>> Look forward to seeing the prototype. SDA can be seen as a bridge between
>>>>> databases and literature. It fits both bottom-up and top-down approaches.
>>>>>
>>>>>           
>>>>>>>> BTW, I've pasted Kei's reminder below for the next BioRDF call
>>>>>>>> (tomorrow). It would be great if you could call in and help us to figure
>>>>>>>> this out.
>>>>>>>>                 
>>>>>> Apologies, I have 30 days holiday to use before the end of the year, so
>>>>>> I won't make this.
>>>>>>             
>>>>> Have a good vacation.
>>>>>
>>>>> -Kei
>>>>>
>>>>>           
>>>>>>>> Cheers,
>>>>>>>> Scott
>>>>>>>>
>>>>>>>> n.b. Does anybody mind if I post this to the mailing list?
>>>>>>>>                 
>>>>>>> It's fine with me if people think nothing is controversial or premature
>>>>>>> here. :-)
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> -Kei
>>>>>>>
>>>>>>>               
>>>>>>>> -------------------------------------------------------------------------
>>>>>>>>
>>>>>>>>
>>>>>>>> This is a reminder that the next BioRDF telcon call will be held at
>>>>>>>> 11 am EST (5 pm CET) on Monday, November 23 (see details below).
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>>
>>>>>>>> -Kei
>>>>>>>>
>>>>>>>> == Conference Details ==
>>>>>>>> * Date of Call: Monday November 23, 2009
>>>>>>>> * Time of Call: 11:00 am Eastern Time
>>>>>>>> * Dial-In #: +1.617.761.6200 (Cambridge, MA)
>>>>>>>> * Dial-In #: +33.4.89.06.34.99 (Nice, France)
>>>>>>>> * Dial-In #: +44.117.370.6152 (Bristol, UK)
>>>>>>>> * Participant Access Code: 4257 ("HCLS")
>>>>>>>> * IRC Channel: irc.w3.org port 6665 channel #HCLS (see W3C IRC page
>>>>>>>> for details, or see Web IRC), Quick Start: Use
>>>>>>>> http://www.mibbit.com/chat/?server=irc.w3.org:6665&channel=%23hcls for IRC
>>>>>>>> access.
>>>>>>>> * Duration: ~1 hour
>>>>>>>> * Frequency: bi-weekly
>>>>>>>> * Convener: Kei Cheung
>>>>>>>> * Scribe: to-be-determined
>>>>>>>>
>>>>>>>> == Agenda ==
>>>>>>>> * Roll call & introduction (Kei)
>>>>>>>> * RDF representation of microarray experiment and data (All)
>>>>>>>> * Provenance and workflow (All)
>>>>>>>>
>>>>>>>>                 
>>>>>>>>> mdmiller wrote:
>>>>>>>>>
>>>>>>>>>                   
>>>>>>>>>> hi scott,
>>>>>>>>>>
>>>>>>>>>> typically it will not, the data will be the raw data, usually in
>>>>>>>>>> the format from the feature extractor at the feature/spot level and then the
>>>>>>>>>> error corrected and normalized data at the sequence/gene/transcript level.
>>>>>>>>>>  one could include the final gene list in the MAGE-TAB by adding the
>>>>>>>>>> appropriate columns to the MAGE-TAB and have the Derived Data File column
>>>>>>>>>> contain the name of the appropriate file but in general the gene list is
>>>>>>>>>> only in the paper, often as supplemental data.  in fact i can't think of a
>>>>>>>>>> single case where it is included, helen, can you?
>>>>>>>>>>
>>>>>>>>>> cheers,
>>>>>>>>>> michael
>>>>>>>>>>
>>>>>>>>>> ----- Original Message ----- From: "M. Scott Marshall"
>>>>>>>>>> <marshall@science.uva.nl>
>>>>>>>>>> To: <mdmiller53@comcast.net>
>>>>>>>>>> Cc: "kc28" <kei.cheung@yale.edu>
>>>>>>>>>> Sent: Friday, November 13, 2009 6:50 PM
>>>>>>>>>> Subject: BioRDF
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>                     
>>>>>>>>>>> Hi Michael,
>>>>>>>>>>>
>>>>>>>>>>> One of the things that has come up during the BioRDF call is "we
>>>>>>>>>>> need to confirm whether MAGE-TAB contains gene lists or not". In other
>>>>>>>>>>> words, we will need to represent gene lists in RDF.
>>>>>>>>>>>
>>>>>>>>>>> Ok, my quick interpretation between other things,
>>>>>>>>>>> Scott
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> M. Scott Marshall
>>>>>>>>>>> Leiden University Medical Center / University of Amsterdam
>>>>>>>>>>> http://staff.science.uva.nl/~marshall
>>>>>>>>>>>                       
>>>>>>>>>>                     
>>>>>>>>                 
>>>> --
>>>> Helen Parkinson, PhD
>>>> ArrayExpress Production Coordinator,
>>>> Microarray Informatics Team, EBI
>>>>
>>>> EBI 01223 494672
>>>> Skype: helen.parkinson.ebi
>>>>
>>>>
>>>>         
>>>       
>>
>>     
>
>   

Received on Tuesday, 24 November 2009 20:17:31 UTC