- From: Michael Liebman <m.liebman@strategicmedicine.com>
- Date: Wed, 26 May 2010 14:19:40 -0400
- To: "'mdmiller'" <mdmiller53@comcast.net>, "'Kei Cheung'" <kei.cheung@yale.edu>
- Cc: "'HCLS'" <public-semweb-lifesci@w3.org>
I usually monitor this group and don't contribute, but seeing the recent exchanges about gene expression, I feel a need to put things into a better perspective than the one currently being shared.

My experience comes from many years overseeing bioinformatics (and gene expression, proteomics and clinical data) at Wyeth, Roche, UPenn and with a DOD center at Windber.

There appear to be several issues that are not being realistically addressed in the current discussion:

1. There is significant experimental variability across individual studies, published or not, because of variation in tissue/cell handling/storage/preparation, experimental variability in the experiment and significant variability in the data analysis, i.e. inter-lab experimental reproducibility is poor and even intra-lab reproducibility can be a major challenge.

2. The measurement that is usually referred to as "up or down gene expression/regulation" refers to the comparison between 2 experiments (sample under 2 different conditions) but typically does not adequately correct for individual experimental variability other than by "simple" scaling. We have shown that this is inadequate.

3. Leaving the interpretation to the author is significantly limited, as it tends to reflect the bias of the author to "observe/confirm" what they are looking for in many of these studies, i.e. a biostatistician will tell you that these experiments are extremely under-powered to reveal the true statistically significant results they would like to achieve.

4. Human nature looks to favor the "big differences" as being most significant. Unfortunately, nature doesn't work this way: many of the largest differences are not functionally relevant but reflect the fact that biological control of these specific genes may not be critical to function, so large variability can be observed and should not always be interpreted as being most significant. In fact, we have developed analytical methods to look at large libraries of gene expression studies and evaluate the overall stability/variability of individual genes (and probes) to establish a significance in difference between states based on how much variation should be expected vs how much is observed, especially in genes that show extremely small levels of expression overall and which would not be considered by typical approaches to data analysis.

Sorry to interrupt the exchange, but I believe it is critical, when considering the development of systems to represent, store, exchange and model data, to understand the specifics and uniqueness of the underlying data and analytical approaches beyond simple statistics.

Michael

Michael N. Liebman, PhD
President/Managing Director
Strategic Medicine, Inc
231 Deepdale Drive
Kennett Square, PA 19348
(814) 659 5450 mobile
m.liebman@strategicmedicine.com
www.strategicmedicine.com

-----Original Message-----
From: public-semweb-lifesci-request@w3.org [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of mdmiller
Sent: Wednesday, May 26, 2010 1:47 PM
To: Kei Cheung
Cc: HCLS
Subject: Re: BioRDF Telcon

hi kei,

> Just want to clarify that what I meant was that it might be beyond the
> scope of our use case to accurately, comprehensively, and precisely define
> what gene expression really means given the degree of complexity involved.

exactly, i believe we can trust the authors of the gene expression papers and the journals themselves for this

cheers,
michael

----- Original Message -----
From: "Kei Cheung" <kei.cheung@yale.edu>
To: "mdmiller" <mdmiller53@comcast.net>
Cc: "HCLS" <public-semweb-lifesci@w3.org>
Sent: Wednesday, May 26, 2010 7:23 AM
Subject: Re: BioRDF Telcon

> Hi Michael,
>
> mdmiller wrote:
>> hi kei,
>>
>>> What do we mean by differentially expressed genes? One definition is
>>> that differentially expressed genes are genes with significantly
>>> different expression in two samples/conditions/experimental
>>> factors/dimensions (e.g., treated vs. untreated, disease vs. normal,
>>> time point 1 vs. time point 2) of microarray experiments.
>>
>> yes, this was my meaning.
>>
>> this is to differentiate between a gene that is always expressed under
>> normal conditions because it is part of an essential pathway that is
>> always running, that gene is only interesting if its expression level
>> changes--similarly for a normally unexpressed gene.
>
> Thanks for confirming. A consensus definition (even if it's broad) is
> important to our gene list representation. There are a variety of methods
> (e.g., statistical tests) that can be used to identify a list of
> differentially expressed genes in two different groups. That's Scott's
> point about the importance of capturing as part of the genelist context
> what methods have been used for detecting differentially expressed genes.
> I hope the use case can help convince the community of the need/use of a
> common vocabulary for describing such methods.
>>
>>> How to measure or infer gene expression (e.g., from mRNA) is a whole
>>> complex question that may be beyond the scope of our use case.
>>
>> yes, which i think was scott's point in his reply. in fact, for the
>> BioRDF use case, initially at least, it is probably sufficient that the
>> authors of the paper state that a gene is part of the significant gene
>> list.
> Just want to clarify that what I meant was that it might be beyond the
> scope of our use case to accurately, comprehensively, and precisely define
> what gene expression really means given the degree of complexity involved.
>
> Cheers,
>
> -Kei
>>
>> cheers,
>> michael
>>
>>
>> ----- Original Message -----
>> From: "Kei Cheung" <kei.cheung@yale.edu>
>> To: "mdmiller" <mdmiller53@comcast.net>
>> Cc: "M. Scott Marshall" <marshall@science.uva.nl>; "HCLS"
>> <public-semweb-lifesci@w3.org>
>> Sent: Tuesday, May 25, 2010 8:52 PM
>> Subject: Re: BioRDF Telcon
>>
>>
>>> Hi Michael et al,
>>>
>>> What do we mean by differentially expressed genes? One definition is
>>> that differentially expressed genes are genes with significantly
>>> different expression in two samples/conditions/experimental
>>> factors/dimensions (e.g., treated vs. untreated, disease vs. normal,
>>> time point 1 vs. time point 2) of microarray experiments.
>>>
>>> How to measure or infer gene expression (e.g., from mRNA) is a whole
>>> complex question that may be beyond the scope of our use case.
>>>
>>> Cheers,
>>>
>>> -Kei
>>>
>>> mdmiller wrote:
>>>
>>>> hi scott,
>>>>
>>>> i think you, jim and lena are doing a great job moving the technical
>>>> aspect of this work forward. i'm looking forward to seeing the end
>>>> results.
>>>>
>>>> cheers,
>>>> michael
>>>>
>>>> ----- Original Message -----
>>>> From: "M. Scott Marshall" <marshall@science.uva.nl>
>>>> To: "mdmiller" <mdmiller53@comcast.net>
>>>> Cc: "Kei Cheung" <kei.cheung@yale.edu>; "HCLS"
>>>> <public-semweb-lifesci@w3.org>
>>>> Sent: Tuesday, May 25, 2010 10:21 AM
>>>> Subject: Re: BioRDF Telcon
>>>>
>>>>
>>>>> Hi Michael,
>>>>>
>>>>> Thanks for the clarification. I also explained those concepts during
>>>>> the BioRDF teleconference but it is difficult for the scribe to
>>>>> capture such details accurately from a phone conversation. Just
>>>>> knowing that a gene has changed (either up or down) already gives us
>>>>> something to work with. Since we started with the microarray use case,
>>>>> we have aimed to focus on the list of differentially expressed genes
>>>>> as our entry point into related molecular information, phenotypes,
>>>>> pathways, diseases, etc.
>>>>>
>>>>> In addition to the gene list and experimental factors, there is some
>>>>> data provenance information that characterizes the origins of the gene
>>>>> list, such as the type of significance analysis or technique that was
>>>>> performed (ANOVA, LIMMA, ..) and p-value cutoff for the list discussed
>>>>> in the associated article(s), software packages used (specific R
>>>>> package from BioConductor, GeneSpring, NextBio, ..). It would be handy
>>>>> if there was a common vocabulary for this type of information (URI's
>>>>> for statistical techniques and software packages). I think that some
>>>>> related resources have been described by myGrid/myExperiment. However,
>>>>> lacking a complete vocabulary, it is still possible to make use of the
>>>>> gene list without such a fine grained description of its provenance.
>>>>>
>>>>> Cheers,
>>>>> Scott
>>>>>
>>>>> On Tue, May 25, 2010 at 9:35 AM, mdmiller <mdmiller53@comcast.net> wrote:
>>>>>
>>>>>> hi all,
>>>>>>
>>>>>> sorry i ended up not being able to make the call.
>>>>>>
>>>>>> "P value
>>>>>> The probability (ranging from zero to one) that the results observed in a
>>>>>> study could have occurred by chance if the null hypothesis was true. A P
>>>>>> value of ≤ 0.05 is often used as a threshold to indicate statistical
>>>>>> significance." (1)
>>>>>>
>>>>>> the exact meaning of p-value depends on what is being measured.
>>>>>>
>>>>>> also, sometimes it isn't so important that a gene is up or down regulated
>>>>>> but whether its expression changes from up or down regulated over the
>>>>>> experimental factors, e.g. if you increase the dose of the drug do the
>>>>>> target genes go from non-expressed to up regulated.
>>>>>>
>>>>>> cheers,
>>>>>> michael
>>>>>>
>>>>>> 1)
>>>>>> http://www.ncbi.nlm.nih.gov/bookshelf/br.fcgi?book=antiepi&part=appendixes.app2
>>>>>>
>>>>>> ----- Original Message -----
>>>>>> From: "Kei Cheung" <kei.cheung@yale.edu>
>>>>>> To: "HCLS" <public-semweb-lifesci@w3.org>
>>>>>> Sent: Monday, May 24, 2010 11:40 AM
>>>>>> Subject: Re: BioRDF Telcon
>>>>>>
>>>>>>
>>>>>>> Today's minutes are available at:
>>>>>>>
>>>>>>> http://esw.w3.org/HCLSIG_BioRDF_Subgroup/Meetings/2010/05-24_Conference_Call
>>>>>>>
>>>>>>> Thanks to Matthias for scribing.
>>>>>>>
>>>>>>> Cheers,
>>>>>>>
>>>>>>> -Kei
>>>>>>>
>>>>>>> mdmiller wrote:
>>>>>>>
>>>>>>>> hi kei,
>>>>>>>>
>>>>>>>> look forward to joining the call,
>>>>>>>> michael
>>>>>>>>
>>>>>>>> ----- Original Message -----
>>>>>>>> From: "Kei Cheung" <kei.cheung@yale.edu>
>>>>>>>> To: "mdmiller" <mdmiller53@comcast.net>; "HCLS"
>>>>>>>> <public-semweb-lifesci@w3.org>
>>>>>>>> Sent: Saturday, May 22, 2010 12:10 PM
>>>>>>>> Subject: Re: BioRDF Telcon
>>>>>>>>
>>>>>>>>
>>>>>>>>> Hi Michael,
>>>>>>>>>
>>>>>>>>> Yes, May 24 was what I meant. It was a typo.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> -Kei
>>>>>>>>>
>>>>>>>>> mdmiller wrote:
>>>>>>>>>
>>>>>>>>>> hi kei,
>>>>>>>>>>
>>>>>>>>>> do you mean monday (may 24)?
>>>>>>>>>>
>>>>>>>>>> cheers,
>>>>>>>>>> michael
>>>>>>>>>>
>>>>>>>>>> ----- Original Message -----
>>>>>>>>>> From: "Kei Cheung" <kei.cheung@yale.edu>
>>>>>>>>>> To: "JunZhao" <jun.zhao@zoo.ox.ac.uk>
>>>>>>>>>> Cc: <public-semweb-lifesci@w3.org>
>>>>>>>>>> Sent: Friday, May 21, 2010 2:28 PM
>>>>>>>>>> Subject: Re: BioRDF Telcon
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Since there were only Jun and Scott who attended the last BioRDF call
>>>>>>>>>>> (I was not able to attend due to some emergency meetings), we decided to
>>>>>>>>>>> have the next BioRDF call on the coming Monday (May 21) at 11 am
>>>>>>>>>>> (EDT). The agenda will be the same (see below).
>>>>>>>>>>>
>>>>>>>>>>> Cheers,
>>>>>>>>>>>
>>>>>>>>>>> -Kei
>>>>>>>>>>>
>>>>>>>>>>> JunZhao wrote:
>>>>>>>>>>>
>>>>>>>>>>>> This is a reminder that the next BioRDF telcon call will be held at
>>>>>>>>>>>> 11 am EDT (4 pm CET) on Monday, May 17 (see details below).
>>>>>>>>>>>>
>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>
>>>>>>>>>>>> -Jun
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> == Conference Details ==
>>>>>>>>>>>> * Date of Call: Monday, May 17, 2010
>>>>>>>>>>>> * Time of Call: 11:00 am Eastern Time (4 pm CET)
>>>>>>>>>>>> * Dial-In #: +1.617.761.6200 (Cambridge, MA)
>>>>>>>>>>>> * Dial-In #: +33.4.89.06.34.99 (Nice, France)
>>>>>>>>>>>> * Dial-In #: +44.117.370.6152 (Bristol, UK)
>>>>>>>>>>>> * Participant Access Code: 4257 ("HCLS")
>>>>>>>>>>>> * IRC Channel: irc.w3.org port 6665 channel #HCLS (see W3C IRC page for
>>>>>>>>>>>> details, or see Web IRC), Quick Start: Use
>>>>>>>>>>>> http://www.mibbit.com/chat/?server=irc.w3.org:6665&channel=%23hcls
>>>>>>>>>>>> for IRC access.
>>>>>>>>>>>> * Duration: ~1 hour
>>>>>>>>>>>> * Frequency: bi-weekly
>>>>>>>>>>>> * Convener: Jun
>>>>>>>>>>>> * Scribe: to-be-determined
>>>>>>>>>>>>
>>>>>>>>>>>> ==Agenda==
>>>>>>>>>>>> * Introduction
>>>>>>>>>>>> * Gene list RDF representation
>>>>>>>>>>>> * iPhone demo
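
The point made in the top message about judging a difference against how much variation a gene normally shows can be illustrated with a small Python sketch. This is not the analytical method the message mentions, which the thread does not describe. It only assumes that each gene has a set of reference expression values from unrelated studies, and it scales an observed difference by the spread of those values. The gene names, the numbers, and the function name `variability_scaled_difference` are all hypothetical.

```python
from statistics import stdev


def variability_scaled_difference(observed_diff, reference_values):
    """Divide an observed log2 difference by the sample standard deviation
    of the same gene's values across a reference library of studies."""
    spread = stdev(reference_values)
    if spread == 0:
        raise ValueError("reference values show no variation")
    return observed_diff / spread


# Hypothetical log2 expression values for two genes across unrelated studies.
reference_library = {
    "GENE_A": [5.0, 9.0, 3.0, 11.0, 7.0],  # loosely controlled, highly variable
    "GENE_B": [2.0, 2.1, 1.9, 2.0, 2.0],   # low expression, tightly controlled
}

# Hypothetical observed differences between two conditions in a new study.
observed = {"GENE_A": 3.0, "GENE_B": 0.5}

scores = {
    gene: variability_scaled_difference(observed[gene], reference_library[gene])
    for gene in observed
}

# Rank genes by the scaled difference rather than by the raw difference.
for gene, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{gene}: raw={observed[gene]:.2f} scaled={score:.2f}")
```

With these made-up numbers, GENE_A has the larger raw difference but GENE_B ranks higher once each difference is scaled by the gene's own expected variability, which is the kind of reordering the message argues simple fold-change comparisons miss.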
Received on Thursday, 27 May 2010 06:52:47 UTC