- From: Michael Miller <Michael.Miller@systemsbiology.org>
- Date: Tue, 29 Nov 2011 15:43:44 -0800
- To: Chris Mungall <cjmungall@lbl.gov>, "M. Scott Marshall" <mscottmarshall@gmail.com>
- Cc: expressionrdf@googlegroups.com, HCLS <public-semweb-lifesci@w3.org>
- Message-ID: <aa6b9014a4b4e8452f15d7268c7ba6e4@mail.gmail.com>
hi all,

here's what i found as quantitation types from the TCGA MAGE files for next-gen seq, from the DESCRIPTION.txt file (from C:\data\nci\2011_11_28_tcga\blca\cgcc\bcgsc.ca\illuminahiseq_mirnase\mirnaseq\bcgsc.ca_BLCA.IlluminaHiSeq_miRNASeq.Level_3.1.0.0):

The .mirna.quantification.txt data file describing summed expression for each miRNA is as follows:

- miRNA name
- raw read count
- reads per million miRNA reads
- cross-mapped to other miRNA forms (Y or N)

The .isoform.quantification.txt data file describing every individual sequence isoform observed is as follows:

- miRNA name
- alignment coordinates as <version>:<Chromosome>:<Start position>-<End position>:<Strand>
- raw read count
- reads per million miRNA reads
- cross-mapped to other miRNA forms (Y or N)
- region within miRNA

as the URL suggests, this is for miRNA, but i imagine this would be typical for IlluminaHiSeq with any sample. it's likely that each of the sequencing technologies has its own specific quantitation types.

*From:* Chris Mungall [mailto:cjmungall@lbl.gov]
*Sent:* Monday, November 07, 2011 7:35 AM
*To:* M. Scott Marshall
*Cc:* expressionrdf@googlegroups.com; HCLS
*Subject:* Re: [BioRDF] W3C Note on expression RDF

On Nov 7, 2011, at 2:03 PM, M. Scott Marshall wrote:

Dear BioRDF,

I've pasted the minutes from our last meeting below. You can find them here: http://www.w3.org/2011/10/24-HCLS-minutes.html

Part of the discussion that isn't available in the minutes below was agreement that NGS expression could be minimally supported by, for example, providing a placeholder for information such as quantified expression. Phil gave us some slides (see link to PDF below) and pointed us to slide 81. Analogous to representing the *results* of differential expression analysis in microarrays (rather than all details of image analysis, etc.), we would like to be able to represent the results of analyzing RNA-seq data.
A few of you expressed interest in looking into the minimal features needed to represent NGS RNA-seq analysis results (Michael, Phil, others?). Please also feel free to continue this discussion on the mailing list.

What sort of thing do you have in mind for the RNA-seq data? Would this be subsumed by a generic RDF representation for interval-based formats like GFF3 and formats like wiggle? What about downstream analyses, e.g. GOseq?

I'd be interested in working on a standard format for the results of enrichment analyses. See: http://biostar.stackexchange.com/questions/11269/is-there-a-standard-format-for-go-term-enrichment-results

Our current thinking is to define an abstract model independent of serialization, and concrete forms such as JSON, tab-delimited, and RDF.

I haven't had time to trim down and restructure the google doc yet. I cannot make a teleconference in today's BioRDF timeslot but encourage you to call in if you want to continue the discussion with others that show up.

Cheers,
Scott

https://docs.google.com/document/d/1A5-3tOsifPWPpETBKU-ZA9d7O7wK_nBzTFUBEe-0Bzo/edit?authkey=CK-y8Y8C
http://purl.org/net/biordfmicroarray/demo
http://ui.genexpressfusion.googlecode.com/hg/index.html

<*Phil*> it seems currently restricted to microarray-based gene expression

Scott (retroactively scribing): Repeated goal of W3C note - i.e. to give people confidence in *an* RDF representation and approach. Decide when we go to HTML and version control. What's missing?

*Scott:* See if we can minimize differences between current representations?

*Sudeshna:* Make a new one as the standard?

*Michael:* But we already have enough to work with in the current set of representations.

*James:* I thought we were simply going to talk about some of the current work and how it can be used and 'cut it loose'.

*Tomasz:* Ours was meant to be a 'reference point'.

*Michael:* Yes, 'reference point' sounds better than 'canonical RDF'.
*James:* Some news: A student project at EBI is just coming toward the end. Bulk of ArrayExpress in RDF. Not public data yet.

*Jim:* After using MGED in my IPAW paper, I converted to OBI. It's the MAGETAB2RDF work. Some issues with Limpopo.

*James:* I could send you (Jim) some stuff for you to take a look at.

*Michael:* Could you explain what you mean by "there are some limits to the translations"?

*James:* Some matching of terms isn't perfect.

This is the IPAW paper: http://www.springerlink.com/index/W10740804446172U.pdf
http://krauthammerlab.med.yale.edu/~jpm78/ArrayExpress/E-AFMX-1.rdf.ttl
http://swbig.googlecode.com

<*james*> JM to post info on magetab 2 rdf at arrayexpress once beta is out - credit to Drashtti Vasant and Tony Burdett
<*james*> http://code.google.com/p/open-biomed/wiki/GeneExpressionAtlas
<*james*> example queries for gxa rdf
<*ericP*> i can hear everything, but can't speak up to volunteer
<*james*> congrats eric
<*JimMcCusker*> BTW, congrats, eric!
<*tomasz*> congrats, Eric! maggots have indeed been revalidated in recent years for helping wounds heal faster (they clean them up)

*Scott:* Somebody brought up the need to deal with NGS and I agree. But that means more work..

<*JimMcCusker*> I have to drop off for a prov WG call. If you need anything from me towards the paper, let me know.

ok, thanks Jim

<*sudeshna*> http://cufflinks.cbcb.umd.edu/
<*tomasz*> James: get it out for feedback for community as soon as possible
<*tomasz*> I second that!
<*Phil*> http://www.bioinformatics.auckland.ac.nz/workshops/NGS-workshop-update.pdf
<*Phil*> slide 81
<*tomasz*> thanks, bye
<*sudeshna*> bye
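
For anyone who wants to experiment with the four quantitation types listed for the .mirna.quantification.txt file at the top of this message, here is a minimal Python sketch of reading them into records. It rests on assumptions not stated in DESCRIPTION.txt: that the file is tab-delimited with one header row, and that the columns appear in the order listed. The key names (`mirna_name`, `raw_read_count`, and so on) and the function `parse_mirna_quantification` are hypothetical, and the sample rows are invented purely for illustration.

```python
import csv
import io

# Hypothetical keys paraphrasing the four fields named in DESCRIPTION.txt;
# the actual header names in the TCGA files may differ.
MIRNA_COLUMNS = ("mirna_name", "raw_read_count", "reads_per_million", "cross_mapped")


def parse_mirna_quantification(handle, has_header=True):
    """Read rows of (name, raw count, RPM, Y/N flag) from a tab-delimited handle.

    Assumes tab-delimited text with the four columns in the listed order.
    """
    reader = csv.reader(handle, delimiter="\t")
    if has_header:
        next(reader, None)  # skip the assumed header row
    records = []
    for row in reader:
        if not row:
            continue  # ignore blank lines
        name, count, rpm, cross = row[:4]
        records.append({
            "mirna_name": name,
            "raw_read_count": int(count),
            "reads_per_million": float(rpm),
            # "Y" means the reads cross-mapped to other miRNA forms
            "cross_mapped": cross.strip().upper() == "Y",
        })
    return records


# Invented sample content, not taken from any TCGA file.
SAMPLE = (
    "name\tcount\trpm\tcross\n"
    "example-mir-1\t120\t350.5\tN\n"
    "example-mir-2\t8\t23.4\tY\n"
)

records = parse_mirna_quantification(io.StringIO(SAMPLE))
for record in records:
    print(record)
```

The .isoform.quantification.txt layout could presumably be handled the same way, with the extra alignment-coordinate and region columns kept as plain strings until their format is confirmed against real files.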
Received on Tuesday, 29 November 2011 23:44:26 UTC