RE: [BioRDF] W3C Note on expression RDF

hi all,



here's what i found as quantitation types in the TCGA MAGE files for
next-gen seq, taken from the DESCRIPTION.txt file (from
C:\data\nci\2011_11_28_tcga\blca\cgcc\bcgsc.ca
\illuminahiseq_mirnase\mirnaseq\bcgsc.ca_BLCA.IlluminaHiSeq_miRNASeq.Level_3.1.0.0):

The .mirna.quantification.txt data file describing summed expression for
each miRNA is as follows:



miRNA name

raw read count

reads per million miRNA reads

cross-mapped to other miRNA forms (Y or N)
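A minimal sketch of reading such a file in Python, assuming it is tab-delimited with one header row and the four columns in the order listed above. The field names (name, read_count, rpm, cross_mapped) and the sample rows are labels made up for illustration, not taken from the TCGA files.

```python
import csv
import io
from typing import Iterator, NamedTuple, TextIO


class MirnaQuant(NamedTuple):
    name: str           # miRNA name
    read_count: int     # raw read count
    rpm: float          # reads per million miRNA reads
    cross_mapped: bool  # cross-mapped to other miRNA forms (Y or N)


def read_mirna_quant(handle: TextIO) -> Iterator[MirnaQuant]:
    """Yield one record per data row (assumes tab-delimited text with a header)."""
    reader = csv.reader(handle, delimiter="\t")
    next(reader, None)  # skip the header row (assumed to be present)
    for row in reader:
        if len(row) < 4:
            continue  # skip blank or short lines
        name, count, rpm, cross = row[:4]
        yield MirnaQuant(name, int(count), float(rpm), cross.strip().upper() == "Y")


# Made-up rows, for illustration only
sample = (
    "name\tcount\trpm\tcross\n"
    "hsa-let-7a-1\t1200\t3500.5\tN\n"
    "hsa-mir-21\t90000\t262500.0\tY\n"
)
records = list(read_mirna_quant(io.StringIO(sample)))
```

Keeping the raw read count and the per-million value as separate typed fields matches the two quantitation types the description lists.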



The .isoform.quantification.txt data file describing every individual
sequence isoform observed is as follows:



miRNA name

alignment coordinates as <version>:<Chromosome>:<Start position>-<End position>:<Strand>

raw read count

reads per million miRNA reads

cross-mapped to other miRNA forms (Y or N)

region within miRNA
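The alignment coordinate field above can be split into its parts with a short parser. This is only a sketch: it assumes the strand is written as + or -, and the example value (genome version, chromosome, positions) is invented for illustration rather than copied from a TCGA file.

```python
import re
from typing import NamedTuple


class Alignment(NamedTuple):
    version: str     # genome build, e.g. "hg19" (assumed form)
    chromosome: str
    start: int
    end: int
    strand: str      # "+" or "-" (assumed encoding)


# <version>:<Chromosome>:<Start position>-<End position>:<Strand>
_COORD = re.compile(r"^([^:]+):([^:]+):(\d+)-(\d+):([+-])$")


def parse_alignment(text: str) -> Alignment:
    """Split one coordinate string into typed fields, or raise ValueError."""
    match = _COORD.match(text.strip())
    if match is None:
        raise ValueError(f"unrecognised alignment coordinates: {text!r}")
    version, chrom, start, end, strand = match.groups()
    return Alignment(version, chrom, int(start), int(end), strand)


# Invented example value
example = parse_alignment("hg19:9:96938243-96938264:+")
```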



as the path suggests, this is for miRNA, but i imagine this would be typical
for IlluminaHiSeq with any sample.  it's likely that each sequencing
technology has its own specific quantitation types.







*From:* Chris Mungall [mailto:cjmungall@lbl.gov]
*Sent:* Monday, November 07, 2011 7:35 AM
*To:* M. Scott Marshall
*Cc:* expressionrdf@googlegroups.com; HCLS
*Subject:* Re: [BioRDF] W3C Note on expression RDF





On Nov 7, 2011, at 2:03 PM, M. Scott Marshall wrote:



Dear BioRDF,

I've pasted the minutes from our last meeting below. You can find them
here: http://www.w3.org/2011/10/24-HCLS-minutes.html

Part of the discussion that isn't available in the minutes below was
agreement that NGS expression could be minimally supported by, for example,
providing a placeholder for information such as quantified expression. Phil
gave us some slides (see link to PDF below) and pointed us to slide 81.
Analogous to representing the *results* of differential expression analysis
in microarrays (rather than all details of image analysis, etc.), we would
like to be able to represent the results of analyzing RNA-seq data. A few
of you expressed interest in looking into the minimal features needed to
represent NGS RNA-seq analysis results (Michael, Phil, others?). Please
also feel free to continue this discussion on the mailing list.



What sort of thing do you have in mind for the RNA-seq data? Would this be
subsumed by a generic RDF representation for interval based formats like
GFF3 and formats like wiggle?



What about downstream analyses, e.g. GOseq? I'd be interested in working on
a standard format for the results of enrichment analyses. See:




http://biostar.stackexchange.com/questions/11269/is-there-a-standard-format-for-go-term-enrichment-results



Our current thinking is to define an abstract model independent of
serialization, and concrete forms such as JSON, tab-delimited and RDF.
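One way to picture an abstract model with several concrete forms is a plain record type plus one function per serialization. The sketch below is hypothetical: the field names (term_id, term_label, p_value, study_count) and the numbers are invented for illustration and are not the model under discussion. Only JSON and tab-delimited output are shown; an RDF form is left out.

```python
import json
from dataclasses import asdict, dataclass, fields
from typing import List


@dataclass
class EnrichmentResult:
    # All field names here are hypothetical placeholders
    term_id: str       # e.g. a GO identifier
    term_label: str
    p_value: float
    study_count: int   # genes in the study set annotated to the term


def to_json(results: List[EnrichmentResult]) -> str:
    """Serialize the records as a JSON array of objects."""
    return json.dumps([asdict(r) for r in results], indent=2)


def to_tsv(results: List[EnrichmentResult]) -> str:
    """Serialize the same records as tab-delimited text with a header row."""
    names = [f.name for f in fields(EnrichmentResult)]
    lines = ["\t".join(names)]
    for r in results:
        lines.append("\t".join(str(getattr(r, n)) for n in names))
    return "\n".join(lines) + "\n"


# Invented values, for illustration only
results = [EnrichmentResult("GO:0006915", "apoptotic process", 0.0012, 14)]
as_json = to_json(results)
as_tsv = to_tsv(results)
```

Both functions read from the same record type, so adding another serialization would not change the model itself.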



I haven't had time to trim down and restructure the google doc yet. I
cannot make a teleconference in today's BioRDF timeslot but encourage you
to call in if you want to continue the discussion with others that show up.

Cheers,

Scott

https://docs.google.com/document/d/1A5-3tOsifPWPpETBKU-ZA9d7O7wK_nBzTFUBEe-0Bzo/edit?authkey=CK-y8Y8C

http://purl.org/net/biordfmicroarray/demo

http://ui.genexpressfusion.googlecode.com/hg/index.html

<*Phil*> it seems currently restricted to microarray-based gene expression

Scott (retroactively scribing): Repeated goal of W3C note - i.e. to give
people confidence in *an* RDF representation and approach. Decide when we
go to HTML and version control. What's missing?

*Scott:* See if we can minimize differences between current representations?

*Sudeshna:* Make a new one as the standard?

*Michael:* But we already have enough to work with in the current set of
representations.

*James:* I thought we were simply going to talk about some of the current
work and how it can be used and 'cut it loose'.

*Tomasz:* Ours was meant to be a 'reference point'.

*Michael:* Yes, 'reference point' sounds better than 'canonical RDF'.

*James:* Some news: A student project at EBI is just coming toward the end.
Bulk of ArrayExpress in RDF. Not public data yet.

*Jim:* After using MGED in my IPAW paper, I converted to OBI. It's the
MAGETAB2RDF work. Some issues with Limpopo.

*James:* I could send you (Jim) some stuff for you to take a look at.

*Michael:* Could you explain what you mean by "there are some limits to the
translations"?

*James:* Some matching of terms isn't perfect.

This is the IPAW paper:
http://www.springerlink.com/index/W10740804446172U.pdf

http://krauthammerlab.med.yale.edu/~jpm78/ArrayExpress/E-AFMX-1.rdf.ttl

http://swbig.googlecode.com

<*james*> JM to post info on magetab 2 rdf at arrayexpress once beta is out
- credit to Drashtti Vasant and Tony Burdett

<*james*> http://code.google.com/p/open-biomed/wiki/GeneExpressionAtlas

<*james*> example queries for gxa rdf

<*ericP*> i can hear everything, but can't speak up to volunteer

<*james*> congrats eric

<*JimMcCusker*> BTW, congrats, eric!

<*tomasz*> congrats, Eric!

maggots have indeed been revalidated in recent years for helping wounds
heal faster (they clean them up)

*Scott:* Somebody brought up the need to deal with NGS and I agree. But
that means more work..

<*JimMcCusker*> I have to drop off for a prov WG call. If you need anything
from me towards the paper, let me know.

ok, thanks Jim

<*sudeshna*> http://cufflinks.cbcb.umd.edu/

<*tomasz*> James: get it out for feedback for community as soon as possible

<*tomasz*> I second that!

<*Phil*>
http://www.bioinformatics.auckland.ac.nz/workshops/NGS-workshop-update.pdf

<*Phil*> slide 81

<*tomasz*> thanks, bye

<*sudeshna*> bye

Received on Tuesday, 29 November 2011 23:44:26 UTC