Re: [BioRDF] W3C Note on expression RDF from Helena Deus on 2011-11-30 (public-semweb-lifesci@w3.org from November 2011)

From: Helena Deus <helenadeus@gmail.com>
Date: Wed, 30 Nov 2011 11:52:37 +0000
To: expressionrdf@googlegroups.com
Cc: Chris Mungall <cjmungall@lbl.gov>, "M. Scott Marshall" <mscottmarshall@gmail.com>, HCLS <public-semweb-lifesci@w3.org>
Message-ID: <CAPkJ_9kLEadh7NmKij9C_-_nCTq1twpbLTgxfpfr=E3hXzE-fA@mail.gmail.com>
This is great, Michael, thanks!

I assume the reads per million have to do with normalization/QC of the data.
Any idea what "cross-mapped to other miRNA forms" means? By the name, I
assume these are miRNAs that can target more than one gene.
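
To make my assumption about the normalization concrete: if "reads per million
miRNA reads" is the usual library-size scaling (this is my guess; the
DESCRIPTION.txt excerpt quoted below does not spell out the formula), it would
be computed roughly as follows. The miRNA names and counts are made up for
illustration and do not come from the TCGA files.

```python
def reads_per_million(raw_counts):
    """Scale raw read counts so that one sample's values sum to one million.

    Assumption: RPM = raw count * 1,000,000 / total miRNA reads in the sample.
    """
    total = sum(raw_counts.values())
    if total == 0:
        # Avoid dividing by zero for an empty or all-zero sample.
        return {name: 0.0 for name in raw_counts}
    return {name: count * 1_000_000 / total for name, count in raw_counts.items()}


# Made-up counts for illustration only.
example = {"hsa-mir-21": 750, "hsa-let-7a-1": 200, "hsa-mir-100": 50}
rpm = reads_per_million(example)
```

If that is right, the raw read count and the reads-per-million column carry
the same information up to a per-sample constant.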

Now the question is: should we blindly create an RDF representation for
reporting sequencing results such as "miRNA name", "raw read count", etc.?
Or take a more interesting approach, whereby we try to map each miRNA to the
genes it regulates and use the "cross map" flag to link each miRNA
to other miRNA forms (in case the value is Y)?

This would enable easy linking to the expression values!!
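
As a starting point for discussion, here is a rough sketch of the plain
("blind") mapping, written as N-Triples strings so it needs no RDF library.
Everything in it is an assumption on my part: the example.org namespace, the
predicate names, and the idea that the file is tab-delimited with exactly the
four columns listed in DESCRIPTION.txt and no header row. It is not a proposed
vocabulary.

```python
import csv
import io

# Hypothetical namespace and predicate names; not an agreed vocabulary.
EX = "http://example.org/expression#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def row_to_triples(sample_id, row):
    """Turn one quantification row into a list of N-Triples lines.

    Assumes four columns in this order: miRNA name, raw read count,
    reads per million miRNA reads, cross-mapped flag (Y or N).
    """
    name, raw, rpm, cross = row
    mirna = f"<{EX}mirna/{name}>"
    obs = f"<{EX}obs/{sample_id}/{name}>"
    triples = [
        f"{obs} <{EX}measures> {mirna} .",
        f'{obs} <{EX}rawReadCount> "{int(raw)}"^^<{XSD}integer> .',
        f'{obs} <{EX}readsPerMillion> "{float(rpm)}"^^<{XSD}double> .',
    ]
    if cross.strip().upper() == "Y":
        # The file only carries a flag, not the other forms, so record the flag.
        triples.append(f'{obs} <{EX}crossMapped> "true"^^<{XSD}boolean> .')
    return triples


def file_to_triples(sample_id, text):
    """Convert the whole tab-delimited text, skipping malformed rows."""
    out = []
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) == 4:
            out.extend(row_to_triples(sample_id, row))
    return out


# Made-up rows for illustration only.
sample_text = "hsa-mir-21\t750\t750000.0\tN\nhsa-mir-100\t50\t50000.0\tY\n"
triples = file_to_triples("sample1", sample_text)
```

Because the file gives only a Y/N flag, the more interesting option (linking
each miRNA to its other forms and to the genes it regulates) would need an
additional source for those relationships, which this sketch does not attempt.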

Ideas? Suggestions?
Best, Lena


On Tue, Nov 29, 2011 at 11:43 PM, Michael Miller <
Michael.Miller@systemsbiology.org> wrote:

> hi all,
>
>
>
> here's what i found as quantitation types from the TCGA MAGE files for
> next-gen seq from the DESCRIPTION.txt file (from
> C:\data\nci\2011_11_28_tcga\blca\cgcc\bcgsc.ca
> \illuminahiseq_mirnase\mirnaseq\bcgsc.ca_BLCA.IlluminaHiSeq_miRNASeq.Level_3.1.0.0):
>
> The .mirna.quantification.txt data file describing summed expression for
> each miRNA is as follows:
>
> miRNA name
> raw read count
> reads per million miRNA reads
> cross-mapped to other miRNA forms (Y or N)
>
>
> The .isoform.quantification.txt data file describing every individual
> sequence isoform observed is as follows:
>
> miRNA name
> alignment coordinates as <version>:<Chromosome>:<Start position>-<End position>:<Strand>
> raw read count
> reads per million miRNA reads
> cross-mapped to other miRNA forms (Y or N)
> region within miRNA
>
>
> as the URL suggests, this is for miRNA but i imagine for IlluminaHiSeq
> with any sample this would be typical.  it's likely to be true that each of
> the sequencing technologies has specific quantitation types.
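
For the isoform file, the alignment coordinates would have to be split into
their parts before they could be linked to anything. A minimal sketch,
assuming the value follows the
<version>:<Chromosome>:<Start position>-<End position>:<Strand> pattern
literally and that the strand is written as + or - (the strand encoding is my
assumption, and the example value is made up):

```python
import re
from typing import NamedTuple


class Alignment(NamedTuple):
    version: str
    chromosome: str
    start: int
    end: int
    strand: str


# version:chromosome:start-end:strand, with strand assumed to be + or -
_COORD = re.compile(r"^([^:]+):([^:]+):(\d+)-(\d+):([+-])$")


def parse_alignment(value):
    """Split an alignment-coordinate string into its five parts."""
    match = _COORD.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized alignment coordinates: {value!r}")
    version, chromosome, start, end, strand = match.groups()
    return Alignment(version, chromosome, int(start), int(end), strand)


# Made-up value for illustration only.
aln = parse_alignment("hg19:9:100-121:-")
```

Once split like this, each isoform is essentially an interval on a genome
build, which is the kind of record an interval-based representation would
cover.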
>
> *From:* Chris Mungall [mailto:cjmungall@lbl.gov]
> *Sent:* Monday, November 07, 2011 7:35 AM
> *To:* M. Scott Marshall
> *Cc:* expressionrdf@googlegroups.com; HCLS
>
> *Subject:* Re: [BioRDF] W3C Note on expression RDF
>
>
>
>
>
> On Nov 7, 2011, at 2:03 PM, M. Scott Marshall wrote:
>
>
>
>  Dear BioRDF,
>
> I've pasted the minutes from our last meeting below. You can find them
> here: http://www.w3.org/2011/10/24-HCLS-minutes.html
>
> Part of the discussion that isn't available in the minutes below was
> agreement that NGS expression could be minimally supported by, for example,
> providing a placeholder for information such as quantified expression. Phil
> gave us some slides (see link to PDF below) and pointed us to slide 81.
> Analogous to representing the *results* of differential expression analysis
> in microarrays (rather than all details of image analysis, etc.), we would
> like to be able to represent the results of analyzing RNA-seq data. A few
> of you expressed interest in looking into the minimal features needed to
> represent NGS RNA-seq analysis results (Michael, Phil, others?). Please
> also feel free to continue this discussion on the mailing list.
>
>
>
> What sort of thing do you have in mind for the RNA-seq data? Would this be
> subsumed by a generic RDF representation for interval based formats like
> GFF3 and formats like wiggle?
>
>
>
> What about downstream analyses, e.g. GOseq? I'd be interested in working
> on a standard format for the results of enrichment analyses. See:
>
>
>
>
> http://biostar.stackexchange.com/questions/11269/is-there-a-standard-format-for-go-term-enrichment-results
>
>
>
> Our current thinking is to define an abstract model independent of
> serialization, and concrete forms such as json, tab-delimited and rdf.
>
>
>
> I haven't had time to trim down and restructure the google doc yet. I
> cannot make a teleconference in today's BioRDF timeslot but encourage you
> to call in if you want to continue the discussion with others that show up.
>
> Cheers,
>
> Scott
>
>
> https://docs.google.com/document/d/1A5-3tOsifPWPpETBKU-ZA9d7O7wK_nBzTFUBEe-0Bzo/edit?authkey=CK-y8Y8C
>
> http://purl.org/net/biordfmicroarray/demo
>
> http://ui.genexpressfusion.googlecode.com/hg/index.html
>
> <*Phil*> it seems currently restricted to microarray-based gene
> expression
>
> Scott (retroactively scribing): Repeated goal of W3C note - i.e. to give
> people confidence in *an* RDF representation and approach. Decide when we
> go to HTML and version control. What's missing?
>
> *Scott:* See if we can minimize differences between current
> representations?
>
> *Sudeshna:* Make a new one as the standard?
>
> *Michael:* But we already have enough to work with in the current set of
> representations.
>
> *James:* I thought we were simply going to talk about some of the current
> work and how it can be used and 'cut it loose'.
>
> *Tomasz:* Ours was meant to be a 'reference point'.
>
> *Michael:* Yes, 'reference point' sounds better than 'canonical RDF'.
>
> *James:* Some news: A student project at EBI is just coming toward the
> end. Bulk of ArrayExpress in RDF. Not public data yet.
>
> *Jim:* After using MGED in my IPAW paper, I converted to OBI. It's the
> MAGETAB2RDF work. Some issues with Limpopo.
>
> *James:* I could send you (Jim) some stuff for you to take a look at.
>
> *Michael:* Could you explain what you mean by "there are some limits to
> the translations"?
>
> *James:* Some matching of terms isn't perfect.
>
> This is the IPAW paper:
> http://www.springerlink.com/index/W10740804446172U.pdf
>
> http://krauthammerlab.med.yale.edu/~jpm78/ArrayExpress/E-AFMX-1.rdf.ttl
>
> http://swbig.googlecode.com
>
> <*james*> JM to post info on magetab 2 rdf at arrayexpress once beta is
> out - credit to Drashtti Vasant and Tony Burdett
>
> <*james*> http://code.google.com/p/open-biomed/wiki/GeneExpressionAtlas
>
> <*james*> example queries for gxa rdf
>
> <*ericP*> i can hear everything, but can't speak up to volunteer
>
> <*james*> congrats eric
>
> <*JimMcCusker*> BTW, congrats, eric!
>
> <*tomasz*> congrats, Eric!
>
> maggots have indeed been revalidated in recent years for helping wounds
> heal faster (they clean them up)
>
> *Scott:* Somebody brought up the need to deal with NGS and I agree. But
> that means more work..
>
> <*JimMcCusker*> I have to drop off for a prov WG call. If you need
> anything from me towards the paper, let me know.
>
> ok, thanks Jim
>
> <*sudeshna*> http://cufflinks.cbcb.umd.edu/
>
> <*tomasz*> James: get it out for feedback for community as soon as
> possible
>
> <*tomasz*> I second that!
>
> <*Phil*>
> http://www.bioinformatics.auckland.ac.nz/workshops/NGS-workshop-update.pdf
>
> <*Phil*> slide 81
>
> <*tomasz*> thanks, bye
>
> <*sudeshna*> bye
>
>
>
>
>



-- 
Helena F. Deus
Post-Doctoral Researcher at DERI/NUIG
http://lenadeus.info/
Received on Wednesday, 30 November 2011 11:53:28 UTC