Re: BioRDF Telcon from Kei Cheung on 2009-12-02 (public-semweb-lifesci@w3.org from December 2009)

From: Kei Cheung <kei.cheung@yale.edu>
Date: Tue, 01 Dec 2009 22:20:35 -0500
To: Helena Deus <helenadeus@gmail.com>
CC: mdmiller <mdmiller53@comcast.net>, HCLS <public-semweb-lifesci@w3.org>
Message-ID: <4B15DD03.8020708@yale.edu>
Hi Lena,

Helena Deus wrote:

> @Kei,
>
>     When you said data structure, did you mean the RDF structure?
>
>
> For now, all I have is the Java object returned by the parser. I've been
> using Limpopo, which creates an object that I can then parse to RDF
> using Jena. The challenge, though, has been coming up with the
> predicates to formalize the relationships between the various
> elements. I'm using the XML structures for IDF/SDRF etc.
> at http://magetab-om.sourceforge.net to automatically generate the
> structure that will contain the data. My plan is to then create the
> RDF triples that use the attributes described in those documents and
> populate them with the data from the MAGE-TAB Java object created by
> Limpopo.


Thanks for the pointer and for explaining your strategy. We might not
need to convert everything from MAGE-TAB for our purposes.
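
To make the "convert only what we need" idea concrete, here is a minimal
sketch of turning a few selected IDF rows into RDF triples. It is not the
Limpopo/Jena code discussed in this thread: it is plain Python using only
the standard library, and it emits N-Triples lines directly. It relies on
the fact that an IDF file is tab-delimited, with a row label followed by
one or more values. The namespaces, the predicate naming rule, the
function names, the sample rows and the accession "E-EXAMPLE-1" are all
assumptions made up for illustration.

```python
import csv
import io
import re

# Hypothetical namespaces, for illustration only (not an agreed vocabulary).
PRED_NS = "http://example.org/magetab/idf#"
EXP_NS = "http://example.org/experiment/"


def predicate_for(tag):
    """Map an IDF row label, e.g. 'Investigation Title', to a predicate URI."""
    words = re.findall(r"[A-Za-z0-9]+", tag)
    local = words[0].lower() + "".join(w.capitalize() for w in words[1:])
    return PRED_NS + local


def escape_literal(value):
    """Escape backslashes, quotes and line breaks for an N-Triples literal."""
    return (value.replace("\\", "\\\\").replace('"', '\\"')
                 .replace("\n", "\\n").replace("\r", "\\r"))


def idf_to_ntriples(idf_text, accession, wanted=None):
    """Convert tab-delimited IDF text into a list of N-Triples lines.

    Every non-empty value in a row becomes one triple about the experiment.
    If `wanted` is given, only rows whose label is in that set are converted.
    """
    subject = "<" + EXP_NS + accession + ">"
    triples = []
    for row in csv.reader(io.StringIO(idf_text), delimiter="\t"):
        # Skip blank rows and rows whose label has no usable characters.
        if not row or not re.search(r"[A-Za-z0-9]", row[0]):
            continue
        tag = row[0].strip()
        if wanted is not None and tag not in wanted:
            continue
        for value in row[1:]:
            if value.strip():
                triples.append('%s <%s> "%s" .' % (
                    subject, predicate_for(tag), escape_literal(value.strip())))
    return triples


if __name__ == "__main__":
    # Made-up IDF fragment; real files carry many more rows.
    sample_idf = (
        "Investigation Title\tExample study\n"
        "Person Last Name\tDoe\tRoe\n"
        "Protocol Name\tP-1\n"
    )
    keep = {"Investigation Title", "Person Last Name"}
    for line in idf_to_ntriples(sample_idf, "E-EXAMPLE-1", wanted=keep):
        print(line)
```

Passing a `wanted` set is one way to restrict the conversion to the IDF
fields a use case actually needs; leaving it out converts every row. A
real pipeline would more likely map labels to predicates from an agreed
vocabulary rather than deriving them from the label text.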

>
> Right now all I have is a very raw RDF/XML document describing the 
> relationships in the IDF 
> structure: http://magetab2rdf.googlecode.com/svn/trunk/magetabpredicates.rdf
> The triples for that had to be encoded manually using Jena by reading 
> the model.

I think IDF is a good start. For a real example for our use case, I
wonder if any MAGE-TAB file is available for experiment E-GEOD-4757
(transcription profiling of human neurons with and without
neurofibrillary tangles from Alzheimer's patients). Helen may know.

Cheers,

-Kei

>
> @Satya and Jun
>
> I would very much like to be involved in that effort. Do you already
> have a URL that I can look at?
>
> Thanks
> Lena
>
> On Tue, Nov 24, 2009 at 2:19 PM, Kei Cheung <kei.cheung@yale.edu> wrote:
>
>     Hi Lena et al,
>
>     When you said data structure, did you mean the RDF structure? If
>     so, is there a pointer to the structure that we can look at?
>
>     As discussed during yesterday's call, Jun and Satya will help
>     create a wiki page listing some of the requirements for
>     provenance/workflow in the context of gene lists. Perhaps we
>     should also use it to help coordinate some of the future
>     activities (people also brought up Taverna during the call
>     yesterday). Please coordinate with Satya and Jun.
>
>     Cheers,
>
>     -Kei
>
>     Helena Deus wrote:
>
>         Hi all,
>
>         I apologize for missing the call yesterday! It seems you had a
>         pretty interesting discussion! :-)
>         If I understand Michael's statement, parsing the
>         MAGE-TAB/MAGE-ML into RDF would result in obtaining only the
>         raw and processed data files but neither the mechanism used to
>         process them nor the resulting gene list. That's also what I
>         concluded after looking at the data structure created by Tony
>         Burdett's Limpopo parser. However, having the raw data as
>         linked data is already a great start! Kei, should I be looking
>         into Taverna in order to reprocess the raw files with a
>         traceable analysis workflow?
>
>         Thanks!
>         Lena
>
>
>
>         On Tue, Nov 24, 2009 at 9:59 AM, mdmiller
>         <mdmiller53@comcast.net> wrote:
>
>            hi all,
>
>            (from the minutes)
>
>            "Yolanda/Kei/Scott: semantic annotation/description of workflow
>            would enable the retrieval of data relevant to that workflow
>            (i.e. data that could be used to populate that workflow for a
>            different experimental scenario)"
>
>            what is typically in a MAGE-TAB/MAGE-ML document are the
>            protocols for how the source was processed into the extract then
>            how the hybridization, feature extraction, error and
>            normalization were performed.  these are interesting and
>            different protocols can cause differences at this level but it
>            is pretty much a known art and usually not of too much interest
>            or variability.
>
>            what is usually missing from those documents, along with the
>            final gene list, is how that gene list was obtained, what higher
>            level analysis was used, that is generally only in the paper
>            unfortunately.
>
>            cheers,
>            michael
>
>            ----- Original Message -----
>            From: "Kei Cheung" <kei.cheung@yale.edu>
>            To: "HCLS" <public-semweb-lifesci@w3.org>
>            Sent: Monday, November 23, 2009 1:27 PM
>            Subject: Re: BioRDF Telcon
>
>
>
>                Today's BioRDF minutes are available at the following:
>
>                http://esw.w3.org/topic/HCLSIG_BioRDF_Subgroup/Meetings/2009/11-23_Conference_Call
>
>                Thanks to Rob for scribing.
>
>                Cheers,
>
>                -Kei
>
>                Kei Cheung wrote:
>
>                    This is a reminder that the next BioRDF telcon call
>                    will be held at 11 am EST (5 pm CET) on Monday,
>                    November 23 (see details below).
>
>                    Cheers,
>
>                    -Kei
>
>                    == Conference Details ==
>                    * Date of Call: Monday November 23, 2009
>                    * Time of Call: 11:00 am Eastern Time
>                    * Dial-In #: +1.617.761.6200 (Cambridge, MA)
>                    * Dial-In #: +33.4.89.06.34.99 (Nice, France)
>                    * Dial-In #: +44.117.370.6152 (Bristol, UK)
>                    * Participant Access Code: 4257 ("HCLS")
>                    * IRC Channel: irc.w3.org port 6665, channel #HCLS
>                      (see W3C IRC page for details, or see Web IRC).
>                      Quick Start: use
>                      http://www.mibbit.com/chat/?server=irc.w3.org:6665&channel=%23hcls
>                      for IRC access.
>                    * Duration: ~1 hour
>                    * Frequency: bi-weekly
>                    * Convener: Kei Cheung
>                    * Scribe: to-be-determined
>
>                    == Agenda ==
>                    * Roll call & introduction (Kei)
>                    * RDF representation of microarray experiment and data (All)
>                    * Provenance and workflow (All)
>
Received on Wednesday, 2 December 2009 03:21:07 UTC