- From: Michael Miller <mmiller@teranode.com>
- Date: Wed, 6 Jan 2010 11:55:31 -0500
- To: "Helena Deus" <helenadeus@gmail.com>, "Jim McCusker" <james.mccusker@yale.edu>
- Cc: "w3c semweb HCLS" <public-semweb-lifesci@w3.org>
- Message-ID: <6401DB16544A5B4AA279B921F43547EC066FCB2B@MI8NYCMAIL16.Mi8.com>
hi helena and jim,

this is what i see in E-MEXP-986.rdf, which seems a nice way to capture this information:

...
<j.0:Scan rdf:about=".#scanname/ebi.ac.uk:MIAMExpress:Hybridization:24902">
  <j.0:has_derivative>
    <j.0:ArrayDataMatrix rdf:about=".#arraydatamatrixfile/E-MEXP-986-raw-data-1321832734.txt">
      <j.0:has_comment>
        <j.0:Comment rdf:about=".#arraydatamatrixfile/E-MEXP-986-raw-data-1321832734.txt/comments/1">
          <j.1:has_value rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ftp://ftp.ebi.ac.uk/pub/databases/microarray/data/experiment/MEXP/E-MEXP-986/E-MEXP-986.raw.zip</j.1:has_value>
          <j.1:has_name rdf:datatype="http://www.w3.org/2001/XMLSchema#string">ArrayExpress FTP file</j.1:has_name>
...

it has the name of the file, then a reference to the name of the ArrayExpress FTP file.

but the problem seems to be that ArrayDataMatrix is referencing the 'Derived Array Data Matrix File' column (which is different from an 'Array Data Matrix File' column) when it should be referencing the 'Array Data File' column. there is a nested DerivedArrayDataMatrix element which does look correct.

from the above, it looks like E-MEXP-986-raw-data-1321832734 should be an ArrayDataFile element with a value of '2d1S15.txt.txt'. i'm not sure how the name of the file was obtained: there is no mention in the SDRF of a file of that name, but it is similar to the name of the file in the 'Derived Array Data Matrix File' column. there is no mention of '2d1S15.txt.txt', the correct name, anywhere in the file.

there also seems to be unnecessary duplication, i.e. there's a nested repeat of the ArrayDataMatrix and DerivedArrayDataMatrix elements. but this might be an artifact of XML RDF? does XML RDF allow referencing an element that is fully defined elsewhere? that would make things a lot clearer and more concise.

but this is a great start.
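On the question of referencing an element that is fully defined elsewhere: RDF/XML does permit this. A node can be described once with rdf:about, and any property element can then point to it with rdf:resource instead of nesting a repeat of the description. The following is a minimal Python sketch of that pattern using only the standard library. The `ex:` namespace, the `#matrix1` and `#scan…` identifiers, and the `resolve_references` helper are hypothetical illustrations, not taken from E-MEXP-986.rdf or from the magetab2magerdf code.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
EX = "http://example.org/mage#"  # hypothetical namespace standing in for j.0

# The matrix is described once (rdf:about) and referenced twice (rdf:resource).
DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:ex="{EX}">
  <ex:ArrayDataMatrix rdf:about="#matrix1">
    <ex:has_name>example matrix</ex:has_name>
  </ex:ArrayDataMatrix>
  <ex:Scan rdf:about="#scan1">
    <ex:has_derivative rdf:resource="#matrix1"/>
  </ex:Scan>
  <ex:Scan rdf:about="#scan2">
    <ex:has_derivative rdf:resource="#matrix1"/>
  </ex:Scan>
</rdf:RDF>"""


def resolve_references(xml_text):
    """Pair each rdf:resource reference with the element described under that rdf:about."""
    root = ET.fromstring(xml_text)
    about_attr = f"{{{RDF}}}about"
    resource_attr = f"{{{RDF}}}resource"
    # Index every element that carries a full description.
    described = {el.get(about_attr): el for el in root.iter() if el.get(about_attr)}
    # Collect (target identifier, described element or None) for every reference.
    links = [
        (el.get(resource_attr), described.get(el.get(resource_attr)))
        for el in root.iter()
        if el.get(resource_attr) is not None
    ]
    return described, links


described, links = resolve_references(DOC)
for target, element in links:
    print(target, "->", element.tag if element is not None else "unresolved")
```

A full RDF parser handles this resolution automatically; the sketch only shows that the serialization can state each ArrayDataMatrix once and refer to it by identifier from several Scan elements.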
cheers,
michael

From: public-semweb-lifesci-request@w3.org [mailto:public-semweb-lifesci-request@w3.org] On Behalf Of Helena Deus
Sent: Wednesday, January 06, 2010 8:09 AM
To: Jim McCusker
Cc: w3c semweb HCLS
Subject: Re: magetab2magerdf

Hi Jim,

This is great! I noticed you already add the links both to the raw data files and the processed data files; am I right in assuming this data comes from the SDRF?

I see you integrated the MGED ontology with the data nicely. Have you attempted a few SPARQL queries, for example, retrieving all raw data files from "mged:arabidopsis_thaliana"?

Also, I noticed that in your ontology you don't separate the raw file for each sample hybridization, probably because they are all distributed on the FTP site as a compressed folder. For example, I see that inside the raw data file archive "E-MEXP-986.raw.1.zip" there are 4 text files: 1d1S15.txt.txt, 2d1S15.txt.txt, 2d1S22.txt.txt and 4d1S22.txt.txt. Since it's possible to add a link from a Sample to each of these .txt files, do you think it would be useful to add this information in the raw RDF file?

Thanks!
Lena

On Tue, Dec 8, 2009 at 8:05 AM, Jim McCusker <james.mccusker@yale.edu> wrote:

I'm distinguishing between magetab2rdf (raw conversion of magetab into an RDF structure) and magetab2magerdf (conversion of magetab into an RDF-based MAGE-OM structure) here. My purposes and goals require a magetab2magerdf approach, so that's what I've been working on.

I have checked in code for magetab2magerdf at the googlecode project http://magetab2rdf.googlecode.com. The code can be checked out from:
http://magetab2rdf.googlecode.com/svn/trunk/magetab2magerdf/
and example RDF is in:
http://magetab2rdf.googlecode.com/svn/trunk/magetab2magerdf/examples/E-MEXP-986/

I currently load the IDF-related entities into the RDF. I'm beginning work on SDRF next.
http://magetab2rdf.googlecode.com/svn/trunk/ontologies/mage-om.owl contains the additional properties and classes needed to support an RDF-based MAGE-OM on top of the MGED Ontology.

A few notes on E-MEXP-986:

The URI for the MGED Ontology is http://mged.sourceforge.net/ontologies/MGEDontology.owl, but has been set to http://mged.sourceforge.net/ontologies/MGEDontology.php in the IDF. The actual Term Source name is "The MGED Ontology". A common practice seems to be to refer to "MGED Ontology" without reference to its URI. Since I have to import the MGED ontology already for its classes and properties, I have already imported it under the correct URI. I have added a kludge where if the term source name contains the string "MGED Ontology", the code assumes you mean the MGED Ontology, and sets the URI appropriately. However, this is a one-off solution.

I went back and forth about importing the Term Source ontologies. However, this particular experiment has used the "ArrayExpress" term source with the URI "http://www.ebi.ac.uk/arrayexpress/", which doesn't correspond to an available ontology, but is technically a term source. I'm considering attempting to import each term source ontology and validating against it when it is available; if the URI fails to resolve to a document, validation against that term source will not happen.

A note on Limpopo: The IDF Comment didn't seem to import on this experiment. I'm not sure if it's a format problem or something else.

Thoughts and feedback are greatly appreciated.

Jim
--
Jim McCusker
Programmer Analyst
Krauthammer Lab, Pathology Informatics
Yale School of Medicine
james.mccusker@yale.edu | (203) 785-6330
http://krauthammerlab.med.yale.edu

PhD Student
Tetherless World Constellation
Rensselaer Polytechnic Institute
mccusj@cs.rpi.edu
http://tw.rpi.edu
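The term-source kludge described in the notes on E-MEXP-986 above can be sketched roughly as follows. This is an illustrative Python sketch only, not the code checked in to the magetab2magerdf project; the function name `resolve_term_source_uri` and its parameters are hypothetical, while the two URIs and the term source names are the ones quoted in the message.

```python
# Canonical URI for the MGED Ontology, as given in the message.
MGED_ONTOLOGY_URI = "http://mged.sourceforge.net/ontologies/MGEDontology.owl"


def resolve_term_source_uri(term_source_name, declared_uri):
    """Return the URI to use for a term source declared in an IDF.

    If the name contains "MGED Ontology", assume the MGED Ontology is meant
    and ignore whatever URI the IDF declared; otherwise keep the declared URI.
    """
    if "MGED Ontology" in term_source_name:
        return MGED_ONTOLOGY_URI
    return declared_uri


# The IDF for E-MEXP-986 declares the .php URI under the name "The MGED Ontology".
print(resolve_term_source_uri(
    "The MGED Ontology",
    "http://mged.sourceforge.net/ontologies/MGEDontology.php",
))
# Other term sources, such as "ArrayExpress", pass through unchanged.
print(resolve_term_source_uri("ArrayExpress", "http://www.ebi.ac.uk/arrayexpress/"))
```

As the message notes, matching on the name is a one-off solution: it covers the MGED Ontology only and does nothing for other term sources whose declared URIs are wrong or do not resolve.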
Received on Wednesday, 6 January 2010 16:56:04 UTC