- From: Jim McCusker <james.mccusker@yale.edu>
- Date: Wed, 6 Jan 2010 11:54:57 -0500
- To: Helena Deus <helenadeus@gmail.com>
- Cc: w3c semweb HCLS <public-semweb-lifesci@w3.org>
- Message-ID: <68084f3e1001060854h4d1a1f3dj88b66b05614c1187@mail.gmail.com>
On Wed, Jan 6, 2010 at 11:08 AM, Helena Deus <helenadeus@gmail.com> wrote:

> Hi Jim,
>
> This is great! I noticed you already add the links both to the raw data
> files and the processed data files, am I right in assuming this data comes
> from the SDRF?

Yes, these are comments embedded in SDRF, and the nodes for those files are explicitly mentioned in SDRF too.

> I see you integrated the MGED ontology with the data nicely, have you
> attempted a few SPARQL queries, for example, retrieve all raw data files
> from "mged:arabidopsis_thaliana"?

I haven't yet tried any SPARQL queries like that, but that was the goal of handling the Terms and Term Sources the way I did.

> Also, I noticed that in your ontology you don't separate each sample
> hybridization raw file, probably because they are all distributed in the ftp
> as a compressed folder. For example, I see that inside raw data file archive
> "E-MEXP-986.raw.1.zip" there are 4 text files:
> 1d1S15.txt.txt, 2d1S15.txt.txt, 2d1S22.txt.txt and 4d1S22.txt.txt. Since
> it's possible to add a link from a Sample to each of these .txt files, do
> you think it would be useful to add this information in the raw rdf file?

Other SDRF files may link directly to a file (the ones that I've written do), so in my mind it's a matter of GIGO. I don't currently go beyond what is in the IDF and SDRF (in other words, what's being parsed by Limpopo), and I'm trying to keep second-guessing to a minimum. One thing I hope this tool exposes is the effects of certain kinds of curation on the available data structures, and maybe some best practices can come out of it.

Jim
--
Jim McCusker
Programmer Analyst
Krauthammer Lab, Pathology Informatics
Yale School of Medicine
james.mccusker@yale.edu | (203) 785-6330
http://krauthammerlab.med.yale.edu

PhD Student
Tetherless World Constellation
Rensselaer Polytechnic Institute
mccusj@cs.rpi.edu
http://tw.rpi.edu
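
For illustration only, the kind of query discussed in this message (all raw data files for samples annotated with "mged:arabidopsis_thaliana") might look roughly like the sketch below. It is not taken from the converter: the predicates ex:hasOrganism and ex:hasRawDataFile, the sample nodes, and the file names are all hypothetical placeholders, and a small set of Python tuples stands in for a real triple store.

```python
# Toy triples; every name here is a made-up placeholder, not the converter's
# actual vocabulary or real ArrayExpress data.
TRIPLES = {
    ("ex:sampleA", "ex:hasOrganism", "mged:arabidopsis_thaliana"),
    ("ex:sampleA", "ex:hasRawDataFile", "ex:rawA.zip"),
    ("ex:sampleB", "ex:hasOrganism", "mged:homo_sapiens"),
    ("ex:sampleB", "ex:hasRawDataFile", "ex:rawB.zip"),
}

# The same question as SPARQL, under the same assumed vocabulary
# (prefix declarations omitted).
QUERY = """
SELECT ?file WHERE {
  ?sample ex:hasOrganism mged:arabidopsis_thaliana ;
          ex:hasRawDataFile ?file .
}
"""


def raw_files_for(organism, triples):
    """Return the raw data files of every sample annotated with `organism`."""
    # First pattern: find the samples annotated with the organism term.
    samples = {s for s, p, o in triples
               if p == "ex:hasOrganism" and o == organism}
    # Second pattern: join on the sample to collect its raw data files.
    return sorted(o for s, p, o in triples
                  if p == "ex:hasRawDataFile" and s in samples)


if __name__ == "__main__":
    print(raw_files_for("mged:arabidopsis_thaliana", TRIPLES))
```

Because the organism is a term from the ontology rather than a free-text string, the query only needs to join a sample's organism annotation to its file links.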
Received on Wednesday, 6 January 2010 16:55:52 UTC