Re: VCF and RDF, at Clinical Pharmacogenomics TF, Wed Apr 3rd

On 2013-04-01, at 3:40 PM, Chris Mungall <cjmungall@lbl.gov> wrote:
> VCF is convertible to GVF[4,5] which is a subset of GFF3 with additional recommended metadata. It's supported by Ensembl, gbGap and others, and the 1000genomes data is available in GVF[6].
> 
> As GFF3 is convertible to RDF/OWL that uses FALDO and SO, it follows that GVF is too (though the converter may need tweaking to take advantage of the additional GVF metadata).
  GVF is now convertible to RDF/OWL too. Building on the GFF3-to-OWL converter, there are two ontologies (GFF3O and GVF1O) and a Ruby gem (BioInterchange) for turning GFF3/GVF into RDF.

  Have a look at these ontologies: http://www.biointerchange.org/ontologies.html

  A manuscript about GFF3O/GVF1O has just been rejected (mostly because PLOS ONE does not appear to take technical papers), but I plan to resubmit a much shorter version somewhere else. The manuscript contains a complete workflow example: rewriting GVF into RDF via BioInterchange, verifying the RDF against GVF1O using HermiT, and loading/querying it in Sesame. If there is interest, I can share the rejected manuscript with individuals after some chitchat, but since it is going to be rewritten and resubmitted, I will not share it publicly at this stage.

  Of course, GFF3O/GVF1O also make use of FALDO. Expressing genomic coordinates using FALDO is optional though, and BioInterchange currently uses the alternative built-in GFF3O/GVF1O solution when rewriting GFF3/GVF files.
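
  To illustrate the FALDO option, here is a minimal Python sketch of how the coordinates of one GFF3/GVF feature line could be written as FALDO triples. This is not BioInterchange code and not its output format: the http://example.org/ namespace, the function names and the example line are all made up for illustration, and the sketch assumes the ID attribute is safe to use in a URI without escaping. Only the FALDO terms themselves (location, Region, begin, end, ExactPosition, position, reference) come from the FALDO vocabulary.

```python
# Sketch only: this is NOT BioInterchange's implementation or output format.
FALDO = "http://biohackathon.org/resource/faldo#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
XSD_INTEGER = "http://www.w3.org/2001/XMLSchema#integer"
BASE = "http://example.org/"  # hypothetical namespace for features and sequences


def uri(value):
    """Wrap a URI in angle brackets, as N-Triples requires."""
    return "<" + value + ">"


def integer(value):
    """Format an integer as a typed N-Triples literal."""
    return '"%d"^^%s' % (value, uri(XSD_INTEGER))


def gvf_line_to_faldo(line):
    """Turn one tab-separated GFF3/GVF feature line into N-Triples lines."""
    columns = line.rstrip("\n").split("\t")
    if len(columns) != 9:
        raise ValueError("expected 9 tab-separated columns")
    seqid, start, end = columns[0], int(columns[3]), int(columns[4])
    # Column 9 holds ';'-separated key=value pairs; the ID names the feature.
    pairs = dict(p.split("=", 1) for p in columns[8].split(";") if "=" in p)
    feature = BASE + "feature/" + pairs["ID"]
    region = feature + "/region"
    triples = [
        (uri(feature), uri(FALDO + "location"), uri(region)),
        (uri(region), uri(RDF_TYPE), uri(FALDO + "Region")),
    ]
    # One exact position each for the begin and the end of the region.
    for name, coordinate in (("begin", start), ("end", end)):
        position = region + "/" + name
        triples += [
            (uri(region), uri(FALDO + name), uri(position)),
            (uri(position), uri(RDF_TYPE), uri(FALDO + "ExactPosition")),
            (uri(position), uri(FALDO + "position"), integer(coordinate)),
            (uri(position), uri(FALDO + "reference"),
             uri(BASE + "sequence/" + seqid)),
        ]
    return ["%s %s %s ." % triple for triple in triples]


# Made-up GVF feature line, for illustration only.
example = ("chr16\tsamtools\tSNV\t49291141\t49291141\t.\t+\t.\t"
           "ID=ID_1;Variant_seq=A,G;Reference_seq=G")
for triple in gvf_line_to_faldo(example):
    print(triple)
```

  The sketch ignores strand and all the other GVF attributes; a real converter would type the feature with SO terms and carry the remaining columns over as well.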

Joachim

Received on Monday, 1 April 2013 20:34:52 UTC