Introduction: Jeremy J. Carroll

While it doesn't seem to be the convention in this group, now that I have graduated from lurking to participating, I thought I should say why I am here.

I have a new job with a genomics company in Silicon Valley, and my remit is to work out how best to represent both scientific knowledge about genomics and the per-patient knowledge found in, say, a VCF file …

Having been working in SemWeb for over a decade now (RDF Concepts, OWL Tests, Jena, Named Graphs, TopBraid Suite), my bias is to use SemWeb technologies; but, particularly with the variant information, the back-of-an-envelope numbers look fairly large, see:

http://lists.w3.org/Archives/Public/semantic-web/2013Mar/0048.html

(That's 1B triples for the science, and 1B triples per VCF file uploaded.)
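To give a feel for how this kind of estimate is put together, here is a minimal Python sketch. It maps one VCF data line to naive triples and multiplies by a record count. The `http://example.org/vcf#` namespace, the one-triple-per-field mapping, and the record count used at the end are all assumptions made for illustration; they are not the modelling or the figures from the message linked above.

```python
# Hypothetical namespace, for illustration only; not a real vocabulary.
EX = "http://example.org/vcf#"

# The eight fixed columns of a VCF data line (per-sample columns are ignored here).
VCF_COLUMNS = ["CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def vcf_line_to_triples(line, record_uri):
    """Turn one tab-separated VCF data line into (subject, predicate, object) tuples.

    Naive mapping: one triple per non-missing fixed field, plus one triple
    per INFO entry. Objects are kept as plain strings.
    """
    fields = line.rstrip("\n").split("\t")
    triples = []
    for name, value in zip(VCF_COLUMNS, fields):
        if value == ".":  # VCF writes "." for a missing value
            continue
        if name == "INFO":
            # INFO is a semicolon-separated list of key=value pairs or bare flags.
            for item in value.split(";"):
                key, _, val = item.partition("=")
                triples.append((record_uri, EX + "info_" + key, val or "true"))
        else:
            triples.append((record_uri, EX + name.lower(), value))
    return triples


def estimate_triples(records_per_file, triples_per_record):
    """Back-of-an-envelope total: records times triples per record."""
    return records_per_file * triples_per_record


# Example data line in the style of the VCF specification, fixed columns only.
line = "20\t14370\trs6054257\tG\tA\t29\tPASS\tNS=3;DP=14;AF=0.5;DB;H2"
triples = vcf_line_to_triples(line, EX + "record/20-14370")

# The record count below is an assumed placeholder, not a measured value.
total = estimate_triples(4_000_000, len(triples))
```

A richer model (per-sample genotype columns, provenance, links out to reference datasets) would multiply the triples per record, which is where the totals start to look large.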

The Clinical Pharmacogenomics TF seems like the most relevant part; unfortunately, the 7.15 - 8.15 slot for the call coincides with some parental duties for me …

I am still on the lookout for documentation by people who have been tackling similar problems.

As far as I can tell, many people in the field have bought into an ontological approach, but the penetration of even pretty basic SemWeb good practices is low: e.g. use of URIs for concepts, RDF and OWL … Interoperating with various large datasets (e.g. dbSNP) is also necessary, but most are not natively OWL.

Any thoughts or suggestions would be very welcome.

Jeremy

Received on Friday, 15 March 2013 21:34:56 UTC